Open karimtera opened 1 year ago
These parameters look good to me. @hzeller @fangism please comment if you have some additional ideas.
One thing that we should consider early on is how to handle conditionals in the code, here are some notes from my discussion with @mglb
Consider following code:
module mod_name (input bus_a, output bus_b
`ifdef USE_PG_PIN
, input pg_pin
`endif
);
endmodule
To parse this into a tree, Verible could preprocess the code and generate two sources: one for condition being true, one for condition being false. Both sources would be parsed into alternate trees. The trees would be then merged to share common branches. Alternate branches would be wrapped in a dedicated node. The preprocessor lines (and inactive in a given variant conditional code blocks) could be treated like comments for parsing and formatting purposes.
What is needed
Note that the integration with linter and formatter can be done at a later step, but we have to keep in mind that this is the ultimate goal so we have to think about how to make that as easy as possible with the implementation we're doing
And in general, the standalone tool should take raw systemverilog sources as input and ouput preprocessed code (only one branch of the conditional code based on the defines, macros should be resolved etc).
I suggest we start working on this one by one, e.g: pick the first feature you want to work on - I think conditionals are a great candidate - scan the tree for all conditional statements, create the trees as suggested above (or suggest an alternative solution), as the last step, look up all the defines (you can start with defines provided from the commandline and store info from `define XXX
as the next step) and output only the selected branch in the final code.
Hello All,
I am currently working on conditionals feature in the standalone tool PR #1360, I am facing a problem, and would like to know what you think about my solution before implementing it.
Problem:
The verilog::VerilogPreprocess
class has its only interface (public) method ScanStream
which operates on verible::TokenStreamView
that doesn't contain any white-space characters.
Using this method allows the tool to correctly choose the matching branch if a conditional is encountered, however it makes it hard to preserve the original white-space characters to output the SV source code as it was before preprocessing.
This problem will be even harder when ExpandMacro
method is merged into Verible.
Solution:
verilog::VerilogPreprocess::Config
field called bypass_whitespaces
that allows verilog::VerilogPreprocess
to operate on token stream-views containing white-space characters by adding this behavior to methods that handle each token (not so smart? but works).Do you have any suggestions? Also, is there any existing utility that can help in this situation?
Thanks
Hello All,
I am currently working on conditionals feature in the standalone tool PR #1360, I am facing a problem, and would like to know what you think about my solution before implementing it.
Problem: The
verilog::VerilogPreprocess
class has its only interface (public) methodScanStream
which operates onverible::TokenStreamView
that doesn't contain any white-space characters.Using this method allows the tool to correctly choose the matching branch if a conditional is encountered, however it makes it hard to preserve the original white-space characters to output the SV source code as it was before preprocessing.
This problem will be even harder when
ExpandMacro
method is merged into Verible.Solution:
- Adding a
verilog::VerilogPreprocess::Config
field calledbypass_whitespaces
that allowsverilog::VerilogPreprocess
to operate on token stream-views containing white-space characters by adding this behavior to methods that handle each token (not so smart? but works).- Having the original SV code token stream-view (with white-spaces), and the preprocessed token steram-view, a two pointers algorithm can be implemented to solve this situation (smarter, but would be a problem in case of expanded macros with arguments).
Do you have any suggestions? Also, is there any existing utility that can help in this situation?
Thanks
Hello!
If you need to preserve whitespaces between tokens, but are only given the non-whitespaces tokens, you can always "restore" the gaps by constructing a string_view
that spans the gaps between tokens with ranges like prev.end(), next.begin()
.
And in general, the standalone tool should take raw systemverilog sources as input and ouput preprocessed code (only one branch of the conditional code based on the defines, macros should be resolved etc).
I suggest we start working on this one by one, e.g: pick the first feature you want to work on - I think conditionals are a great candidate - scan the tree for all conditional statements, create the trees as suggested above (or suggest an alternative solution), as the last step, look up all the defines (you can start with defines provided from the commandline and store info from
`define XXX
as the next step) and output only the selected branch in the final code.
I added a new command for the tool generate-variants
which builds the control flow tree, and traverse it simply by dfs to generate and output all possible variants.
Example:
`define BAZ 1
`define MAZ 1
`ifdef BAZ
module foo(); endmodule
`ifdef MAZ
module moo(); endmodule
`else
module not_moo(); endmodule
`endif
module foo2(); endmodule
`else
module not_foo(); endmodule
`endif
module m(); endmodule
These variants are ready to be passed to the parser without any needed modification.
Thanks,
Hello All,
I am currently working on conditionals feature in the standalone tool PR #1360, I am facing a problem, and would like to know what you think about my solution before implementing it.
Problem: The
verilog::VerilogPreprocess
class has its only interface (public) methodScanStream
which operates onverible::TokenStreamView
that doesn't contain any white-space characters.Using this method allows the tool to correctly choose the matching branch if a conditional is encountered, however it makes it hard to preserve the original white-space characters to output the SV source code as it was before preprocessing.
This problem will be even harder when
ExpandMacro
method is merged into Verible.Solution:
Adding a
verilog::VerilogPreprocess::Config
field calledbypass_whitespaces
that allowsverilog::VerilogPreprocess
to operate on token stream-views containing white-space characters by adding this behavior to methods that handle each token (not so smart? but works).Having the original SV code token stream-view (with white-spaces), and the preprocessed token steram-view, a two pointers algorithm can be implemented to solve this situation (smarter, but would be a problem in case of expanded macros with arguments).
Do you have any suggestions? Also, is there any existing utility that can help in this situation?
Thanks
Hello!
If you need to preserve whitespaces between tokens, but are only given the non-whitespaces tokens, you can always "restore" the gaps by constructing a
string_view
that spans the gaps between tokens with ranges likeprev.end(), next.begin()
.
Sorry, I didn't understand your idea, can you elaborate more?
Thanks
Hello All,
I am currently working on conditionals feature in the standalone tool PR #1360, I am facing a problem, and would like to know what you think about my solution before implementing it.
Problem: The
verilog::VerilogPreprocess
class has its only interface (public) methodScanStream
which operates onverible::TokenStreamView
that doesn't contain any white-space characters.Using this method allows the tool to correctly choose the matching branch if a conditional is encountered, however it makes it hard to preserve the original white-space characters to output the SV source code as it was before preprocessing.
Hello! If you need to preserve whitespaces between tokens, but are only given the non-whitespaces tokens, you can always "restore" the gaps by constructing a
string_view
that spans the gaps between tokens with ranges likeprev.end(), next.begin()
.Sorry, I didn't understand your idea, can you elaborate more?
Between any two tokens A, B that belong to the same memory buffer (which is true in Verible), you can construct a string_view
of the characters between them, using one of these constructors:
https://en.cppreference.com/w/cpp/string/basic_string_view/basic_string_view https://github.com/abseil/abseil-cpp/blob/master/absl/strings/string_view.h
It appears the two-iterator range constructor isn't available until C++20, so something like this will work:
// assuming A is before B in the same memory buffer
absl::string_view text_between_A_and_B(A.end(), std::distance(A.end(), B.begin()));
The range between tokens can contain preprocessor directives and inactive code, so it can't be used in all cases. Example:
····module foo(); endmodule/*1*/
`ifdef MAZ
····module moo(); endmodule
`endif
····/*2*/module foo2(); endmodule
For a variant where MAZ
is not defined, /*1*/
and /*2*/
are consecutive tokens separated by 2 line breaks and 4 spaces, but the text between them is not just a whitespace. To make things worse, part of the whitespace is attached (in the text) to /*1*/
, and the other part to /2/.
- Adding a
verilog::VerilogPreprocess::Config
field calledbypass_whitespaces
that allowsverilog::VerilogPreprocess
to operate on token stream-views containing white-space characters by adding this behavior to methods that handle each token (not so smart? but works).
Do you mean you want to add a new token type (token_enum
value in TokenInfo
) that represents whitespace? Or just include whitespace as a part of existing tokens?
If the former, why does it need a config? It sounds good, whitespace tokens could probably be handled similarly to comments.
If the latter, it could work if handled in other places (might be a lot of work though).
- Having the original SV code token stream-view (with white-spaces), and the preprocessed token steram-view, a two pointers algorithm can be implemented to solve this situation (smarter, but would be a problem in case of expanded macros with arguments).
I'm not sure I understand how would it work. Could you elaborate?
Hello @mglb,
Thanks for your feedback!
Do you mean you want to add a new token type (
token_enum
value inTokenInfo
) that represents whitespace? Or just include whitespace as a part of existing tokens? If the former, why does it need a config? It sounds good, whitespace tokens could probably be handled similarly to comments. If the latter, it could work if handled in other places (might be a lot of work though).
I meant the latter, and yes It's a lot of work. But, I am still figuring out how to do your first suggestion.
- Having the original SV code token stream-view (with white-spaces), and the preprocessed token steram-view, a two pointers algorithm can be implemented to solve this situation (smarter, but would be a problem in case of expanded macros with arguments).
I'm not sure I understand how would it work. Could you elaborate?
I meant to have a loop with 2 pointers (initially pointing to the beginning of both streams), let's assume:
ptr_ws
is pointing to the stream with white-spaces, while ptr_pp
is pointing to the preprocessed stream.
The idea was to:
ptr_ws
is pointing to a white-space then we output this white-space.This is okay just when the non white-space tokens remain unchanged in both streams, but in case of an expanded macros, It might be a challenge.
Hi @fangism, @mglb,
I added a comment in PR #1360 that describes my approach to the white-space problem we discussed.
I would appreciate if you can take a look, Thanks!
I added a comment in PR #1360 that describes my approach to the white-space problem we discussed.
I would appreciate if you can take a look, Thanks!
I'll look at it closely tomorrow
BTW, for finding some interesting examples and 'good to evil' uses of the verilog preprocessor, always good to have a look at Wilson's paper https://veripool.org/papers/Preproc_Good_Evil_SNUGBos10_paper.pdf It might give some ideas for test-cases. (Wilson is the person behind Verilator).
I read it when I was getting to know the project, didn't benefit from it a lot then. I will take another look now.
Standalone Preprocessor Tool Design
Introduction
The goal of this thread is to discuss and put the outlines of the standalone preprocessor tool.
The preprocessor tool should fully support the directives described in 2018 SV-LRM, and provide APIs that can be easily used by other Verible tools.
This proposal is a continuation on the already built pseudo-preprocessing tool.
Flags description
Using the Abseil flags library allows an easy access to flags and their arguments.
Positional arguments
Those mainly are the SV files to preprocess, defines and search-directory paths for includes. Any of the following positional argument can be passed one or more times.
`define foo text
`include "file_name"
(IN-PROGRESS)