Open udif opened 1 year ago
I could maybe see supporting the old-style Verilog `protect directive, maybe hidden behind a switch or something. The comment-based directives are beyond terrible and really should not exist. Compilers should not be parsing comments for directives, and doing so in slang would not be particularly easy or performant anyway.
In my case, `protect
and `endprotect
are less of an issue because I simply treat them as macros and define them beforehand as: -D protect= -D endprotect=
If I had to deal with `protected
and `unprotected
with encrypted content, this would have been a different story.
Unfortunately, the other blocks contain an already-encrypted // pragma protect
section with base64 garbage.
If I simply let slang ignore the pragmas as comments, I'm afraid it will run into the base64 data and fail there.
If I cannot incorporate this, my best shot would be to fork tools/driver.cpp
and somehow get a hook into Driver::parseAllSources
so I could insert my own filtering code after the readSource()
loop , but before SyntaxTree::fromBuffer()
.
Doing that would require already having a hook in driver/Driver.cpp
(lets say a function pointer that is called if it is non-null).
The advantage of this approach is that I can use the original slang static lib itself without any changes, and just clone my own tools/driver.cpp
.
This way I wouldn't need to subclass Driver
just to replace Driver::parseAllSources
, a 150-line function, just to add a function call between readSource
and SyntaxTree::fromBuffer()
I'm open to modifications to Driver to make it easier to extend for you. Could you also solve this with a pre-pass regex-based script that filters everything between the // pragma protect comments?
I've given this some thought over the last week, after I decided implementing more trivial changes first (#620).
While my original direction of doing search/replace on SourceBuffer
will probably work in my case, in the general case it is not robust:
module top;
wire [17*8-1:0]str = "\
// pragma protect\
";
endmodule
If I blindly search/replace the "// pragma protect" string, I might change unrelated strings.
Ideally I would like to be able to install a filter on the token stream, probably Preprocessor::consume()
, provided I can do it with zero cost in the regular case.
Yeah, I think it will be tough to do as that's a very hot piece of code. I think it's probably ok to add one well-predicted branch to the Preprocessor::next() function, which would check whether there's a filter installed and if so redirect the token out to the filter. Then you need some mechanism to not output the result but instead collect tokens until you see an end marker and mark the whole bunch of them as skipped trivia to attach to the following token. You also need a way to call back into the Preprocessor to invoke the pragma decoding logic since the text might otherwise not even be valid tokens.
I want to look into adding a check at the end of Lexer::scanLineComment()
and optionally call a user function if it is not null.
I'm thinking about trying such a definition in Lexer.h
:
struct SLANG_EXPORT LexerOptions {
...
void (*postLineCommentProcessPtr)(Lexer *lexer);
}
...
class SLANG_EXPORT Lexer {
...
void friend postLineCommentProcess(Lexer *source);
...
}
Then in Lexer.cpp
void Lexer::scanLineComment() {
...
addTrivia(TriviaKind::LineComment);
if (options.postLineCommentProcessPtr)
(*options.postLineCommentProcessPtr)(this);
}
And in my main code I should define postLineCommentProcess(Lexer *source)
somewhere and then do somewhere
postLineCommentProcessPtr = postLineCommentProcess;
. The postLineCommentProcess()
should check if the current Trivia contains // pragma data_block
and if so, consume all Tokens, skip the Base64 section, until encountering // pragma end_data_block
etc.
The code could maybe use std::optional
or std::function
but I need to read about those first.
The original reason I wanted to declare postLineCommentProcess
as a friend and not a class member was because I thought it had to be outside the class so the user could supply his own function.
However, on a 2nd though there is no reason a member function cannot be defined optionally outside the class. As long as you don't call or reference it directly, the class will compile cleanly.
On a 3rd thought I wasn't sure I can do the function pointer trick on a member function, But after giving it some further reading and saw you can easily define a member function pointer:
void (Lexer::*postLineCommentProcessPtr)();
and use C++17 std::invoke(postLineCommentProcessPtr, this)
instead of the function pointer call inside Lexer::scanLineComment()
.
I don't think you need a member function pointer here, the custom method isn't going to live inside Lexer itself. You can make use of function_ref
which is a lightweight wrapper around a callable defined in Function.h. You'll have to be careful about what you do inside the callback though because some of the Lexer methods can suddenly become reentrant if you start calling back into it while in the middle of an initial lexTrivia call.
The reason I wanted a member function is because I need access to the private Lexer methods.
I think that's kind of abusing the encapsulation though. This is a downstream client method with logic outside the class, so if it wants to do things with the Lexer, that functionality should be made public and made robust for that usage.
Raising this issue from the dead...
Coming back to this after I began scanning a different, more ancient codebase, with those dreaded // pragma
comments...
Would you be accepting a patch that would modify this (and it's header file as well):
void Lexer::scanLineComment() {
...
into:
SLANG_VIRTUAL void Lexer::scanLineComment() {
...
And have SLANG_VIRTUAL default to be empty? This would keep zero overhead for normal case, but would give me the power to override it if I HAVE to abuse the lexer and willing to compile with non defaults. I'm even investigating to go as far as this to make it work while keeping all the dirty work on my side.
I think your original idea here was better, to pass a comment scanner callback through the LexerOptions. I think you should code up what logic the callback will actually do to implement the functionality you want, and then we can massage the Lexer interface a bit to make that possible without needing vtables or other hackery of the private internals.
I went over the latest tree and it seems slang now supports the standard
`pragma protect
set of keywords for protected IP. Unfortunately, I'm still encountering IP blocks using 2 legacy mechanisms :This dates back to Verilog-XL days, but was also supported by other tools. After encrypting source code tagged as above it is turned into:
Another protection mechanism is a Cadence-specific variation on IEEE1735 that uses language specific
pragma
comments (// pragma
for verilog and '-- preagma' for VHDL) :This one is very similar to the standard IEEE1735 encryption, but it contains a few more keywords (
end_data_block
,end_key_block
,end_digest_block
). Given the similarity, this might have been an IP donation from which IEEE1735 evolved, but I am only guessing. I think it can be implemented by slightly extending the current encryption support.My question is whether would you be willing to support these.