MikePopoloski / slang

SystemVerilog compiler and language services
MIT License
554 stars 120 forks source link

Increased compat vcs for constraint dist code #767

Closed svenka3 closed 1 year ago

svenka3 commented 1 year ago

Is your feature request related to a problem? Please describe.

This is not a bug in slang, instead it is LRM non-compliant code below:

class bug_c;
  rand int val;
  constraint cst_sum { 
    val dist {1 : = 10, 4 := 20};
  }
endclass : bug_c

Can you please allow it under --compat vcs as that compiler allows this code.

Also - hope this will be allowed in pyslang

Thanks

MikePopoloski commented 1 year ago

To be clear, the ask here is for dist weights to be expressible with separate : and = operators instead of the correct combined := operator?

svenka3 commented 1 year ago

Yes, same for : / operator (2 forms of dist)

Thanks

BTW - I attempted extending ColonEquals with ColonSpaceEquals. Below are files touched

Lexer.cpp LexerFacts.cpp Parser_expressions.cpp Parser_members.cpp Parser_statements.cpp tokenkinds.txt

Is this the correct approach to a potential fix/PR? Is the parser hand-written?

Thanks

svenka3 commented 1 year ago

Quick update - I am able to parse with ColonSpaceEquals as an additional token. Not sure how this gets exposed to Pyslang though - my ultimate goal is to flag such misuse from PySlint - built on pyslang.

MikePopoloski commented 1 year ago

No, you shouldn't make a new token type. Colon and equals are already their own tokens, so you can't special case the whitespace in between. Instead the parser should be modified to allow parsing a dist weigh either with the single token as it does currently or by looking for the two tokens as separate operators.

svenka3 commented 1 year ago

OK, below is what I now have in Lexer.cpp:

case ':':
            char c1;
            switch (peek()) {
                case '=':
                    advance();
                    return create(TokenKind::ColonEquals);
                case '/':
                    switch (peek(1)) {
                        case '/':
                        case '*':
                            return create(TokenKind::Colon);
                    }
                    advance();
                    return create(TokenKind::ColonSlash);
                case ':':
                    advance();
                    return create(TokenKind::DoubleColon);
                case ' ':
                    advance();
                    c1 = peek();
                    if (c1 == '=') {
                      advance();
                      return create(TokenKind::ColonEquals);
                    }

            }

I am able to parse ": =" and ":=" with this change. Is this OK? Do you have a PR in mind for this please? Also my goal will be to get this to pyslang, hope that's automatic in this case (assuming corresponding .a/.so gets updated).

Thanks

MikePopoloski commented 1 year ago

No, you shouldn't be handling this in the lexer. Your change will only handle a single space but that doesn't match the behavior of VCS, which allows arbitrary whitespace, comments, etc to appear between the two tokens. Instead this should be handled in the parser as I described earlier. I can take a look at handling this but not right away.

Also to answer your second question, any changes made to the slang library will be reflected in pyslang, since that's just a light binding layer around the C++ library.

MikePopoloski commented 1 year ago

Added support in e485002607a11274e4289b2256b72c7ee813b26c