Qucs / ADMS

ADMS is a code generator for the Verilog-AMS language
GNU General Public License v3.0
94 stars 32 forks source link

Update ADMS parser to handle Accellera standard disciplines header file #103

Closed ngwood closed 2 years ago

ngwood commented 2 years ago

This pull request addresses two outstanding issues preventing ADMS from being able to parse the Accellera standard disciplines header file:

  1. Allow an optional semicolon after a discipline or a nature identifier in a nature and discipline declaration, respectively. This enhanced notation was introduced in v2.3.0 of the Accellera Verilog-AMS Language Reference Manual. This fixes issue #39.
  2. Process escaped identifiers that can be treated as simple identifiers.
felix-salfelder commented 2 years ago

On Mon, Jan 03, 2022 at 02:29:43PM -0800, Neal wrote:

  • Allow semicolon after discipline/nature identifier

There might be neater way to specify an optional semicolon, but nevermind (merged).

  • Treat escaped identifiers as simple identifiers

I don't understand this.

ngwood commented 2 years ago

There may well be a better way, but the change is innocuous and more importantly works.

Regarding escaped identifiers, Verilog-AMS allows for variables with characters other than letters and underscores. Simple identifiers have letters, underscores and dollar signs (but not leading); no one should be using dollar signs though, so I wouldn't suggest we change this. Escaped identifiers all users to create variable names containing all of the other printable ASCII characters! There are four cases to point out; currently ADMS only handles one of them; my pull request handles a second one; the other two are not important:

Simple Identifier without dollar sign:
e.g. "my_value"
ADMS can currently parse this.

Simple identifier, with dollar sign:
e.g. "my$value"
ADMS cannot currently parse this.

Escaped identifier, equivalent to simple identifier without dollar sign:
e.g. "\my_value " == "my_value"
ADMS cannot currently parse this.
My pull request allows ADMS to parse this correctly and unambiguously
by recognising that these are equivalent identifiers. It will then allow use
of the Accellera standard disciplines file.

Escaped identifier with exotic ASCII characters:
e.g. "\my#value "
ADMS cannot currently parse this.

Does this make sense?

felix-salfelder commented 2 years ago

On Tue, Jan 04, 2022 at 02:38:10AM -0800, Neal wrote:

Escaped identifier with exotic ASCII characters: e.g. "\my#value " ADMS cannot currently parse this.


Does this make sense?

Thanks.

I think I get it now. Does "\my$value " in the input (still?) give a sensible error message?

ngwood commented 2 years ago

If I change

discipline \logic ;

to

discipline \log!c ;

I get the same error as before.

[fatal..] disciplines.vams: during lexical analysis syntax error at line 19 -- see '\'

This is because the bit between the opening \ and the closing `, namelylog!c, is invalid inside ADMS still, because it contains something other than letters or underscores. When it is justlogic`, then it is a simple identifier and ADMS can use it as such.

Just to be safe, I also checked that I could still use escaped sequences inside of strings and everything is still working.

This fix essentially says that if ADMS see something that looks like an escaped identifier, e.g., \my_value, then use the my_value as the identifier. If the my_value bit were my#value then it'd still fail as before as it has something other than letters or underscores. I don't see any value in providing any more support for escaped identifiers than this.

ngwood commented 2 years ago

Note that the Verilog-AMS LRM say that we need to treat \my_value and my_value the same. This is exactly what this pull request does. Thus

real my_value;
\my_value =1.0;

is valid syntax.

felix-salfelder commented 2 years ago

On Tue, Jan 04, 2022 at 04:22:25AM -0800, Neal wrote:

If I change

discipline \logic ;

to

discipline \log!c ;

I get the same error as before.

OK merged

thanks.

ngwood commented 2 years ago

I take that back! The following example still doesn't work with ADMS. It still treats \my_value and my_value as different things. It is therefore not compliant with the standard in that regards. It does however let you use \my_value as an identifier. In that sense it isn't entirely innocuous, as someone may expect \my_value and my_value to be the same. I will fix this at some point though.

ngwood commented 2 years ago

I should have used TKSTRIPPEDRETURN, sorry! With this change everything I mentioned before is correct.

diff --git a/admsXml/verilogaLex.l b/admsXml/verilogaLex.l
index 86393c4..443658e 100644
--- a/admsXml/verilogaLex.l
+++ b/admsXml/verilogaLex.l
@@ -274,7 +274,7 @@ INF {TKRETURN(yytext,yyleng); return tk_inf;}
 \|\| {TKRETURN(yytext,yyleng); return tk_or;}
 \^\~ {TKRETURN(yytext,yyleng); return tk_bitwise_equr;}

-\\{ident}" " {TKRETURN(yytext,yyleng); return tk_ident;}
+\\{ident}" " {TKSTRIPPEDRETURN(yytext,yyleng); return tk_ident;}
 \${ident} {TKRETURN(yytext,yyleng); return tk_dollar_ident;}
 {char} {TKSTRIPPEDRETURN(yytext,yyleng); return tk_char;}
 {b8_int} {TKRETURN(yytext,yyleng); return tk_integer;}

Verilog-A:

real \qwerty ; // use of escaped identifier
qwerty = 1.0; // use of simple identifier

Xyce C++:

double qwerty=0.0;
qwerty = 1.0;
ngwood commented 2 years ago

Commit cbb960e seems to solve the issue of the optional semicolon after a nature and discipline identifier in a nature and discipline declaration, respectively. Maybe worth someone else testing?