chipsalliance / firrtl-spec

The specification for the FIRRTL language
38 stars 27 forks source link

Make Commas Mandatory or Remove Them #188

Closed seldridge closed 6 months ago

seldridge commented 6 months ago

While never reflected in the grammar, implementations of FIRRTL compilers have always treated commas as whitespace. This historically is because the original FIRRTL compiler implementation (predating the SFC) was written in Stanza and Stanza took the approach of treating commas as whitespace. This was inherited in the SFC grammar description of FIRRTL and was carried forward into the MFC FIRRTL parser.

We should decide what to do here. Likely either make the commas explicit (as the existing grammar indicates) or remove the commas. I'm not aware of other languages which treat commas as whitespace, though there is precedent for languages which do not use commas, typically for function declaration/application (Haskell, ML variants).

I'd be in favor of making the commas explicit. This would be, in practice, entirely backwards compatible because: (1) Chisel has always emitted the commas, (2) the FIRRTL specification has always used commas in examples, and (3) the FIRRTL grammar either ambiguously suggested commas (v0.2.0) or explicitly required them (every version including and since v0.3.0). The only actual way that an implementer of a FIRRTL compiler would likely know that commas are whitespace was if they examined the SFC antlr grammar (which is technically the reason why the MFC parser picked up this quirk).

@mmaloney-sf, @dtzSiFive, or @azidar: do you have any opinions here?

dtzSiFive commented 6 months ago

I think it's safe and very reasonable to make them mandatory in our implementation, policy-wise, for the reasons you outline.

As a tiny note, treating commas as whitespace means fewer tokens and fewer/simpler checking in the current parser approach -- so if just adding comma tokens to the stream this might not be entirely "free" to enforce.

Just want to pin down/brainstorm (please add to this) why this is would be a good thing to make mandatory (reject inputs we would currently accept):

mmaloney-sf commented 6 months ago

I understand that this was a quirk of the original FIRRTL implementation, has this ever been a part of the spec? As far as I can tell, the grammar in the spec has always made commas mandatory.

Regardless, it is absurd that commas are treated as whitespace. Let's tighten up the implementation and make all commas mandatory.

jackkoenig commented 6 months ago

I understand that this was a quirk of the original FIRRTL implementation, has this ever been a part of the spec?

Section Notes on Syntax says:

Commas are treated as whitespace, and may be used by the user for clarity if desired.

Also, in the Grammar, there are no comma tokens. The commas in use there are purely separators, they would need to be in double quotes to be actual commas tokens in the grammar.

mmaloney-sf commented 6 months ago

Also, in the Grammar, there are no comma tokens. The commas in use there are purely separators, they would need to be in double quotes to be actual commas tokens in the grammar.

There are plenty of comma tokens in the grammar. Eg: primop_1expr1int = primop_1exrp1int_keyword , "(", expr , "," , int , ")" ;

Am I overlooking something?

Commas are treated as whitespace, and may be used by the user for clarity if desired.

Good catch on this. Assuming I didn't make a mistake above, this is in contradiction with the grammar.

jackkoenig commented 6 months ago

There are plenty of comma tokens in the grammar. Eg: primop_1expr1int = primop_1exrp1int_keyword , "(", expr , "," , int , ")" ;

Am I overlooking something?

~There's also a semicolon at the end. The comma and semicolon are part of the Grammar specification language's grammar, not the grammar being specified (i.e. FIRRTL's grammar). They need to be in double quotes to be actual tokens in the FIRRTL grammar.~

Lol right there is 1 example where it does have double quotes , "," ,. So yeah I guess it is contradictory now, but it wasn't always. Originally no commas were supposed to matter but that has evolved a bit. In any case it makes sense to formally change this.

seldridge commented 6 months ago

They've been in the grammar since I converted the spec to Markdown. This changed the original v0.2.0 grammar, though it's arguable that the original grammar was more of a guideline for writing a grammar than trying to prescribe exactly what the grammar was.

mmaloney-sf commented 6 months ago

@seldridge with the backstory.

+1 on mandatory commas

dtzSiFive commented 6 months ago

Sounds good, thanks @seldridge !