Open RyanGlScott opened 10 months ago
As noted on issue 663, there are likely to be many bugs in the pretty-printing, but thank you again for reporting.
This is a problem caused by the fact that <= has two meanings in BSV, depending on whether it's in an action or expression context. The parentheses indicate an expression context, and one way to fix this is to remove the parentheses, as you mention. But note that there is another way to fix it. An expression can be of type Action, and there is a syntax for creating action expressions -- action..endaction -- so this can be inserted inside the parentheses:
(action v[idx] <= 0; endaction);
But it would be preferable to print without the unnecessary double wrappers.
This might suggest that the pretty-printing routines need to keep track of the context: the printing of Cwrite would add action..endaction if the context is expression, not action; and the printing of CSExpr _ e would specify action context when printing e. However, my suspicion is that action..endaction in the original BSV would show up as an explicit structure in the parsed CSyntax, and so the pretty-printer could just naively print what it sees, because the action..endaction is in the code. And I confirm that this is true, by looking at the output of -dparsed with action..endaction inside the parentheses. On the other hand, the parentheses are not an explicit construct in the parsed CSyntax.
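To make the context-tracking idea concrete, here is a minimal Python sketch of a toy printer. It is only an illustration under stated assumptions: the classes Write and ExprStmt merely stand in for Cwrite and CSExpr, the ACTION/EXPR flag is a hypothetical way to carry the context, and none of this is bsc's actual code (which is written in Haskell).

```python
from dataclasses import dataclass

# Toy AST. These classes are hypothetical stand-ins, not bsc's CSyntax types.
@dataclass
class Var:
    name: str

@dataclass
class Write:          # register write "lhs <= rhs" (stands in for Cwrite)
    lhs: str
    rhs: str

@dataclass
class ExprStmt:       # an expression used as a statement (stands in for CSExpr)
    expr: object

ACTION, EXPR = "action", "expr"

def pp_expr(e, ctx):
    """Print an expression, given the context it appears in."""
    if isinstance(e, Write):
        text = f"{e.lhs} <= {e.rhs}"
        # In an expression context, <= would be read as a comparison, so the
        # write is wrapped as an explicit action expression.
        return text if ctx == ACTION else f"action {text}; endaction"
    if isinstance(e, Var):
        return e.name
    raise TypeError(f"unsupported node: {e!r}")

def pp_stmt(s):
    """A statement prints its expression in action context."""
    return pp_expr(s.expr, ACTION) + ";"

print(pp_stmt(ExprStmt(Write("v[idx]", "0"))))          # v[idx] <= 0;
print("(" + pp_expr(Write("v[idx]", "0"), EXPR) + ")")  # (action v[idx] <= 0; endaction)
```

The point of the sketch is only that the wrapper is chosen from the context rather than from a numeric precedence, so a write in statement position never picks up a wrapper at all.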
In any case, it seems like an error for the pretty-printing of Cwrite to insert parentheses for precedence; it should print action..endaction. So maybe the simplest fix is to change that from using pparen (which prints (..) on True) to some other function that prints action..endaction on True. This would still have an unnecessary level of wrapper, but it would at least parse correctly. Although when I tested the following, the parser didn't like the semicolon after the action block:
rule hello_world (True);
action v[idx] <= 0; endaction;
$display("hello");
endrule: hello_world
If I remove the semicolon, it parses, oddly!
rule hello_world (True);
action v[idx] <= 0; endaction
$display("hello");
endrule: hello_world
So maybe instead of pparen, it needs to print (action..endaction), with the parens, at least until it's determined that the parser is wrong and is fixed.
Eliminating the extra level of wrapper might still be good to do, and removing the p+2 might do it (I haven't looked closely into it though). As for this:
It's unclear to me if this would have ramifications when pretty-printing other forms of CSExprs, however.
One thing to do is to check whether it makes a difference to the testsuite, specifically the bsc.syntax pretty-printer tests, which would be explicitly testing the output. (The rest of the testsuite may not be checking the exact text of messages, so you might not see a change in testsuite results even if there's a user-visible difference.) I would guess that there are very few pretty-printer tests, though, and that we really need to add more.
One way to check would be to run a script on all .bsv files in the testsuite, to check whether the -dparsed output compiles (and maybe whether the -dparsed output the second time is identical, for lack of a good way to check if the output is the same as the original). Note that you can use -KILLparsed alongside -dparsed, to make BSC exit after dumping that stage.
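A rough sketch of such a script, in Python, might look like the following. The only bsc flags it uses are the two mentioned in this thread (-dparsed=FILE and -KILLparsed). Everything else is an assumption: that bsc is on the PATH, that it exits with a nonzero status when parsing fails, that no extra search-path flags are needed, and that keeping the original file name in a separate output directory is enough for the dumped file to be re-parsed. The function names and the pass1/pass2 layout are made up for illustration.

```python
import subprocess
from pathlib import Path

def dump_command(src, out):
    """Build the bsc invocation that dumps the parsed form of src into out."""
    # -dparsed=FILE and -KILLparsed are the flags mentioned in this thread.
    return ["bsc", f"-dparsed={out}", "-KILLparsed", str(src)]

def dump_parsed(src, out):
    """Run bsc once; assume (unverified) that exit status 0 means it parsed."""
    out.parent.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(dump_command(src, out), capture_output=True, text=True)
    return result.returncode == 0

def check_file(root, src, workdir):
    """Classify one .bsv file by round-tripping it through -dparsed twice."""
    rel = src.relative_to(root)
    first = workdir / "pass1" / rel    # dump of the original source
    second = workdir / "pass2" / rel   # dump of the first dump
    if not dump_parsed(src, first):
        return "original-rejected"
    if not dump_parsed(first, second):
        return "dump-rejected"         # the pretty-printed output does not parse
    same = first.read_text() == second.read_text()
    return "ok" if same else "unstable"

def main(root, workdir):
    """Print one status line per .bsv file found under root."""
    for src in sorted(root.rglob("*.bsv")):
        print(f"{check_file(root, src, workdir)}\t{src}")
```

Calling main(Path("testsuite"), Path("/tmp/roundtrip")) would then list every file whose dump is rejected or changes on the second pass; those are the candidates worth turning into automatic tests.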
I do have access to the commit logs for BSC before going open source, so I'm able to see when those two lines for CSExpr were added, in case that sheds any light. These lines date back to 2004, when the BSV language was being created. There was a "demo" (as it was called in the logs and emails of the time) that was being developed, and the BSV parser and pretty-printer were being adjusted based on the source code of that demo. I believe that "demo" was Pong, which existed in Classic and was being translated to BSV (as BSV was being created, to support that translation). In April 2004, the one line for printing CSExpr (with p+1) was split into the two that we see now (with a line for "case" with no change to p and everything else with p+2 -- I have no idea why that increased, or how an increment of more than 1 even makes a difference). The commit seems to be fixing up the handling of case statements in various places; the pprint of Ccase adds a call to pparen around it (to support case expressions, not just case statements), and presumably the special arm for CSExpr (Ccase) is to prevent the unnecessary extra wrapper parens that would result. Not just unnecessary, actually: it would change the meaning from a case statement to a case expression. This maybe also argues again for a pretty printer that keeps track of the context (statement/action vs expr), so it knows that no parens are needed for Ccase in a statement context.
It also suggests another fix: In the same way that Ccase is specially handled in CSExpr, you could add a special arm for Cwrite that also knows not to call it with increased parens.
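As a sketch of what that special arm amounts to, here is a toy Python model. The dispatch shape (one arm at p, everything else at p+2) follows the description above, but the node classes, the BinOp printer, and the use of pparen on p > 0 for every node are simplifying assumptions, not a transcription of CVPrint.hs.

```python
from dataclasses import dataclass

# Hypothetical toy nodes standing in for Cwrite, Ccase and "anything else".
@dataclass
class Write:
    lhs: str
    rhs: str

@dataclass
class Case:
    body: str        # pre-rendered text of the case construct

@dataclass
class BinOp:
    lhs: str
    op: str
    rhs: str

def pparen(wrap, text):
    """Wrap text in parentheses when wrap is True."""
    return f"({text})" if wrap else text

def pp_expr(p, e):
    """Toy expression printer: every node is parenthesized when p > 0."""
    if isinstance(e, Write):
        return pparen(p > 0, f"{e.lhs} <= {e.rhs}")
    if isinstance(e, Case):
        return pparen(p > 0, e.body)
    if isinstance(e, BinOp):
        return pparen(p > 0, f"{e.lhs} {e.op} {e.rhs}")
    raise TypeError(f"unsupported node: {e!r}")

def pp_expr_stmt(p, e):
    """Toy CSExpr printer with special arms for Case and Write."""
    if isinstance(e, (Case, Write)):
        return pp_expr(p, e)        # special arms: no precedence bump
    return pp_expr(p + 2, e)        # generic arm, as described above

print(pp_expr_stmt(0, Write("v[idx]", "0")))   # v[idx] <= 0
```

In this model the write is printed bare in statement position, while other expressions keep whatever behaviour the generic arm gave them, so the change stays narrow.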
Although, actually, why is p being passed down -- that's the precedence of the parent construct's context. Whether a do or module or action block needs parentheses around the whole thing should have no effect on the need for parens around individual statements. The sub calls to pprint should pass 0 or 1 and not p and p+1, right? (Unless I'm misunderstanding something -- as clearly I am, if 2 has some kind of meaning.)
Anyway, that's a lot of rambling. Does any of that help? I guess I'm saying that I don't currently recall how all this works, without spending time reading the code, and I don't have time for that. But feel free to spend time on that and submit PRs if you'd like. And tests are important, and we lack them, so feel free to run a script over all the source files in the testsuite as a manual test; and feel free to add any particularly interesting or broken ones as an automatic test in the testsuite. And in particular, the Pong example in BSV may have been the guiding example for some of what you're seeing.
The Pong example is in testsuite/bsc.bsv_examples/pong/. I think the original Classic code is in bsc.bsc_examples/pong/ (note bsc_ instead of bsv_). So if you want to test what effect a change has, you can try it first on the BSV Pong. And then I'd suggest trying it on all the files in bsc.bsv_examples. I think that directory was mostly created around 2004, when BSV syntax was being developed. (Some that were added later, as late as 2013, are: the SHA directories, PAClib, mimo, FloatTest, FP, GlibcRandom. Another early example alongside Pong is the "mesa" example.)
Thanks for the detailed explanation!
For what it's worth, I found a workaround for this issue as a client who is generating code using the bsc API: use CBinOp ... idAssign ... instead of Cwrite to represent register updates. The bug in https://github.com/B-Lang-org/bsc/issues/662#issue-2080558250 specifically affects Cwrite, but CBinOp ... idAssign ... isn't affected. Moreover, there is code elsewhere in the pretty-printer for printing idAssign as <= here, so CBinOp ... idAssign ... works regardless of whether you are printing the AST as Bluespec Haskell or as Bluespec SystemVerilog.
In short, it would be nice to fix this issue, but it's not as urgent for my needs as I originally thought it to be.
Given this BSV code:
Compiling it with bsc -dparsed=FooParsed.bsv Foo.bsv yields:
This is almost the same code, but there is one key difference: the (v[idx] <= 0); statement has surrounding parentheses. This change is significant enough to cause bsc to reject it:
Removing the parentheses causes bsc to accept the code.
The culprit is likely the CSExpr _ e case in this code:
https://github.com/B-Lang-org/bsc/blob/f00d205867eefe09c60e11b4df155bb87041799a/src/comp/CVPrint.hs#L698-L699
This causes the precedence to be p+2 when pretty-printing the expression v[idx] <= 0, which is printed using this code:
https://github.com/B-Lang-org/bsc/blob/f00d205867eefe09c60e11b4df155bb87041799a/src/comp/CVPrint.hs#L572
Because the precedence is greater than 0, this causes the overall expression to be parenthesized by pparen.
What is interesting is that there is a separate CSExpr _ e@(Ccase _ _ _) case that calls pvPrint at precedence p instead of p+2. It's unclear to me why this difference exists, but it is likely that if the CSExpr _ e case changed its precedence to p, then this issue would be resolved. It's unclear to me if this would have ramifications when pretty-printing other forms of CSExprs, however.
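The interaction described above can be reproduced in a few lines of Python. This is only a toy model under assumptions: print_write stands in for the Cwrite printer and print_expr_stmt for the generic CSExpr arm, the bump parameter is a hypothetical knob for the amount added to p, and the real code lives in CVPrint.hs at the links above.

```python
def pparen(wrap, text):
    """Wrap text in parentheses when wrap is True."""
    return f"({text})" if wrap else text

def print_write(p, lhs, rhs):
    # Stands in for the Cwrite printer: parenthesize when precedence > 0.
    return pparen(p > 0, f"{lhs} <= {rhs}")

def print_expr_stmt(p, lhs, rhs, bump=2):
    # Stands in for the CSExpr arm, which prints its expression at p + bump.
    return print_write(p + bump, lhs, rhs) + ";"

print(print_expr_stmt(0, "v[idx]", "0"))          # (v[idx] <= 0);
print(print_expr_stmt(0, "v[idx]", "0", bump=0))  # v[idx] <= 0;
```

With the bump at 2 the statement comes out parenthesized even at top level; with no bump the parentheses disappear. The model says nothing about how other expression forms would be affected.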