Closed RyanGlScott closed 1 year ago
I'd be happy to submit a fix for this issue, but I'm not sure of the best way to add a test case. One way would be to add a test that calls bsv2bsc
on the test case above and ensure that the resulting .bs
file compiles. However, that would require adding some infrastructure to the test suite to support invoking bsv2bsc
. (There is similar infrastructure for invoking the related bsc2bsv
utility, so perhaps that infrastructure could be adapted.)
An alternative approach would be to invoke the bsc
Haskell API directly, but I have no idea if the test suite is set up to support that.
Another hiccup: the existing run_bsc2bsv
function saves its output to a temporary file that is immediately thrown away afterwards, making it impossible to compile the output using compile_pass
in an .exp
file. Perhaps we need something like compile_{bsc2bsv,bsv2bsc}_pass
?
I take back my comment in https://github.com/B-Lang-org/bsc/issues/529#issuecomment-1372915735. It's not that the output is being written to a temporary file, but rather that the test case wasn't running at all, since I wasn't setting the DO_INTERNAL_CHECKS
environment variable. After setting that, I was able to get something working without too much trouble. I'll submit a patch soon.
Along the way, I noticed a couple of things:
run_bsc2bsv
function is currently broken—see #531. I will include a fix for #531 as part of my patch, and then I'll write a run_bsv2bsc
function based on the fixed version of run_bsc2bsv
.DO_INTERNAL_CHECKS
—see #530. I'm not terribly familiar with how the CI is set up, so won't tackle that as part of this patch.Hi Ryan, thank you for reporting this and for all the work on the issue! I need to take time time to look at the PR, but I should be able to accept it soon. Thank you!
I can give some context first, though. In 2003/2004, Bluespec Inc was created to take the BSC technology (that existed previously at Sandburst Corp) and make it into a commercially available tool. The only syntax accepted by BSC was BH, and it was decided that this was too foreign for hardware engineers, so a new syntax was needed. At the same time, SystemVerilog was being developed and heralded, so it was decided to embed rules in that language. A first step towards that was just to create a parser for a SV-like syntax, BSV. Ultimately, no further step was taken.
Anyway, at that time (2004), everything was written in BH and it was necessary to start converting examples into BSV. So a quick tool was cobbled together (bsc2bsv
), which called the BH parser to read in a file and then called the BSV pretty printer to display it. This program was not perfect -- often the output wouldn't compile, and even when it did, it was often verbose and not the style one would use for writing by hand. (Also, it uses an old style of module instantiation that was attempting to be SV-like, but nowadays no one writes instantiations like that and instead just uses the one-line version with left arrow.) Anyway, the program didn't need to be perfect, it just needed to do 90% percent of the work of translating, and a human would do the rest of the work (and clean it up).
The tool was never released to the public (it wasn't in the BSC release tar-files) and it was mostly never used after 2004. The reverse program, bsv2bsc
, was created at the time, just because it was easy to do -- there wasn't a need for it in 2004, and I'm sure it was used even less. Though in 2017-ish, BH was revived for some users who were familiar with Haskell, and training material was converted from BH to BSV, so it's possible that the tool got used then -- but, again, it would only have needed to do 90% of the work, as a human was going to edit the results anyway.
So it was never expected to work fully, has hardly been used, and was never released. So it's no surprise that it has issues.
These tools were included in the open BSC repo, in case they were useful. Because they're not tested and potentially buggy, they aren't put into the installation files by default and aren't in the releases that we build. We could change that decision though, and reclassify it from "internal tools" to first-class tools, and have it tested properly in the testsuite and included in builds and releases.
We probably also would want to rename them to be bh2bsv
and bsv2bh
, using the current naming convention for the languages; or at least bs2bsv
and bsv2bs
using the file extension convention. (I would guess that the 'c' stands for Classic in 'bsc', but it's too easily confused with the name of the compiler.) We may even want to consider deprecating .bs
and moving to .bh
as the standard extension for BH. (I see that Linguist's database of source languages lists a couple that also use .bs
but none that use .bh
, so it'd help distinguish BH source files just by the extension.)
Pretty-printing is of course also used by BSC, in error messages. So "bugs" in the pretty printer output are worth fixing even if we don't care about bsv2bsc
. However, BSC's error messages don't display full programs, so there's lots of parts of the pretty-printer that aren't used. The BSV pretty printer is probably in good shape, because the user-friendly-ness of BSC for BSV users was a priority for many years. BH was ignored, so it's quite possible that the pretty-printing by BSC is not user friendly. BSC does also print BH in some of its debug dumps, for developers, so it's also used there, but developers are less sensitive to the output being 100% perfect as long as they can read it.
Anyway, your suggestions all seem fine and I should be able to accept the PR. Thank you!
One comment would be, if we care about making the pretty-print output readable/friendly, is that it might be better to print using indentation rather than braces. (I'm not suggesting that you need to do that!) However, there are benefits to using braces, in developer dumps etc, so I wouldn't want to print that way everywhere. Just something to note.
Thanks for the additional context! For what it's worth, my actual motivation is that I am developing a Haskell library for representing Bluespec ASTs, and I am using the code in bsc
as a reference for how to pretty-print ASTs. I noticed this bug along the way, and bsv2bsc
just so happens to be the most convenient way that I've found for reproducing the bug. I'd be thrilled if there were a different, less convoluted way to reproduce it.
Ah, very cool! Feel free to submit more fixes or libraries.
There is a way to dump the pretty-printed parsing, using BSC's flags for dumping after each stage:
bsc -dparsed=Bug529parsed.bs Bug529.bs
If you run BSC with -v
, it will print the name and timestamp as it enters each stage, and for any one of those stages you can use -d<stagename>
to dump the output to the screen or -d<stagename>=<filename>
to dump to a file (where %f
, %p
, and %m
can be used as macros for the file, package, or module name). (Running bsc -help-hidden
will show some documentation for these developer flags.)
It turns out there there is already a testsuite a directory, bsc.syntax/bsv05_parse_pretty/
, which exists for testing the BSV pretty printer by using dparsed=
and then attempting to read back in the file. My suggestion would be to start a new directory, bsc.syntax/bh_parse_pretty/
, which does something similar for BH pretty printing, and put examples there (instead of creating a new entry under bsc.bugs
). This would let you test both let
and letseq
, since your source file would be BH. You can name the example Let.bs
or PPrintLet.bs
, and in the .exp
file you can have a comment indicating that it's a test for issue 529.
Anyway, sorry to reward your effort with more work. If it's too much, I can do it, but I'd certainly appreciate your help to do it.
Also, FYI, we have a long-term goal of making BSC (and its codebase) more modular. For example, making the CSyntax and its utilities into their own library, and the parsers into their own libraries, etc. And then BSC is just a top-level project that imports these libraries. So that people can use their them in their own projects, and so that people can experiment with implementing their own parts of the flow (new parsers, new backends, new features in between existing stages, etc).
If that fits with your use -- for example, having the CSyntax be a Cabal library that projects can import -- then feel free to propose or submit changes in that direction.
There is a way to dump the pretty-printed parsing, using BSC's flags for dumping after each stage:
bsc -dparsed=Bug529parsed.bs Bug529.bs
If you run BSC with
-v
, it will print the name and timestamp as it enters each stage, and for any one of those stages you can use-d<stagename>
to dump the output to the screen or-d<stagename>=<filename>
to dump to a file (where%f
,%p
, and%m
can be used as macros for the file, package, or module name). (Runningbsc -help-hidden
will show some documentation for these developer flags.)
Ooh, this is exactly what I wanted earlier, but failed to find in the bsc --help
output. Thanks! (Relatedly, is there a list of possible <stagename>s
somewhere?)
It turns out there there is already a testsuite a directory,
bsc.syntax/bsv05_parse_pretty/
, which exists for testing the BSV pretty printer by usingdparsed=
and then attempting to read back in the file. My suggestion would be to start a new directory,bsc.syntax/bh_parse_pretty/
, which does something similar for BH pretty printing, and put examples there (instead of creating a new entry underbsc.bugs
). This would let you test bothlet
andletseq
, since your source file would be BH. You can name the exampleLet.bs
orPPrintLet.bs
, and in the.exp
file you can have a comment indicating that it's a test for issue 529.
Perfect, that sounds much more direct than the bsv2bsc
-based approach that I was using. As an added bonus, this approach would be testable without the use of DO_INTERNAL_CHECKS
. I'll change #532 to use this new approach.
Anyway, sorry to reward your effort with more work. If it's too much, I can do it, but I'd certainly appreciate your help to do it.
No problem! I'm happy to start over if the end result becomes nicer.
Also, FYI, we have a long-term goal of making BSC (and its codebase) more modular. For example, making the CSyntax and its utilities into their own library, and the parsers into their own libraries, etc. And then BSC is just a top-level project that imports these libraries. So that people can use their them in their own projects, and so that people can experiment with implementing their own parts of the flow (new parsers, new backends, new features in between existing stages, etc).
Ah, good to know! Indeed, I am more-or-less copy-pasting CSyntax
(and everything that it imports) into a separate, .cabal
-based library. I'm only working with the pretty-printer–related parts of CSyntax
at the moment, but I'd like to use its parser at some point as well.
If that fits with your use -- for example, having the CSyntax be a Cabal library that projects can import -- then feel free to propose or submit changes in that direction.
I might be interested in working on that, although I'd need to figure out how to make a .cabal
-based project structure coexist with bsc
's build system. Is there a bsc
issue somewhere that is tracking this effort?
(Relatedly, is there a list of possible
<stagename>s
somewhere?)
If you run bsc -help-hidden
, the stage dump flags are listed towards the end -- just before a listing of all the trace flags (debugging prints inside the stages) that BSC knows about.
Is there a
bsc
issue somewhere that is tracking this effort?
I don't see one specifically for that. Issue #22 maybe -- it's about splitting up the codebase to allow parallel compilation for improved compile time, but it does mention Cabal as an option for that. PR #166 was a draft of support, but BSC has changed since then, so it would need to be updated. Issue #187 and PR #267 are similar, but for Stack instead of Cabal. (In a comment on that PR, I explained my thinking on how both Cabal and Stack building could be supported in the BSC makefile, as alternatives to the default compilation. My opinions might have changed since then; I'd have to stop and think about it.) Issue 37 is just generally about defining the supported OS/toolchains.
I recently noticed a bug in the way
bsc
pretty-printslet
andletseq
statements. The easiest way to observe the bug is by calling thebsv2bsc
utility on this program:Running
bsv2bsc
on this program will generate:The
letseq hw = "Hello, World!";;
bit is not valid Bluespec Haskell. If you try compiling this, it will fail with:Why are there two semicolons after the
letseq
statement? ThePPrint CExpr
instance separates each statement in a module with a semicolon (see here), and thePPrint CDefl
instance also separates each clause in in aletseq
statement with a semicolon (see here).One way to fix the issue is the add intervening braces, like so:
With this patch applied,
bsv2bsc
generates this code instead:This time,
letseq { hw = "Hello, World!"; };
is valid BH.There is a similar bug involving
let
/letseq
expressions (as opposed tolet
/letseq
statements). It can be fixed in a similar manner: