B-Lang-org / bsc

Bluespec Compiler (BSC)

Build system integration is complex #44

Open cbiffle opened 4 years ago

cbiffle commented 4 years ago

Currently, bsc emits Verilog files that are named after modules defined within a source file, rather than the name of the source file. This complicates its use in build systems, which need to be able to precisely determine which artifacts are from which inputs in order to support correct incremental/parallel builds.

For build systems that support dynamic dependencies (e.g. make, or things built atop ninja), there are two features that would help with this.

  1. The ability to generate lists of inputs to compiling a module, a la gcc -MF.
  2. The ability to generate lists of outputs from compiling a module.

Ideally, these could be done without compiling the module, but if that's hard, doing both would be okay.
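
For readers unfamiliar with gcc -MF, the requested input list is just a Make-style rule of the form "target: input1 input2 ...", which make and ninja can both consume as a depfile. A minimal sketch of that format follows; the function name and the example file names (build/Foo.bo, src/Foo.bsv, build/Bar.bo) are hypothetical, and nothing here is produced by bsc today.

```python
def format_depfile(target, inputs):
    """Render a Make-style dependency rule: 'target: input1 input2 ...'."""

    def escape(path):
        # Make splits on spaces, so spaces inside a path need a backslash.
        return path.replace(" ", "\\ ")

    lines = [escape(target) + ":"]
    for path in inputs:
        lines.append("  " + escape(path))
    # One input per line, joined with backslash-newline continuations.
    return " \\\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(format_depfile("build/Foo.bo", ["src/Foo.bsv", "build/Bar.bo"]), end="")
```

A list of outputs (item 2) could use the same shape with the roles reversed, or simply one path per line.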

(Perhaps this can be done today with bluetcl -- the bluetcl docs are ungoogleable if they exist, and there's no Docs link on the Bluespec site. Edit: managed to find a copy of an old user guide posted on the UCSB website which includes docs on Bluetcl toward the end -- I see no facilities that seem relevant to this, but I may not be thinking creatively enough.)

For build systems that mandate static dependencies -- such as Bazel -- an option to control naming of Verilog outputs based on a pattern, or even concatenate all the outputs from Foo.bs into Foo.v, would help. I personally am not using such build systems, so this is lower on my priority list.

Other suggestions welcome!

@arjenroodselaar @bpfoley

quark17 commented 4 years ago

Note that BSC has a -u option for doing its own dependency analysis and recompiling as necessary. If this analysis is not exposed in a convenient way, we can certainly consider adding it.

There are two scripts that you might want to look at:

In src/bluetcl/ there is makedepend.pl which can generate a depend.mk file as was used to generate the dependency files found in src/Libraries/Base{1,2}/. It's an old script, so may not be looking for all dependencies that -u does (which, for example, looks for whether output C and Verilog files need to be generated).

In util/bluetcl-scripts/ there is listVlogFiles.tcl which is a script that we have used to generate the list of Verilog files for a design, to feed into downstream tools (for example, to automate synthesis scripts).

I will have to look back at what bluetcl provides, but if it doesn't provide easy access to lists of inputs and outputs, that would certainly be reasonable to add, to bluetcl or bsc (since the Haskell code already exists for computing this).

cbiffle commented 4 years ago

Can bsc -u run compiles concurrently? (It looks like it would be put asunder by the same _t_o_p.c generation behavior I'm fixing in another issue.) In my tests here, concurrent .bo generation can cut the build time of the Mergesort examples from the classic Bluespec training in half, so I'm pushing toward that.

(Edit: The "in half" is relative to a serial separate compile by invoking bsc. Parallel compile using Ninja is still faster than bsc -u when you're generating a single top-level module, but only by a bit.)

Thanks for the other pointers, I will investigate!

quark17 commented 4 years ago

The link step for Bluesim (bsc -sim -e <topmod>) invokes a C++ compiler multiple times (on cxx files for each module plus some top-level cxx files) and that can be executed in parallel with a flag (-parallel-sim-link). Aside from that, BSC doesn't do anything concurrently. (That's certainly something worth investigating.)

cbiffle commented 4 years ago

Interesting. Is there any way to get bsc to emit the commands rather than running them, so that they could be handled by the build system?

quark17 commented 4 years ago

You can run bsc with -v and it will print the commands it runs, the files that it reads in, the search paths that it is looking in, etc. But there's no "dry run" flag or anything like that.

I do see, in src/comp/bluetcl.hs that there is a Bluetcl::depend command and one of the subcommands is recomp, which will tell you which source files need to be recompiled.

The place where this is all computed is in src/comp/Depend.hs, and the Bluetcl::depend command exposes some of it as subcommands: chkDeps as recomp, genFileDepend as file, and genDepend as make. It looks like chkDeps is the only exported function that really computes all the input files and generated files, but it doesn't return that info, only a list of the source packages that need to be recompiled. The other functions return some dependencies and none of the outputs, it looks like. Anyway, I'm open to improvements in all this.

I suspect that you could use the current commands by running depend recomp to get a list of sources to compile and depend file to get the ordering dependencies (to know which have to be serialized).
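
As a rough sketch of how those two results might be combined: given a map of package imports and a set of packages needing recompilation, a topological sort yields batches in which every package can compile concurrently. The input shapes below are assumptions for illustration (the actual output format of depend recomp and depend file is not shown here), and the sketch assumes the stale set already includes every package that imports a stale package.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def parallel_batches(imports, stale):
    """Group stale packages into batches that can compile concurrently.

    imports: dict mapping package name -> set of packages it imports
             (hypothetical shape, e.g. parsed from a dependency listing)
    stale:   set of package names that need recompiling
    """
    # Only ordering among stale packages matters; up-to-date imports
    # already have their compiled outputs on disk.
    graph = {
        pkg: {dep for dep in imports.get(pkg, ()) if dep in stale}
        for pkg in stale
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular imports
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    deps = {"Top": {"Fifo", "Util"}, "Fifo": {"Util"}, "Util": set()}
    for batch in parallel_batches(deps, {"Top", "Fifo", "Util"}):
        print(batch)
```

Each batch must finish before the next one starts; within a batch, the compiles are independent of one another.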

jameyhicks commented 4 years ago

Connectal has a script that generates a Makefile from a collection of BSV files, to enable parallel make. https://github.com/cambridgehackers/connectal/blob/master/scripts/bsvdepend.py

I think something similar based on src/comp/Depend.hs would be very useful: generating a Makefile or ninja file containing the dependencies to enable parallel builds.
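
To make the idea concrete, here is a minimal sketch of what such a generator could emit for ninja. It is not how bsvdepend.py or Depend.hs work; the import map is a hypothetical input, the bsc command line is deliberately bare (a real build would add search-path and output-directory flags), and it assumes each package Foo lives in Foo.bsv and compiles to Foo.bo alongside it.

```python
def ninja_rules(imports, src_ext=".bsv"):
    """Emit ninja text that compiles each package to a .bo file.

    imports: dict mapping package name -> set of imported package names
             (hypothetical input; no dependency scanning is done here)
    """
    out = [
        "rule bsc",
        # Bare command for illustration; real builds need more flags.
        "  command = bsc $in",
        "  description = BSC $in",
        "",
    ]
    for pkg in sorted(imports):
        # Imported packages become implicit dependencies (after '|'):
        # they order the build but are not passed on the command line.
        deps = " ".join(f"{dep}.bo" for dep in sorted(imports[pkg]))
        implicit = f" | {deps}" if deps else ""
        out.append(f"build {pkg}.bo: bsc {pkg}{src_ext}{implicit}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    print(ninja_rules({"Top": {"Fifo"}, "Fifo": set()}), end="")
```

With the .bo files listed as implicit dependencies, ninja can schedule independent packages in parallel on its own.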

thoughtpolice commented 4 years ago

As another note, my utility yosys-bsv is a plugin for Yosys that allows you to read Bluespec designs. It's been improved a lot since the FOSS release, and you can probably use it in combination with Yosys to coax out some of this information. Broadly, if you install the plugin and use yosys, you can do something like:

yosys -p "plugin -i $PATH/to/bluespec.so; read_bluespec module.bsv; ..."

This transparently invokes the Bluespec compiler in a temporary directory, emits Verilog, then reads all the Verilog into the current synthesis design in a single step. (The compilation step is recursive and because a fresh tempdir is used every time, it can be expensive to do this.) If multiple (* synthesize *) annotations are used in the design then they all get ingested. If the module uses primitives like SizedFIFO that actually require files from lib/Verilog, it will find those files and read them, too. I have a project that does something like:

#! /usr/bin/env bash
yosys -v3 -l synth.log -p "plugin -i $YOSYS_BLUESPEC_DIR/bluespec.so; script build.ys"

The Yosys script (which is simply a linear list of Yosys commands, an alternative to Tcl/Python APIs) is:

read_bluespec -reset pos keccak.bsv
synth_ecp5 -abc2 -retime -top keccak
write_verilog -noattr keccak_synth.v

You could also use synth here if you wanted generic synthesis. You can also do things like module inlining, port renaming (if synthesize attrs aren't enough somehow), etc. I've found Yosys to be an invaluable tool for solving issues like this in a vendor/device agnostic way in other HDLs like Clash as well, though I haven't solved this issue in particular.

So once you have that, I'm guessing you have a few options. There are things like the ls command, though I don't know how to get its output in JSON (maybe a fix to Yosys would do it). Alternatively, you could use the Python or Tcl APIs to just look at the Design netlist, or write your own plugin.

These are definitely some tractable problems and workarounds for now, though first-class support would be excellent.