Open anshumanmohan opened 1 year ago
Wonderful; thank you for summarizing this so clearly! Indeed, both of these would be great advances that would "modernize" our hardware generation.
One small operational note: the eDSL adoption can happen incrementally, if it's a lot of work. That is, it's not necessary to flip everything over all at once; it suffices to convert individual pieces to use the builder
library one at a time since it interoperates with the "plain" AST library pretty smoothly.
Thanks Adrian! Dropping the "triage required" tag. This issue remains up for grabs.
This looks good to me! I agree that switching over the builder library isn't a top priority, but it would certainly be nice to at some point.
I don't necessarily think it is a bad thing to use odgi
in our hardware generation if it turns out that it's faster to transform a .gfa
graph to an .og
graph and then use odgi
to get its dimensions, rather than just obtaining the dimensions from the .gfa
file directly, but since .gfa
files seem to be sort of a "standard representation" for pangenomic graphs, it would be good to at least transition exine
to taking in .gfa
files as input rather than .og
files, which it currently expects. I also think it would be a win just to avoid making pollen users install odgi
.
Indeed. Sticking with .gfa
files is the right thing to do here because (a) they are the "source of truth," (b) odgi is a difficult-to-install dependency (to put it lightly), and (c) performance at hardware-generation time is not very important.
Currently, calyx_depth.py generates Calyx code
parse_data
which in turn uses the Python bindings).I propose to change both of these.
These relatively minor gains aside, the true win is that, after this, we will truly be using odgi only as a source of inspiration and oracle-testing, not as part of our workflow.
We currently do something like:
graph.gfa --odgi build--> graph.og --parse_data.py--> graph.data
calyx_depth.py (with one call to parse_data.py to learn the graph's dimensions) --> calyx_depth.futil
calyx_depth.futil
tangoes withgraph.data
in the usual Calyx way.My proposal is something more like:
graph.gfa --pollen_data_gen--> graph.data
calyx_depth.py (with one call to mygfa to learn the graph's dimensions, and with calls to the Calyx eDSL for lightening) --> calyx_depth.futil
calyx_depth.futil
tangoes withgraph.data
in the usual Calyx way.The first piece of this is already there: I already generate the same
graph.data
usingpollen_data_gen
as one would have using the usual route. The third piece should work if the first two work.The second piece decomposes into two unrelated parts: