cucapra / pollen

generating hardware accelerators for pangenomic graph queries
MIT License
24 stars 1 forks source link

Node depth algorithm for a single node #7

Closed susan-garry closed 1 year ago

susan-garry commented 1 year ago

calyx-pangenome/prototype contains a node depth implementation in calyx that can compute the node depth for a single node. This version of the node depth algorithm also computes the node's unique depth, i.e. the number of unique paths which cross the node, and can consider only a subset of paths.

The key files that have been changed are

This implementation generates the hardware based on the maximum number of paths, nodes, and steps per node that we want to support. The defaults are 16, 15, and 15 respectfully, but it can accept user input via the command line (for more information, try running python prototype/calyx_depth_simple.py --help from the root directory).

The hardware expects to be given a list of path ids corresponding to each step on the node (note that in the odgi implementation, the steps are stored as a vector, where each step is represented by a tuple including the path id of the step), and a vector, paths_to_consider, where paths_on_node[id] == 1 if and only if we want to consider the path whose id is id. This vector is generated from a separate input file. Node depth then iterates through the list of path ids, checks if we should consider this path in our calculations, and if so, adds one to the accumulator. To compute the unique node depth, the vector paths_on_node is computed, where paths_on_node[id] == 1 if and only if the path whose id is id crosses the node. Then depth.uniq = sum(paths_on_node & paths_to_consider).

prototype/calyx_depth_simple.py dumps the calyx implementation to the terminal. To run the implementation, save this to a file (e.g., via python3 prototype/calyx_depth_simple.py > depth.futil) and use fud exec depth.futil --to dat --through verilog -s verilog.data depth.data, where depth.data can be generated via prototype/parse_data.py.

I tested this only on a few simple examples, but there are no known bugs in this implementation.