cucapra / pollen

generating hardware accelerators for pangenomic graph queries
MIT License

Slow odgi: `flatten` #30

Closed anshumanmohan closed 1 year ago

anshumanmohan commented 1 year ago

This PR adds support for `odgi flatten`. IMO it only really makes sense to run this command with both of its output flags supplied, though it is happy to run with either one alone. Running `odgi flatten -i graph.og -f graph.fasta -b graph.bed` produces a FASTA file and a BED file.

It should be clear why these two outputs go hand in hand: the FASTA file has no path information, and the BED file has no sequence information.

It's a simple enough algorithm, and all tests pass with the current suite of graphs.
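
For readers who want the gist of the algorithm, here is a minimal sketch of one way to flatten a graph. It is not the code from this PR. The function name `flatten`, the in-memory representation (a dict of segments in graph order, plus a dict of paths as lists of `(segment_id, is_forward)` steps), and the BED column layout (sequence name, start, end, path name, strand, step rank) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, bool]  # (segment id, True if traversed forward) -- hypothetical representation
BedRow = Tuple[str, int, int, str, str, int]


def flatten(
    segments: Dict[str, str],
    paths: Dict[str, List[Step]],
    name: str = "graph",
) -> Tuple[str, List[BedRow]]:
    """Lay all segments end to end and describe each path as intervals on that layout."""
    # Record where each segment starts in the concatenated sequence.
    offsets: Dict[str, int] = {}
    pos = 0
    for seg_id, seq in segments.items():
        offsets[seg_id] = pos
        pos += len(seq)

    # FASTA side: one record holding every segment's sequence, no path information.
    fasta = f">{name}\n" + "".join(segments.values()) + "\n"

    # BED side: one row per path step, no sequence information.
    bed: List[BedRow] = []
    for path_name, steps in paths.items():
        for rank, (seg_id, is_forward) in enumerate(steps):
            start = offsets[seg_id]
            end = start + len(segments[seg_id])
            strand = "+" if is_forward else "-"
            bed.append((name, start, end, path_name, strand, rank))
    return fasta, bed


if __name__ == "__main__":
    segs = {"1": "AC", "2": "GT", "3": "T"}
    pths = {"x": [("1", True), ("3", True)], "y": [("2", False)]}
    fasta_text, bed_rows = flatten(segs, pths)
    print(fasta_text, end="")
    for row in bed_rows:
        print("\t".join(str(field) for field in row))
```

The split mirrors the point above: the FASTA text alone cannot tell you which segments a path visits, and the BED rows alone cannot tell you the bases, so the two are only useful together.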

anshumanmohan commented 1 year ago

I originally wanted to also do `odgi inject` in the same PR, since it also has to do with FASTA and BED. In theory, running `odgi inject -i graph.og -b more_paths.bed -o graph_with_more_paths.og` produces a new graph in which any paths described in `more_paths.bed` are inserted. The `.bed` file needs to be compatible with the given graph.

Sadly I can't get this to work as of writing. I made a `k.bed` from `k.og`, then simply renamed the path "x" to "z" (i.e., added a new path that exactly coincides with the known-to-be-correct path "x"), but found the resulting graph to be unchanged. I'll investigate further, including snooping around the codebase.
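
To make the renaming experiment reproducible, here is a small sketch of the "x" to "z" edit as a script. The helper `rename_path` is hypothetical, and which tab-separated column holds the path name is not stated above, so the caller must pass it explicitly as a 0-based `column` index.

```python
def rename_path(bed_text: str, old: str, new: str, column: int) -> str:
    """Return bed_text with `old` replaced by `new` in the given 0-based column.

    Comment lines (starting with '#') and blank lines are passed through untouched.
    """
    out = []
    for line in bed_text.splitlines():
        if line.startswith("#") or not line.strip():
            out.append(line)
            continue
        fields = line.split("\t")
        # Only touch rows that actually have this column and name the old path.
        if column < len(fields) and fields[column] == old:
            fields[column] = new
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical BED contents; the real k.bed may be laid out differently.
    example = "#header\ngraph\t0\t2\tx\t+\t0\ngraph\t2\t4\ty\t-\t0\n"
    print(rename_path(example, "x", "z", column=3), end="")
```

If the graph still comes out unchanged after a scripted rename, that at least rules out a hand-editing slip and points toward the BED layout that `odgi inject` expects.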