lh3 / minigraph

Sequence-to-graph mapper and graph generator
https://lh3.github.io/minigraph
MIT License

add P-line generating path cmd #77

Open ASLeonard opened 2 years ago

ASLeonard commented 2 years ago

I know there is not a lot of support for minigraph P-lines (#27), and it would break if a user added any new assemblies, but tools like vg deconstruct appear to perform better with path information than without. We observed ~6% more SVs when including path information, and these were generally validated against assembly-based calls. Arguably this is an issue with vg rather than with minigraph not providing P-lines, but that is the current state of the tools.

This code is based heavily on the mgutils.js merge command, except that it takes a sample file and the paste *bed output from stdin, and creates a P-line for each sample from the paths taken during --call. I haven't really worked with JavaScript before, but I believe this is fairly streamlined.
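To illustrate the idea, here is a minimal sketch of the conversion, not the code in this PR. It assumes each sample contributes six tab-separated columns per bubble (contig, start, end, source node, sink node, allele), that the allele field begins with the walk between source and sink (`*` for an empty walk, `.` for no call), and that P-lines follow the GFA1 layout `P <name> <seg+/-,...> *`. The function names `walkToSegments` and `buildPLines` are hypothetical.

```javascript
// Convert a walk such as ">s2<s3" into GFA1 oriented segment names: ["s2+", "s3-"].
function walkToSegments(walk) {
  const segs = [];
  const re = /([><])([^><]+)/g;
  let m;
  while ((m = re.exec(walk)) !== null) {
    segs.push(m[2] + (m[1] === ">" ? "+" : "-"));
  }
  return segs;
}

// Build one P-line per sample from pasted per-sample call rows.
// ASSUMPTION: six columns per sample, in the order
// ctg, start, end, source, sink, allele ("walk:len:strand:ctg:start:end").
function buildPLines(samples, rows) {
  const COLS = 6; // assumed column count per sample
  const paths = samples.map(() => []);
  for (const row of rows) {
    const fields = row.split("\t");
    samples.forEach((_, i) => {
      const off = i * COLS;
      const source = fields[off + 3];
      const sink = fields[off + 4];
      const allele = fields[off + 5];
      if (allele === undefined || allele === ".") return; // no call for this sample
      const inner = allele.split(":")[0];
      // An empty walk ("*") goes straight from source to sink.
      const walk = source + (inner === "*" ? "" : inner) + sink;
      paths[i].push(...walkToSegments(walk));
    });
  }
  return samples.map((name, i) => `P\t${name}\t${paths[i].join(",")}\t*`);
}

// Example with two hypothetical samples and a single bubble.
const exampleRows = [
  "chr1\t10\t20\t>s1\t>s4\t>s2>s3:100:+:ctgA:0:100" +
    "\tchr1\t10\t20\t>s1\t>s4\t*:0:+:ctgB:0:0",
];
const exampleLines = buildPLines(["A", "B"], exampleRows);
console.log(exampleLines.join("\n"));
```

This sketch only concatenates the per-bubble walks. It does not add segments lying between bubbles, and it does not merge a sink node that is also the source of the next bubble, so its output is not guaranteed to be a contiguous path in the graph.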

This has a few obvious limitations.

But overall we found this useful in our work (https://www.biorxiv.org/content/10.1101/2022.09.17.508368v1, see page 4 and some of the supplementary figures), so others may too while there is still a lot of graph <-> VCF exchange.

Best, Alex