ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Is there a convenient way to generate multiple pan-genomes based on multiple filter values? #1101

Open JanMiao opened 1 year ago

JanMiao commented 1 year ago

Hello! I am using the Minigraph-Cactus Pangenome Pipeline to construct a pan-genome of pigs. I know that different pan-genomes can be obtained by changing the value of the filter parameter in cactus-graphmap-join. As you mentioned, you found that giraffe mapping works best when filter=9 is used for constructing the human pan-genome. I would also like to try different filter values to see how they affect giraffe mapping. Is there a simple way to generate pan-genomes with different filter values, other than repeatedly running cactus-graphmap-join with a different filter value each time?

glennhickey commented 1 year ago

Great question. This was something I wanted to implement for my own tests and benchmarking but never did. So unfortunately, you need to re-run for each separate value.

And to be honest, I don't think this feature will be coming any time soon -- we're in the process of moving away from the filtered graphs altogether and using vg/giraffe to dynamically extract "sample" graphs based on kmers in the reads. I'll update the cactus documentation to suggest this new workflow as soon as it's fully tested.
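
Until that workflow is documented, the re-run-per-value approach can at least be scripted. The following is a minimal sketch, not an official Cactus feature: it only builds one cactus-graphmap-join command line per filter value so that each run gets its own job store and output directory. The helper name build_join_commands, the directory layout, and the output names are hypothetical. Only the filter parameter comes from this thread; the positional job store and the --outDir / --outName options are written as recalled from the Minigraph-Cactus documentation and should be checked against cactus-graphmap-join --help for the installed version. The base_args list stands in for whatever options the existing single run already uses.

```python
import shlex


def build_join_commands(base_args, filters, out_root="join-out", name_prefix="pg"):
    """Build one cactus-graphmap-join argv list per filter value.

    base_args: options already used for a working single run (hypothetical
               placeholder; copy them from your own command).
    filters:   iterable of integer filter values to try.
    """
    commands = []
    for f in filters:
        tag = f"filter{f}"
        cmd = [
            "cactus-graphmap-join",
            f"{out_root}/jobstore-{tag}",          # separate job store per run (assumed positional arg)
            *base_args,
            "--outDir", f"{out_root}/{tag}",       # assumed option name; verify with --help
            "--outName", f"{name_prefix}-{tag}",   # assumed option name; verify with --help
            "--filter", str(f),
        ]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Placeholder arguments for illustration only.
    example_args = ["--vg", "chr1.vg", "chr2.vg", "--reference", "REF"]
    for cmd in build_join_commands(example_args, [2, 5, 9]):
        # Print the commands for review instead of executing them.
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try. Once the options are confirmed, each list can be passed to subprocess.run(cmd, check=True) or submitted as a separate cluster job.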