marschall-lab / panacus

Panacus is a tool for computing statistics for GFA-formatted pangenome graphs
MIT License

path coordinates #8

Closed baozg closed 1 year ago

baozg commented 1 year ago

Hi,

How can I provide path coordinates to panacus for a pggb graph? Since the paths in the GFA don't have an interval, does this function only work for Minigraph-Cactus graphs?

danydoerr commented 1 year ago

panacus does not make use of node annotations, if that is what you mean. It uses the sequence underlying the nodes as reference for the coordinates. In other words, path coordinates should work in pggb as well as Minigraph-Cactus, and I have tested it on both. Does it not work for you?

baozg commented 1 year ago

No, it didn't work with my pggb graph. Here is the command I used:

panacus hist -c bp -e Col-0.CEN.bed -H -t 12 ../At_Chr1_p90_s10000/panCEN.Chr1.fa.gz.f6e6337.f85392c.20baff7.smooth.final.gfa > Chr1.hist

And here is the bed info

Col-0#Chr1      14842915        17129565
Col-0#Chr2      5281568 5281745
Col-0#Chr3      13596521        15747999
Col-0#Chr4      5256168 5256346
Col-0#Chr5      12395998        14812445

danydoerr commented 1 year ago

Can you send me the graph, too?

baozg commented 1 year ago

The graph is quite large, could I send you a link by email?

danydoerr commented 1 year ago

Yes, I just sent you an email, so you have my address.

baozg commented 1 year ago

Thanks! Sent it. By the way, what's the exact meaning of quorum?

danydoerr commented 1 year ago

Quorum sets the minimum fraction of paths/groups that must share a graph feature for it to be counted when going from n to n+1. The simplest way to understand quorum is to consider the case where it is set to 100%, i.e., -q 1. In that case, panacus counts only those graph features that are shared by all paths/groups. This is commonly known as the core pangenome. Setting -q 0.5 corresponds to what is known as the shell pangenome, and -q 0.1 to the cloud pangenome. Here are some more explanations: https://hackmd.io/asojhSf-T8GOJOr9HQj_IA#Panacus
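
To make the quorum idea concrete, here is a minimal Python sketch. It is not panacus's implementation: the function name `count_with_quorum`, the toy `node_to_paths` mapping, and the rule "present in at least quorum × n paths" are assumptions for illustration, and the sketch looks at a single fixed n rather than the growth from n to n+1.

```python
def count_with_quorum(node_to_paths, n_paths, quorum):
    """Count nodes that occur in at least quorum * n_paths paths.

    Assumption: a simplified threshold rule for illustration only;
    panacus may round or average differently.
    """
    threshold = quorum * n_paths
    return sum(1 for paths in node_to_paths.values() if len(paths) >= threshold)


# Hypothetical toy graph: which of the four paths A-D visit each node.
node_to_paths = {
    "n1": {"A", "B", "C", "D"},  # shared by all paths
    "n2": {"A", "B"},            # shared by half of the paths
    "n3": {"C"},                 # private to one path
}

core = count_with_quorum(node_to_paths, 4, 1.0)   # like -q 1
shell = count_with_quorum(node_to_paths, 4, 0.5)  # like -q 0.5
cloud = count_with_quorum(node_to_paths, 4, 0.1)  # like -q 0.1
print(core, shell, cloud)
```

With this toy input, only `n1` passes the 100% quorum, `n1` and `n2` pass 50%, and all three nodes pass 10%.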

baozg commented 1 year ago

Actually, panacus runs successfully with the BED file. The error occurred because I used a whole-genome BED file with a Chr1-only graph.
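
One way to avoid this kind of mismatch is to drop BED records whose path is not in the graph before running panacus. The following Python sketch assumes that paths are stored as `P` lines in the GFA and that the first BED column matches the path name exactly. Both are assumptions about the input, not a statement of how panacus itself matches names, and the helper names are hypothetical.

```python
def gfa_path_names(gfa_lines):
    """Collect path names from GFA 'P' lines (second tab-separated field)."""
    names = set()
    for line in gfa_lines:
        if line.startswith("P\t"):
            names.add(line.split("\t")[1])
    return names


def filter_bed(bed_lines, path_names):
    """Keep only BED records whose first column is a known path name."""
    kept = []
    for line in bed_lines:
        fields = line.split()
        if fields and fields[0] in path_names:
            kept.append(line.rstrip("\n"))
    return kept


# Hypothetical toy graph containing only a Chr1 path.
gfa = ["S\t1\tACGT", "P\tCol-0#Chr1\t1+\t*"]
bed = [
    "Col-0#Chr1\t14842915\t17129565",
    "Col-0#Chr2\t5281568\t5281745",
]

kept = filter_bed(bed, gfa_path_names(gfa))
print(kept)
```

For real files, the same functions can be fed open file handles, for example `filter_bed(open("Col-0.CEN.bed"), gfa_path_names(open(gfa_file)))`, where `gfa_file` is a placeholder for the graph path.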