pangenome / pggb

the pangenome graph builder
https://doi.org/10.1101/2023.04.05.535718
MIT License

Gradual increase of pan-genomes #364

Open Tonitsk8264 opened 7 months ago

Tonitsk8264 commented 7 months ago

Dear Developer,

I have a question about PGGB pan-genome construction that I hope you can help me with:

Does PGGB support incremental construction? For instance, could I first build a pan-genome 'Pn' from n sequences and later add a new sequence 'x' to extend it from 'Pn' to 'Pn+1', instead of rebuilding the pan-genome from scratch with all n+1 sequences?

Looking forward to your reply.

subwaystation commented 6 months ago

Dear @Tonitsk8264,

PGGB does not support gradual iterations of graph building. It is all or nothing. You could use GraphAligner to map your additional sequences to the existing graph, convert the mappings to BED, and inject these as paths with odgi inject. However, GraphAligner is usually slow with PGGB graphs, and this would break the all-vs-all assumption.

What you could do is run the pairwise alignments only for the newly introduced sequences. Then you can give seqwish both the resulting PAF and the one created previously. From there, continue as in a usual PGGB run.
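
A minimal sketch of the bookkeeping behind this suggestion, in Python. It only works out which sequence pairs the earlier all-vs-all run did not cover and concatenates the old and new PAF files into one input. The function names `new_pairs` and `merge_paf` are hypothetical helpers, not part of PGGB, and the aligner and seqwish invocations are deliberately left out. The pairs are listed once each (unordered); if the original alignments were computed in both directions, the new ones probably should be too.

```python
from itertools import combinations, product


def new_pairs(existing, added):
    """List the sequence pairs that the earlier all-vs-all run did not cover.

    existing: names of the n sequences already aligned all-vs-all
    added:    names of the newly introduced sequences
    """
    pairs = list(product(added, existing))  # each new sequence vs each old one
    pairs += list(combinations(added, 2))   # new sequences against each other
    return pairs


def merge_paf(paf_paths, out_path):
    """Concatenate PAF files into one file, skipping blank lines.

    Returns the number of alignment records written.
    """
    written = 0
    with open(out_path, "w") as out:
        for path in paf_paths:
            with open(path) as handle:
                for line in handle:
                    if line.strip():
                        out.write(line.rstrip("\n") + "\n")
                        written += 1
    return written
```

With three existing sequences and one new sequence, `new_pairs(["a", "b", "c"], ["x"])` gives three pairs instead of the six that a full rebuild would align. The merged PAF would then be handed to seqwish together with a FASTA containing all n+1 sequences.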