neherlab / pangraph

A bioinformatic toolkit to align genome assemblies into pangenome graphs
https://neherlab.github.io/pangraph
MIT License

Can I use pangraph for MAGs? #29

Open SilasK opened 2 years ago

SilasK commented 2 years ago

I wonder whether it is possible to use pangraph for metagenome-assembled genomes (MAGs), which are usually highly fragmented.

mmolari commented 2 years ago

I think in principle one could pass a set of FASTA files in which every record is a contig. At the moment we don't have an option for fragmented genomes: in its current state, Pangraph would interpret every contig as a separate genome. Homologous contigs would still be merged into blocks (provided that they are not too small or too diverged), but every contig would appear as a separate "path" in the graph. My guess is that this might be informative if restricted to long contigs (>10 kbp?), but not very much for shorter ones (<1 kbp?). Hope this helps!
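As a rough illustration of restricting the input to long contigs, here is a minimal Python sketch that reads a FASTA stream and keeps only records above a length threshold. The function names `read_fasta` and `filter_long_contigs` are hypothetical helpers, not part of pangraph, and the 10 kbp default only mirrors the guess above; it is not a tested recommendation.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_long_contigs(
    records: List[Tuple[str, str]], min_len: int = 10_000
) -> List[Tuple[str, str]]:
    """Keep only contigs with at least min_len bases (threshold is an assumption)."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


if __name__ == "__main__":
    # Tiny toy input; real contigs would be read from a file on disk.
    toy = io.StringIO(">contig_1\nACGTACGT\n>contig_2\nACG\n")
    kept = filter_long_contigs(list(read_fasta(toy)), min_len=5)
    for name, seq in kept:
        print(name, len(seq))
```

The filtered records would then be written back to FASTA and passed to pangraph, where each remaining contig would still show up as its own path.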

SilasK commented 2 years ago

If I connect all contigs with NNNNs, it would output a wholly complicated graph, wouldn't it?

I have difficulty imagining that you have many fully complete bacterial genomes.

mmolari commented 2 years ago

Yes, this is also an option. If you connect contigs with long stretches of N's, then one path would correspond to one set of contigs. This will, however, introduce spurious edges in the graph (the artificial connections between contigs), but homologous stretches of sequence should still be merged into one block. I haven't tested this yet, though.
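For anyone who wants to try the N-spacer idea, a minimal Python sketch of the concatenation step could look like the following. The helpers `join_contigs` and `format_fasta_record` are hypothetical names, and the spacer length of 1000 is an arbitrary assumption; how pangraph actually treats runs of N's is untested here, as noted above.

```python
from typing import Iterable


def join_contigs(contigs: Iterable[str], spacer_len: int = 1000) -> str:
    """Concatenate contig sequences, separated by a run of N's.

    spacer_len is an assumed value; no tested recommendation exists.
    """
    spacer = "N" * spacer_len
    return spacer.join(contigs)


def format_fasta_record(name: str, seq: str, width: int = 80) -> str:
    """Return one FASTA record as text, wrapping the sequence at `width`."""
    lines = [f">{name}"]
    lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy example: two short contigs from one hypothetical MAG.
    mag_sequence = join_contigs(["ACGTACGT", "GGCC"], spacer_len=5)
    print(format_fasta_record("mag_1", mag_sequence), end="")
```

Each MAG would become a single record, so it would appear as one path in the graph, at the cost of the artificial edges across the spacers.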