Glfrey closed this issue 2 years ago
Here are my two cents: you can use assembled whole genomes as an alternative to reads. In fact, we have done this by indexing RefSeq.
Of course, the index is still k-mer based and there is an intrinsic loss of information from a long sequence to a k-mer based representation, unless you store additional information.
We have recently developed efficient data structures to keep positional information on top of the de Bruijn graph (counting de Bruijn graphs). The positional information allows you to losslessly encode the input sequences, including reference sequences.
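
To make the point about information loss concrete, here is a toy Python sketch. It is not MetaGraph's implementation or API; the function names `kmers`, `index_with_positions` and `reconstruct` are made up for illustration, and a plain dictionary stands in for the real succinct data structures. It shows that two different sequences can share the same k-mer set, while also storing k-mer start positions is enough to rebuild a sequence exactly.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def index_with_positions(seq, k):
    """Map each distinct k-mer to the list of start positions where it occurs."""
    index = defaultdict(list)
    for pos, kmer in enumerate(kmers(seq, k)):
        index[kmer].append(pos)
    return dict(index)


def reconstruct(index):
    """Rebuild the original sequence from a k-mer -> positions index.

    Assumes the index came from a single sequence with at least one k-mer,
    so the positions cover 0..n-1 without gaps.
    """
    by_pos = {pos: kmer for kmer, positions in index.items() for pos in positions}
    seq = by_pos[0]
    for pos in range(1, len(by_pos)):
        # Consecutive k-mers overlap by k-1 characters, so each one adds one character.
        seq += by_pos[pos][-1]
    return seq


a = "ACGACG"
b = "ACGACGACG"

# The plain k-mer sets are identical, so the repeat copy number is lost.
print(set(kmers(a, 3)) == set(kmers(b, 3)))

# With positions attached, each sequence is recovered exactly.
print(reconstruct(index_with_positions(a, 3)) == a)
print(reconstruct(index_with_positions(b, 3)) == b)
```

The dictionary of position lists is only the simplest way to show the idea; the point is that the extra positional layer, not the k-mer set itself, is what makes the encoding lossless.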
Hope this helps. G
—
Gunnar Rätsch
http://bioweb.me/gr-contact
On 4 Jul 2022, at 13:03, Gillian Reynolds @.***> wrote:
This isn't an issue per se, but rather a theoretical question. Could metagraph be used for a non-read-based pangenome analysis? I understand it would function beautifully for a read-based analysis, but if an abundance of beautifully assembled genomes should present themselves for pangenome analysis (wouldn't that be nice), could metagraph lend itself to such an application without needing to resort back to reads? Originally I thought so, until I remembered that it operates using k-mers instead of the usual non-overlapping chunks I see used for pangenomes. Would it be possible to overcome this?
I see metagraph being mentioned in quite a few recent pangenome papers, so I think I'm probably not the only one wondering this.
Brilliant, thank you. Metagraph is truly a game-changing tool, and I'm very grateful for the responsiveness to my (never-ending) queries. I'll close this as it's not really an issue, but would it be worth posting this information in the GitHub README so others can see it?
Thanks for your suggestion. I added a line about this to the README.