mcveanlab / mccortex

De novo genome assembly and multisample variant calling
https://github.com/mcveanlab/mccortex/wiki
MIT License

Is it possible to extract kmers in bubbles? #82

Open zeeev opened 5 years ago

zeeev commented 5 years ago

Greetings,

I was wondering whether it is possible to extract the kmers that are in bubbles. As far as I can see, the bubbles command outputs only the variable bases, not fixed-width kmers. In other words, I want all kmers that lead into and out of a bubble, for both paths. I should mention that this is a diploid.

I was also wondering about the view command. I couldn't find documentation about the format. I'm guessing it's kmer,count,???

GAGCTGCCAACAGCAGAGGTAAA 24 a....C..
AAATCACCGTGAAAATATAATTC 35 a......T
ATCATTATGTATTTAATTACACA 22 ...t...T

winni2k commented 5 years ago

I can't help you with your first question, but if nothing else works, you could try using cortexpy to load the graph into python and do your own bubble search and kmer identification. Disclaimer: I am an author and maintainer of cortexpy. Unfortunately, cortexpy does not have an algorithm for bubble search at the moment and contributions would be welcome.

Regarding your second question: the output format of mccortex view --kmers is the canonical kmer, followed by the coverage for each color and then the edge set for each color. There is a cortex spec in the cortexjdk repo, and the edge set description can be found there.
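As a rough illustration of that layout, here is a minimal Python sketch that splits one line of view --kmers output into its parts. It is not part of mccortex or cortexpy; the names `parse_view_line` and `neighbours` are made up for this example. It assumes whitespace-separated fields and that each edge set is an 8-character string over the positions `acgtACGT`, with `.` for "no edge", lowercase letters for the base preceding the kmer and uppercase letters for the base following it. That reading matches the three lines quoted above, but please check it against the cortex spec before relying on it.

```python
from typing import List, NamedTuple, Tuple


class KmerRecord(NamedTuple):
    kmer: str
    coverages: List[int]  # one entry per color
    edges: List[str]      # one 8-character edge string per color


def parse_view_line(line: str) -> KmerRecord:
    """Split 'KMER cov1..covN edges1..edgesN' into its parts."""
    fields = line.split()
    kmer, rest = fields[0], fields[1:]
    if not rest or len(rest) % 2 != 0:
        raise ValueError(f"unexpected number of fields: {line!r}")
    n_colors = len(rest) // 2
    coverages = [int(x) for x in rest[:n_colors]]
    edges = rest[n_colors:]
    for edge_set in edges:
        if len(edge_set) != 8:
            raise ValueError(f"edge set is not 8 characters: {edge_set!r}")
    return KmerRecord(kmer, coverages, edges)


def neighbours(kmer: str, edge_set: str) -> Tuple[List[str], List[str]]:
    """Return (previous kmers, next kmers) implied by one edge set.

    Assumption: the first four characters are incoming edges and the last
    four are outgoing edges. The returned kmers are NOT canonicalised.
    """
    incoming = [c.upper() + kmer[:-1] for c in edge_set[:4] if c != "."]
    outgoing = [kmer[1:] + c for c in edge_set[4:] if c != "."]
    return incoming, outgoing


if __name__ == "__main__":
    record = parse_view_line("GAGCTGCCAACAGCAGAGGTAAA 24 a....C..")
    print(record)
    print(neighbours(record.kmer, record.edges[0]))
```

Because the kmers in the view output are canonical, a neighbour built this way may need to be reverse-complemented before it can be looked up in the same output.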

iqbal-lab commented 5 years ago

For the first question, @noporpoise will know. It's not much help, but cortex can consume mccortex binary graphs, and I know cortex bubble calling outputs exactly what you want, in FASTA format.
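If a bubble call is available as a 5' flank, a set of branch sequences and a 3' flank, the fixed-width kmers along each path can also be listed with a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the flanks and branches are assumed to be plain, non-overlapping strings (the actual layout of the cortex or mccortex bubble output should be checked first), and "canonical" is taken to mean the lexicographically smaller of a kmer and its reverse complement.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Smaller of the kmer and its reverse complement (assumed convention)."""
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def kmers(seq: str, k: int) -> List[str]:
    """All length-k substrings of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def bubble_path_kmers(flank5: str, branches: List[str], flank3: str,
                      k: int) -> List[List[str]]:
    """For each branch, list the kmers that overlap the branch.

    Only the last k-1 bases of the 5' flank and the first k-1 bases of the
    3' flank are kept, so every returned kmer touches the bubble. With an
    empty branch, the kmers spanning the junction are returned. Widen the
    slices to keep more flank kmers.
    """
    left = flank5[max(0, len(flank5) - (k - 1)):]
    right = flank3[:k - 1]
    return [kmers(left + branch + right, k) for branch in branches]


if __name__ == "__main__":
    # Toy diploid bubble: one SNP, G on one path and T on the other.
    for path in bubble_path_kmers("AAAC", ["G", "T"], "CTTT", k=3):
        print(path, [canonical(km) for km in path])
```

Canonicalising the kmers makes them comparable with the canonical kmers printed by mccortex view --kmers.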

zeeev commented 5 years ago

@iqbal-lab I'd like to try cortex for kmer extraction. Can you point me to the repo? I searched briefly but didn't find it. Wasn't cortex earlier than McCortex?

winni2k commented 5 years ago

Is this the correct web page for cortex?

iqbal-lab commented 5 years ago

Dammit, I need to close the SourceForge page. Sorry. The current page is https://github.com/iqbal-lab/cortex