DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License
687 stars 267 forks

Can the mapped region be obtained from the output? #846

Open jyl-hb opened 5 days ago

jyl-hb commented 5 days ago

Hello all,

According to the software docs, the fifth column of the output file is a space-delimited list indicating the LCA mapping of each k-mer in the sequence(s). For example, "562:13 561:4 A:31 0:1 562:3" would indicate that:

- the first 13 k-mers mapped to taxonomy ID #562
- the next 4 k-mers mapped to taxonomy ID #561
- the next 31 k-mers contained an ambiguous nucleotide
- the next k-mer was not in the database
- the last 3 k-mers mapped to taxonomy ID #562

Can this result be interpreted as follows? The input sequence length is 51, and:

- bases 1-13 mapped to 562
- bases 14-17 mapped to 561
- bases 18-48 did not map to any sequence in the database
- bases 49-51 mapped to 562

Finally, the input sequence is classified as 562, because the proportion of bases mapped to 562 is the largest.

Thank you.
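
The example string from the docs can be split into its runs to check the counts. The snippet below is a minimal sketch, not part of Kraken 2 itself; the function name `parse_kmer_lca` is hypothetical. It assumes each token has the form `label:count` and skips the `|:|` token that the docs describe as separating the two mates of a paired read.

```python
def parse_kmer_lca(field):
    """Split a k-mer LCA string into (label, count) pairs.

    Hypothetical helper: labels are kept as strings because they can be
    a taxonomy ID ("562"), "0" (not in the database) or "A" (ambiguous).
    """
    pairs = []
    for token in field.split():
        if token == "|:|":  # mate separator in paired-read output; skipped here
            continue
        label, count = token.rsplit(":", 1)
        pairs.append((label, int(count)))
    return pairs


example = "562:13 561:4 A:31 0:1 562:3"
pairs = parse_kmer_lca(example)
total_kmers = sum(count for _, count in pairs)

print(pairs)
print(total_kmers)  # 13 + 4 + 31 + 1 + 3 = 52
```

The counts add up to 52 k-mers, not 51, so the numbers in this field are counts of k-mers and cannot be read directly as base positions in a 51-base sequence.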

hedy-ella commented 2 days ago

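No. It means the first 13 k-mers mapped to 562 (each k-mer is 35 bp at the default length, so the first 48 bp aligned to 562), not the first 13 bp of the sequence. The remaining entries follow the same pattern.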

jyl-hb commented 1 day ago

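No. It means the first 13 k-mers mapped to 562 (each k-mer is 35 bp at the default length, so the first 48 bp aligned to 562), not the first 13 bp of the sequence. The remaining entries follow the same pattern.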

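Thank you for the reply, but I don't quite follow this part. Is the 48 bp here 13 + 35? My understanding is that if the first 13 k-mers map to 562 and each k-mer is 35 bp by default, then 13 × 35 bp should align to 562?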

hedy-ella commented 1 day ago

You can take a look at this link: https://blog.csdn.net/u010608296/article/details/114134044. It explains the concept of a k-mer fairly clearly.
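
The relationship between k-mer counts and base positions can be sketched with a little arithmetic. This is an illustration under stated assumptions, not Kraken 2 code: it assumes k = 35 (the default mentioned above), a single-end read, and consecutive k-mers that each start one base after the previous one. Under those assumptions, n consecutive k-mers span n + k - 1 bases, because neighbouring k-mers overlap by k - 1 bases. The function name `kmer_runs_to_base_spans` is hypothetical.

```python
def kmer_runs_to_base_spans(pairs, k=35):
    """Convert runs of k-mers into 1-based, inclusive base spans.

    Assumes k-mer i (0-based) covers bases i+1 .. i+k, so a run of
    `count` k-mers starting at k-mer index `start` covers bases
    start+1 .. start+count+k-1. Spans of adjacent runs overlap.
    """
    spans = []
    start = 0  # 0-based index of the first k-mer in the current run
    for label, count in pairs:
        first_base = start + 1
        last_base = start + count + k - 1
        spans.append((label, first_base, last_base))
        start += count
    return spans


# Runs taken from the docs example "562:13 561:4 A:31 0:1 562:3"
pairs = [("562", 13), ("561", 4), ("A", 31), ("0", 1), ("562", 3)]
k = 35

for label, first_base, last_base in kmer_runs_to_base_spans(pairs, k):
    print(label, first_base, last_base)

# Read length implied by the total k-mer count: 52 + 35 - 1
read_length = sum(count for _, count in pairs) + k - 1
print(read_length)
```

With these assumptions, the first 13 k-mers cover bases 1 to 47 (13 + 35 - 1), which is one less than the 13 + 35 = 48 quoted above and far from 13 × 35 = 455, since the k-mers overlap rather than sit end to end. The whole example would correspond to a read of 86 bases.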