DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

How does Kraken2 handle ambiguities #766

Closed Simon602 closed 6 months ago

Simon602 commented 8 months ago

Dear all,

I'm seeking assistance with the following question: how does Kraken2 handle gaps and ambiguities in the sequence? If the sequences contain ambiguity codes such as R, Y, K, or M, does Kraken2 resolve them automatically, or do they need to be removed manually? Your insights would be greatly appreciated.

jenniferlu717 commented 8 months ago

Do you mean with protein sequences? I believe Kraken2 only takes in DNA sequences and ignores any k-mers containing characters other than the DNA nucleotides (A, C, G, T).

Simon602 commented 8 months ago

In DNA sequences, if we encounter ambiguous nucleotides (such as R, Y, K, M, ...), does Kraken2 automatically resolve them? Thanks for your reply.

jenniferlu717 commented 6 months ago

No, it will ignore ambiguous nucleotides (or any non-ACGT nucleotides).
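
To illustrate the behaviour described above, here is a minimal Python sketch of what "ignoring k-mers that contain non-ACGT characters" means in practice. It is not Kraken2's actual code: Kraken2 is written in C++ and works on minimizers, and the function name `split_kmers` and the tiny `k` are hypothetical choices made only for this example. The point is that a single ambiguous base does not get resolved; it simply removes every k-mer that overlaps it from consideration.

```python
# Illustrative sketch only; not Kraken2's implementation.
VALID_BASES = set("ACGT")


def split_kmers(seq, k):
    """Split a sequence's k-mers into usable and skipped lists.

    A k-mer is skipped if it contains any character other than A, C, G, or T
    (for example the IUPAC ambiguity codes R, Y, K, M, or N).
    """
    usable, skipped = [], []
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            usable.append(kmer)
        else:
            skipped.append(kmer)
    return usable, skipped


# One ambiguous base (R) knocks out every k-mer that overlaps it.
usable, skipped = split_kmers("ACGTRACGTA", 4)
print(usable)   # ['ACGT', 'ACGT', 'CGTA']
print(skipped)  # ['CGTR', 'GTRA', 'TRAC', 'RACG']
```

With a realistic k-mer length, one ambiguous base affects up to k overlapping k-mers, so sequences with many scattered ambiguity codes may contribute far fewer k-mers to classification than their length suggests.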