DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Help with confidence score #683

Open tuttigiuperterraa opened 1 year ago

tuttigiuperterraa commented 1 year ago

Hi everyone, I'm trying to understand how to set the confidence value for my Kraken2 analysis. All sequences in my dataset are 31 bp long, and the k-mer length is also set to 31. Since the sequence length equals the k-mer length, each sequence contains exactly one k-mer, so I imagine the score can only be 1 or 0. This means a sequence will either be classified as belonging to a class or not be classified at all, with no intermediate score values to indicate a higher or lower affinity for a particular class.

Is it correct to assume that in this case, the confidence value may not be used or have a limited effect on classification, as all sequences classified as belonging to a class will have a score of 1 and all other sequences will have a score of 0?

Can I consider the classification obtained with --confidence 0.0 reliable in this case?

Thank you.

sanderdebacker commented 1 year ago

I think your reasoning is correct and the --confidence flag won't add anything to your analysis.
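
For reference, the Kraken 2 manual defines a label's score as C/Q, where C is the number of k-mers mapped to LCA values in the clade rooted at that label and Q is the number of k-mers in the sequence that lack an ambiguous nucleotide. Below is a rough Python sketch of why a 31 bp read with k = 31 leaves only two possible scores. The function names are made up for illustration and are not part of Kraken 2, and the sketch assumes every k-mer in the read is unambiguous, so that Q is simply the total number of k-mers.

```python
from fractions import Fraction


def num_kmers(read_len: int, k: int) -> int:
    """Number of overlapping k-mers in a read of the given length."""
    return max(read_len - k + 1, 0)


def possible_scores(read_len: int, k: int) -> list[Fraction]:
    """Every value the score C/Q can take for one read.

    Assumption (illustration only): no k-mer contains an ambiguous
    nucleotide, so Q equals the total number of k-mers in the read.
    """
    q = num_kmers(read_len, k)
    if q == 0:
        # The read is shorter than k, so nothing is queried.
        return []
    # C can be any integer from 0 (no k-mer in the clade) to Q (all of them).
    return sorted({Fraction(c, q) for c in range(q + 1)})


# One k-mer per read: the score is either 0 or 1.
print([str(s) for s in possible_scores(31, 31)])  # ['0', '1']

# A slightly longer read has intermediate values a threshold could act on.
print([str(s) for s in possible_scores(33, 31)])  # ['0', '1/3', '2/3', '1']
```

With a single k-mer, a read that hits the database scores 1 for its assigned label, so it passes any threshold up to 1, and a read that does not hit stays unclassified whatever the threshold is. That is why --confidence 0.0 and any higher value should give the same classifications here.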