cgat-developers / cgat-apps


Understanding `min_kmer_matches` in `fastqtools.filter_by_sequence` #54

Open paulbrodersen opened 4 years ago

paulbrodersen commented 4 years ago

I am trying to understand the default value of `min_kmer_matches` in `filter_by_sequence` (in `fastqtools.py`). The default for `min_kmer_matches` is 20, whereas the default `kmer_size` is 10. As far as I can follow the code, at most 10 bases can match a single k-mer, so a threshold of 20 could never be reached. Am I missing something, or do the defaults need adjusting? I would appreciate a second opinion. The relevant file is here.
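To make the question concrete, here is a minimal sketch of one alternative reading, in which the threshold counts matching k-mer windows rather than matching bases. This is not the code from `fastqtools.py`: the helper `count_kmer_matches` and the example sequences are hypothetical, and whether `filter_by_sequence` actually counts this way is exactly what I am unsure about. Under this assumption, a read of length L contains L - k + 1 windows, so a count of 20 is reachable for reads of at least 29 bases.

```python
def count_kmer_matches(read: str, reference: str, kmer_size: int = 10) -> int:
    """Count the windows of length kmer_size in `read` that occur in `reference`.

    Hypothetical helper for illustration only; it is not taken from cgat-apps.
    """
    # Every k-mer present in the reference sequence.
    reference_kmers = {
        reference[i:i + kmer_size]
        for i in range(len(reference) - kmer_size + 1)
    }
    # Slide a window over the read and count the windows found in the reference.
    return sum(
        read[i:i + kmer_size] in reference_kmers
        for i in range(len(read) - kmer_size + 1)
    )


# Made-up 39-base reference and a 30-base read taken directly from it.
reference = "ACGTTGCAAGGCTTAACCGGATCGATTACGGCTAAGCTT"
read = reference[5:35]

# 30 - 10 + 1 = 21 windows, all of which occur in the reference.
print(count_kmer_matches(read, reference))
```

If instead the count is the number of bases matching a single k-mer, it can never exceed `kmer_size`, which is the situation described above.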