Closed by waltercostamb 11 months ago
Hello Maria,
Gerbil will not skip any k-mer if the option -l is set to 1. However, the reverse complement of the DNA sequence GGTACGCCC is GGGCGTACC (reverse the order of the original sequence and exchange T<->A and C<->G). When normalizing, Gerbil prefers GGGCGTACC over GGTACGCCC (this corresponds to the lexicographical order), so both are counted under GGGCGTACC. If required, you can disable the normalization with the option -d (e.g. for single-stranded DNA sequences).
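To illustrate the normalization described above, here is a minimal Python sketch. It is not Gerbil's actual code: the function names `reverse_complement` and `canonical` are made up for this example, and it assumes that "prefers" means choosing the lexicographically smaller of a k-mer and its reverse complement.

```python
# Illustrative sketch only; not taken from Gerbil's source code.
# Assumption: the normalized form is the lexicographically smaller
# of a k-mer and its reverse complement.

# Translation table that swaps A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Complement every base, then reverse the order."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return the normalized form under the assumption stated above."""
    return min(kmer, reverse_complement(kmer))


print(reverse_complement("GGTACGCCC"))  # GGGCGTACC
print(canonical("GGTACGCCC"))           # GGGCGTACC
```

Under this assumption, GGTACGCCC and GGGCGTACC both map to GGGCGTACC, so occurrences of either one are reported under that single k-mer.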
Best regards, Marius
Hello Marius,
Thank you for the prompt answer. I noticed that the reverse complement I used was incorrect. I checked again, and GGGCGTACC shows up 17 times (for GGTACGCCC) plus 12 times for GGGCGTACC itself.
Best regards, Maria
Hello developers!
I have a question about how Gerbil works. Does it skip some subsequences of a FASTA genome when counting k-mers? I have a genome that contains the subsequence GGTACGCCC. If I grep for it in the genome, it shows 17 hits:
However, if I run Gerbil as below, I do not find this k-mer or its reverse complement CCAUGCGGG. Could you please help me understand this result?
Thank you in advance! Maria