voichek / kmersGWAS

A library for running k-mers based GWAS
GNU General Public License v3.0

reverse-complement kmers #123

Closed DanaSis closed 2 years ago

DanaSis commented 2 years ago

Hi Yoav, great work with the k-mer GWAS program! Just a question for clarification: I performed the analysis on a diversity panel of 220 sunflower lines and found 64 significantly associated k-mers. I then aligned those k-mers to the sunflower genome (using bwa aln; bwa samse) and got back 12 k-mers that aligned uniquely :) But for some of these k-mers the alignment returned the reverse-complement sequence. I noticed that the manual says a "k-mer (e.g. AGGCT) and its reverse-complement (e.g. AGCCT) are the same for the purpose of counting or presence/absence patterns." Given that, I just wanted to make sure: is it completely safe to assign these reverse-complement k-mers the position (chr:pos) reported in the .bam file?

voichek commented 2 years ago

Dear DanaSis,

I am happy you find interest in our work, and that you got significant k-mers :)

A k-mer that maps to the reverse-complement strand of the reference genome is not a problem; that mapping is as good as a mapping of the k-mer in its original orientation.

The comment for the reverse-complement of k-mers is only for the purpose of counting k-mers.

Best, Yoav
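
For readers who want to check this on their own hits, here is a minimal Python sketch of the idea discussed in this thread. It is not code from kmersGWAS: the function names are hypothetical, and picking the lexicographically smaller of the two orientations as the "canonical" form is a common convention that is only assumed here, not confirmed for this library. The strand check relies on the SAM flag bit 0x10 (read aligned to the reverse strand), in which case the sequence stored in the SAM/BAM record is the reverse complement of the query.

```python
# Complement table for plain A/C/G/T sequences (no ambiguity codes handled).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA k-mer."""
    return kmer.upper().translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Return one representative for a k-mer and its reverse complement.

    Assumption: the lexicographically smaller orientation is used.
    """
    kmer = kmer.upper()
    return min(kmer, reverse_complement(kmer))


def is_reverse_strand(sam_flag: int) -> bool:
    """True if SAM flag bit 0x10 is set (alignment on the reverse strand)."""
    return bool(sam_flag & 0x10)


def same_kmer(kmer: str, aligned_seq: str) -> bool:
    """True if the aligned sequence is the k-mer in either orientation."""
    return canonical(kmer) == canonical(aligned_seq)


# The example pair quoted from the manual: both orientations are one k-mer.
print(reverse_complement("AGGCT"))
print(same_kmer("AGGCT", "AGCCT"))
```

If `same_kmer` returns `True` for a significant k-mer and the sequence in its alignment record, the reported chr:pos refers to that k-mer, whichever strand it aligned to.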