cpockrandt / genmap

GenMap - Fast and Exact Computation of Genome Mappability

Cannot find all occurring 4-mers for TCTA on reverse strand in the given example #20

Closed thomasvangurp closed 3 years ago

thomasvangurp commented 3 years ago

I've tried testing the example using the following commands:

(base) thomas@NAK-Test-01:/tmp$ genmap --version
GenMap version: 1.3.0
SeqAn version: 2.4.1
#contents of test.fa:
(base) thomas@NAK-Test-01:/tmp$ cat /tmp/test.fa 
>test
ATCTAGGCTAATCTA
#indexing test.fa:
genmap index -F /tmp/test.fa -I /tmp/index
#mapping k-mer size 4 with 1 mismatch
genmap map -K 4 -E 1 -d -v -I  /tmp/index -O /tmp/
#output file
(base) thomas@NAK-Test-01:/tmp$ cat /tmp/test.genmap.csv 
"k-mer";"+ strand test.fa";"- strand test.fa"
0,0;0,0|0,10;
0,1;0,1|0,6|0,11;0,3
0,2;0,2|0,7;0,2|0,7
0,3;0,3;0,1|0,6|0,11
0,4;0,4;0,5
0,5;0,5;0,4
0,6;0,1|0,6|0,11;0,3
0,7;0,2|0,7;0,2
0,8;0,8;
0,9;0,9;
0,10;0,0|0,10;
0,11;0,1|0,6|0,11;0,3

All occurrences of the k-mer TCTA are detected on the forward strand (0,1|0,6|0,11), but not on the reverse strand: only 0,3 is found, and 0,13 is missing.

cpockrandt commented 3 years ago

Hi @thomasvangurp!

I think you are referring to the example in the wiki. There's indeed an error in that example: there is actually no match at position 13. I think I made small changes to the text at some point and forgot to update that part.

I fixed the entry in the wiki. Let me know if anything else is unclear!
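
For readers who want to check the result independently, here is a minimal brute-force sketch in Python. It is not GenMap's algorithm and makes one assumption: that reverse-strand hits are reported as forward-strand start positions where the reverse complement of the k-mer matches. Under that assumption it agrees with the CSV output above for TCTA. The helper names (`hamming`, `revcomp`, `approx_hits`) are made up for this illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def revcomp(s):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return s.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def approx_hits(text, pattern, max_mismatches):
    """Start positions in text where pattern matches with at most max_mismatches."""
    k = len(pattern)
    return [
        i
        for i in range(len(text) - k + 1)
        if hamming(text[i:i + k], pattern) <= max_mismatches
    ]


seq = "ATCTAGGCTAATCTA"   # contents of test.fa
kmer = seq[1:5]           # "TCTA", the 4-mer starting at position 1

# Forward strand: search for the k-mer itself (K=4, E=1).
forward_hits = approx_hits(seq, kmer, 1)

# Reverse strand (assumed convention): search for the reverse complement
# "TAGA" on the forward sequence and report forward start positions.
reverse_hits = approx_hits(seq, revcomp(kmer), 1)

print(forward_hits)  # [1, 6, 11]
print(reverse_hits)  # [3]
```

Note that the sequence has 15 bases, so the last possible start position for a 4-mer is 11. A hit at position 13 cannot occur for this input.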