I have two FASTA files, 1 and 2:
With k = 5, I would expect two shared 5-mers between these files: CTCTC and CCCCC.
I run KAT with the command: kat sect -E -m 5 -N 1.fa 2.fa and get the following k-mer counts:
> 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
This indicates that the 5-mers at positions 9 and 18 are shared, which is exactly what we would expect. I also provided the -E flag to KAT, which extracts non-repetitive regions (count = 1) to a separate FASTA file:
Here, the positions of the shared 5-mers are correct, but the lengths and sequences are incorrect.
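
To make the expectation concrete, here is a minimal Python sketch of the per-position counts and of what a naive reading of "regions with count = 1" would give. It is only a sanity check and is not KAT's implementation. The two sequences are hypothetical stand-ins (the original files are not reproduced here), chosen so that CTCTC and CCCCC are the only shared 5-mers and fall at positions 9 and 18; the function names are likewise made up for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer occurring in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def per_position_counts(query, reference, k):
    """For each k-mer start position in query, how often that k-mer occurs in reference."""
    ref = kmer_counts(reference, k)
    return [ref[query[i:i + k]] for i in range(len(query) - k + 1)]


def regions_with_count(query, counts, k, target=1):
    """Merge runs of consecutive positions whose count equals target.

    Returns (start, end, subsequence) tuples with 1-based inclusive coordinates.
    A run of k-mers starting at positions i..j covers bases i..j+k-1.
    """
    regions = []
    i, n = 0, len(counts)
    while i < n:
        if counts[i] == target:
            j = i
            while j + 1 < n and counts[j + 1] == target:
                j += 1
            regions.append((i + 1, j + k, query[i:j + k]))
            i = j + 1
        else:
            i += 1
    return regions


# Hypothetical 25-base sequences, NOT the files from the post.
query = "GGGGGGGGCTCTCGGGGCCCCCGGG"
reference = "AAAAACTCTCAAAAACCCCCAAAAA"

counts = per_position_counts(query, reference, 5)
print(" ".join(map(str, counts)))
print(regions_with_count(query, counts, 5))
```

With these stand-in sequences the count line has 1s at positions 9 and 18 only, and the naive extraction yields two 5-base regions, CTCTC (bases 9 to 13) and CCCCC (bases 18 to 22). That is the kind of output I would have expected from -E if it simply reported the bases covered by the count = 1 k-mers.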