Closed Phelimb closed 8 years ago
Hi Phelim,
Sorry for the late answer (holidays).
There is indeed this limitation that k must be a multiple of 9 and this is a purely practical limitation in the current version of the BFT. It allows to use a fixed length of suffix prefixes to store in compressed containers for an optimal usage of the memory and facilitate a lot of computations for a better insertion / look-up time.
Indeed, the k-mer size in the code is a multiple of the (macro definition) NB_CHAR_SUF_PREF (which is 9) but cannot be modified: It is like this in the code for better code reading / understanding.
Let me know if you have other questions ;)
Guillaume
Thanks! I will ping you if I have other questions.
Why is it that kmers need to be a multiple of 9 ?
It's not entirely obvious to me from reading the paper if this is a theoretical limitation or a practical one?
I can see from the code that kmer size needs to be a multiple of the NB_CHAR_SUF_PREF - can this be arbitrarily modified?