Big kmtricks index obtained

Hello,

Thanks for trying kmtricks. Just to clarify, kmtricks can be used for two things: 1) building a membership index from Bloom filters (the Supplementary tables relate to this feature), and 2) building a k-mer count matrix. Since you use DEkupl, I assume you need a count matrix, right?

I did not notice any problems in your commands. The difference could be explained by the k-mer filtering (--count-abundance-min in kmtricks and --lower-count in Jellyfish). DEkupl joincount also uses -r (--recurrence-min in kmtricks) and -a (no direct equivalent, but it can probably be simulated with --merge-abundance-min X --save-if 0; I will check).
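To illustrate why different filtering thresholds change the size of the result, here is a toy sketch in Python. It is not how kmtricks, Jellyfish or DEkupl implement filtering: the function `filter_kmers`, its parameter names and the sample counts are hypothetical, and it assumes that a k-mer is kept when its count reaches a minimum abundance in at least a minimum number of samples.

```python
def filter_kmers(counts, abundance_min, recurrence_min):
    """Keep k-mers whose count is >= abundance_min in >= recurrence_min samples.

    counts maps each k-mer to its list of per-sample counts.
    This is an assumed, simplified filtering rule for illustration only.
    """
    kept = {}
    for kmer, per_sample in counts.items():
        # Number of samples in which this k-mer is abundant enough.
        solid = sum(1 for c in per_sample if c >= abundance_min)
        if solid >= recurrence_min:
            kept[kmer] = per_sample
    return kept


# Made-up counts for three k-mers across three samples.
counts = {
    "ACG": [5, 7, 6],
    "CCA": [1, 0, 9],
    "GAA": [1, 1, 0],
}

loose = filter_kmers(counts, abundance_min=1, recurrence_min=1)
strict = filter_kmers(counts, abundance_min=2, recurrence_min=2)

print(len(loose), "k-mers kept with loose thresholds")
print(len(strict), "k-mers kept with strict thresholds")
```

With the same input, the two threshold settings keep a different number of k-mers, which is the kind of gap that mismatched filtering options between two pipelines can produce.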

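Also, please note that Jellyfish and kmtricks produce equivalent but not identical outputs because of canonical k-mers. For optimization reasons, kmtricks orders nucleotides as A < C < T < G instead of the lexicographic A < C < G < T, so the form chosen to represent a k-mer and its reverse complement can differ between the two tools.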
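As a minimal sketch of what this means in practice (plain Python, not the actual code of kmtricks or Jellyfish; the helper names `reverse_complement` and `canonical` are made up for this example), the canonical form is taken here as the smaller of a k-mer and its reverse complement under a given nucleotide order:

```python
def reverse_complement(kmer):
    """Return the reverse complement of a DNA k-mer."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(kmer))


def canonical(kmer, order):
    """Return the smaller of kmer and its reverse complement.

    order is a string such as "ACGT" or "ACTG" giving the nucleotide
    ranking used for the comparison (hypothetical helper, for illustration).
    """
    rank = {base: i for i, base in enumerate(order)}
    rc = reverse_complement(kmer)
    # Compare the two strings position by position using the chosen ranking.
    return min(kmer, rc, key=lambda s: [rank[base] for base in s])


kmer = "GAA"  # its reverse complement is "TTC"
lexicographic = canonical(kmer, "ACGT")  # A < C < G < T
alternative = canonical(kmer, "ACTG")    # A < C < T < G

print(lexicographic, alternative)
```

Both results denote the same k-mer up to reverse complementation, so the counts are equivalent, but a direct text comparison of the two outputs will report differences.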

A new version of kmtricks is coming soon, probably next week if I can finish the documentation. It is faster and more efficient, and includes new features, utilities and an API, especially for dealing with kmtricks files. If you want it before the release, just send me an email.

I hope this helps.

Téo

tlemane / kmtricks

Big kmtricks index obtained #11