marbl / meryl

A genomic k-mer counter (and sequence utility) with nice features.

k-mer size support >32? #3

Open EvdH0 opened 5 years ago

Hi Brian,

Thanks for the rewrite! Is there a way to run meryl with a k-mer size larger than 32? Running meryl count and meryl print on the following example

>test
ATGAAAATCAAAACTCGCTTCGCGCCAAGCCCAACAGGCTATCTGCA

with k=32 results in:

AAAACTCGCTTCGCGCCAAGCCCAACAGGCTA    1
AAAATCAAAACTCGCTTCGCGCCAAGCCCAAC    1
AAACTCGCTTCGCGCCAAGCCCAACAGGCTAT    1
AAATCAAAACTCGCTTCGCGCCAAGCCCAACA    1
AACTCGCTTCGCGCCAAGCCCAACAGGCTATC    1
AATCAAAACTCGCTTCGCGCCAAGCCCAACAG    1
ACTCGCTTCGCGCCAAGCCCAACAGGCTATCT    1
ATCAAAACTCGCTTCGCGCCAAGCCCAACAGG    1
ATGAAAATCAAAACTCGCTTCGCGCCAAGCCC    1
AGCCTGTTGGGCTTGGCGCGAAGCGAGTTTTG    1
CAGATAGCCTGTTGGGCTTGGCGCGAAGCGAG    1
CGCTTCGCGCCAAGCCCAACAGGCTATCTGCA    1
TCAAAACTCGCTTCGCGCCAAGCCCAACAGGC    1
TCGCTTCGCGCCAAGCCCAACAGGCTATCTGC    1
TTGGGCTTGGCGCGAAGCGAGTTTTGATTTTC    1
TGAAAATCAAAACTCGCTTCGCGCCAAGCCCA    1

but with k=33 results in:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA   8
CAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC   7

which does not seem right: neither of these 33-mers occurs in the input sequence.
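For reference, here is a minimal Python sketch of the result I would expect, independent of meryl. It assumes meryl reports canonical k-mers (the smaller of each k-mer and its reverse complement) and orders bases as A < C < T < G; both assumptions are inferred from the k=32 listing above, not taken from the meryl source. The names `canonical_counts`, `sort_key` and `reverse_complement` are made up for this example.

```python
from collections import Counter

# Assumption: bases are ordered A < C < T < G (inferred from the k=32 listing).
ORDER = str.maketrans("ACTG", "0123")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SEQ = "ATGAAAATCAAAACTCGCTTCGCGCCAAGCCCAACAGGCTATCTGCA"


def sort_key(kmer):
    """Map each base to a digit so plain string comparison follows A < C < T < G."""
    return kmer.translate(ORDER)


def reverse_complement(kmer):
    """Complement every base, then reverse the string."""
    return kmer.translate(COMPLEMENT)[::-1]


def canonical_counts(seq, k):
    """Count canonical k-mers: the smaller of each k-mer and its reverse complement."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[min(kmer, reverse_complement(kmer), key=sort_key)] += 1
    return counts


for k in (32, 33):
    counts = canonical_counts(SEQ, k)
    print(f"k={k}: {sum(counts.values())} k-mers, {len(counts)} distinct")
    for kmer in sorted(counts, key=sort_key):
        print(kmer, counts[kmer], sep="\t")
```

Under these assumptions the sketch lists 16 distinct 32-mers, matching the k=32 output above, and 15 distinct 33-mers, each with a count of 1. The counts meryl reports for k=33 (8 and 7) also sum to 15, so the number of k-mers looks right and only the k-mer sequences themselves appear to be wrong.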

The full commands are:

meryl count  k=32 test.fna output test.k32.out.merylCount &> test.k32.out.merylCount.std
meryl count  k=33 test.fna output test.k33.out.merylCount &> test.k33.out.merylCount.std

meryl -VV print test.k32.out.merylCount > test.k32.out.histogram 2> test.k32.out.histogram.std
meryl -VV print test.k33.out.merylCount > test.k33.out.histogram 2> test.k33.out.histogram.std

I have also attached a tarball with the stderr output etc.: kmertest.tar.gz

Thanks!

Eric