refresh-bio / KMC

Fast and frugal disk based k-mer counter

trying to increase maximal value of a counter (using -cs) #205

Closed benliu5085 closed 1 year ago

benliu5085 commented 1 year ago

Hi, I am trying to use KMC to generate k-mer counts for my datasets. The program runs perfectly if I stick to the default parameters, but I found that most of the k-mer counts are 256. After referring to the help info, I found that the "-cs" parameter of "kmc" (the main program) seems to be the solution, and "kmc" generated the KMC database (~.kmc_pre and ~.kmc_suf), apparently without errors.

Then I tried to dump the KMC database to a text file using kmc_tools, but it keeps giving me the error message: "Error: kmc_tools currently does not support k-mer sets without counters. It will be implemented soon. If needed faster please contact authors or post an issue at https://github.com/refresh-bio/KMC/issues."

The scripts I used are:

$ kmc -k30 -m258 -t8 -cs1e5 sample1.fastq sample1 .
$ kmc_tools -t1 transform sample1 dump sample1.txt

-- Any help is appreciated!

marekkokot commented 1 year ago

Hi, thanks for using KMC!

I am afraid -cs does not accept scientific notation; try -cs100000. Also, I would recommend lowering -m258 or using the default, unless you have a big dataset (even then, assigning 258 GB of RAM is probably much more than KMC will need, and the gain in performance may be negligible, if any).

Let me know if this helps.

Best,
Marek

marekkokot commented 1 year ago

Additional explanation: since kmc does not support scientific notation for -cs, it just reads -cs1e5 as -cs1. That is a very special case in which the counters are not stored in the KMC database at all. Because of some technical details, kmc_tools does not support KMC databases without counters. We definitely want to add support for this, but currently we are overwhelmed by other things with much higher priorities. I expect that we will raise the priority once someone actually needs it.
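For anyone who prefers to keep thresholds in scientific notation in their own scripts, here is a minimal sketch of one way to expand the value before it reaches -cs. It assumes a POSIX-style shell whose printf accepts floating-point input; `plain_int`, `cs_value` and `kmc_cmd` are hypothetical names and not part of KMC. The sketch only prints the resulting command instead of running it, and it leaves out -m so that KMC uses its default memory limit, as suggested above.

```shell
# Hypothetical helper (not part of KMC): expand a value such as 1e5
# into a plain integer string, because -cs does not parse scientific
# notation according to this thread.
plain_int() {
    printf '%.0f' "$1"
}

# 1e5 becomes 100000
cs_value=$(plain_int 1e5)

# Build the kmc command line with the expanded counter limit.
# It is echoed rather than executed, so the sketch works without KMC installed.
kmc_cmd="kmc -k30 -t8 -cs${cs_value} sample1.fastq sample1 ."
echo "$kmc_cmd"
```

After counting with the expanded value, the dump step from the original post (kmc_tools -t1 transform sample1 dump sample1.txt) can be used unchanged.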

benliu5085 commented 1 year ago

It works.

Thank you very much! I am closing this issue.