refresh-bio / KMC

Fast and frugal disk based k-mer counter

Option for kmc_dump to be sorted by kmer? #131

Closed by tseemann 5 years ago

tseemann commented 5 years ago

I know kmc_dump kmcdb /dev/stdout | sort -k1,1d > kmers.tsv works, but would it be more efficient for kmc_dump to do it internally?

marekkokot commented 5 years ago

Use kmc_tools transform kmcdb dump -s kmers.tsv. See the kmc_tools documentation; what you need is described on pages 8 and 9. In general, kmc_tools is recommended over kmc_dump.

tseemann commented 5 years ago

Nice! I didn't see this option!

 For dump operation there are additional oper_params:
  -s - sorted output

Thank you.
I will switch to kmc_tools now.

lmanchon commented 1 year ago

Yes, it works, but the output is sorted by k-mer string, not by occurrence count.
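
For ordering by occurrence count, one possible workaround is to post-process the dump with the standard sort utility. This is only a sketch: it assumes the dump is a two-column, tab-separated file (k-mer, then count), and it uses a small made-up sample file in place of a real database so that it runs on its own. The output file name kmers.by_count.tsv is hypothetical.

```shell
# With a real database, the input would come from the command in this thread,
# e.g.: kmc_tools transform kmcdb dump kmers.tsv
# Here a tiny made-up sample stands in for that file (assumed format: k-mer<TAB>count).
printf 'AAAC\t3\nAACG\t12\nACGT\t7\n' > kmers.tsv

# Sort by the count column, numerically and in descending order;
# break ties by k-mer string. LC_ALL=C gives plain byte-order comparison.
TAB="$(printf '\t')"
LC_ALL=C sort -t "$TAB" -k2,2nr -k1,1 kmers.tsv > kmers.by_count.tsv

cat kmers.by_count.tsv
```

Since the result is re-sorted by count anyway, the -s option is probably unnecessary on the dump step in this case.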