refresh-bio / KMC

Fast and frugal disk based k-mer counter

question about kmc_genome_counts #186

Open weiwsmiling opened 2 years ago

weiwsmiling commented 2 years ago

Dear Sir,

I saw that kmc_genome_counts is used in https://github.com/msauria/T2T_Kmer_Analysis for the analysis of the T2T genome, but I cannot find it in the current KMC. How can I use it? I want to align our data to the T2T genome as well.

Thanks for your help in advance!

Best, Wei

marekkokot commented 2 years ago

Hello,

I don't know this tool, but looking at the compile script it seems it uses a KMC fork, more precisely a specific branch on the specific fork. It is available here: https://github.com/msauria/KMC/tree/kmer_mapping

Maybe @msauria would be more helpful.

Best, Marek

weiwsmiling commented 2 years ago

Thank you!

msauria commented 2 years ago

Hi,

kmc_genome_counts is an extension of KMC's functionality, available in the msauria/KMC fork. A multithreaded version that is much more efficient should be going up this week. The tool takes a standard KMC database (-ci2) and creates a wiggle track giving, for each position in the genome, the number of times the k-mer starting at that position appears across the entire genome. I'm happy to discuss this and other tools for genome composition analysis by email.
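To make the description above concrete, here is a minimal pure-Python sketch of the idea. It is not the tool's implementation and does not read a KMC database: it counts k-mers in memory with `collections.Counter`, considers the forward strand only, and does not treat `N` bases specially. The function names `kmer_count_track` and `to_wiggle` are hypothetical, as is the `min_count` parameter, which only mimics the effect of KMC's `-ci` option (excluding k-mers that occur fewer than the given number of times). Reporting 0 for excluded k-mers is an assumption, not confirmed behavior of the tool.

```python
from collections import Counter


def kmer_count_track(seq, k, min_count=2):
    """Return, for each start position, the genome-wide count of the k-mer there.

    Illustrative only: k-mers seen fewer than `min_count` times are reported
    as 0, an assumed stand-in for k-mers missing from a -ci2 database.
    """
    seq = seq.upper()
    # One k-mer per valid start position (forward strand only).
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return [counts[km] if counts[km] >= min_count else 0 for km in kmers]


def to_wiggle(chrom, track):
    """Format a per-position track as a fixedStep wiggle string (1-based start)."""
    lines = [f"fixedStep chrom={chrom} start=1 step=1"]
    lines.extend(str(value) for value in track)
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy sequence; "chrDemo" is a made-up chromosome name.
    track = kmer_count_track("ACGTACGT", k=4)
    print(to_wiggle("chrDemo", track))
```

On a real genome this in-memory approach would be far too slow and memory-hungry, which is why the counts are looked up in a prebuilt KMC database instead.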