ChnMasterOG / KmerGO

KmerGO is a user-friendly tool to identify group-specific sequences between two groups, or trait-associated sequences, in high-throughput sequencing datasets.

Question re: Output #5

Open erin-thei opened 1 year ago

erin-thei commented 1 year ago

Hello,

Trying to understand the output of KmerGO. What's the distinction between kmer vs contig? Are contigs just any sequence found that was greater than the threshold set for kmer?

For example, the output contains case- and/or control- specific kmers and contigs. I used the default setting for kmer length (40), so is a contig considered to be any sequence found in either the case or control that has a length greater than 40?

Thanks for your clarification!

Ying-Lab commented 1 year ago

They are different concepts. A k-mer is simply a subsequence of a user-defined length k taken from a read, and it is usually used as a feature. A contig is obtained by assembly and represents a high-confidence fragment of the sequenced genome. In KmerGO, contigs are assembled with CAP3 from the group-specific k-mers, so a contig is not just any sequence longer than the k-mer length (40 by default); it is the result of merging overlapping specific k-mers.
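To make the distinction concrete, here is a minimal Python sketch. It is only an illustration: `extract_kmers` and `merge_overlapping` are hypothetical helper names, the merge is a toy chain of consecutive overlaps and not what CAP3 actually does, and the read and k value are made up for the example.

```python
def extract_kmers(read, k):
    """Return every length-k substring (k-mer) of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def merge_overlapping(kmers):
    """Toy 'assembly': chain k-mers that overlap the growing contig by k-1 bases.

    This is NOT CAP3. It assumes the k-mers are already ordered so that each
    one overlaps the end of the contig built so far.
    """
    if not kmers:
        return ""
    contig = kmers[0]
    for kmer in kmers[1:]:
        # The first k-1 bases of the next k-mer must match the contig's end.
        if not contig.endswith(kmer[:-1]):
            raise ValueError(f"{kmer} does not overlap the current contig")
        contig += kmer[-1]  # extend the contig by the one new base
    return contig


read = "ACGTTGCA"                 # made-up read
kmers = extract_kmers(read, k=4)  # fixed-length features
contig = merge_overlapping(kmers) # longer sequence built from overlaps
print(kmers)
print(contig)
```

Every k-mer here has exactly length k, while the contig's length depends on how many overlapping k-mers could be joined.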