Open kfletcher88 opened 4 years ago
"mc" stands for "mmer count". The higher the count, the higher the likelihood that the k-mer comes from a repeat. shimmer-r controls the reduction level: a smaller shimmer-r gives denser SHIMMERs for the index (and therefore a larger index file and more sensitive overlapping). For the "unique" part of the genome, mc should be more or less independent of shimmer-r. However, increasing SHIMMER density would increase mc. This is my current guess.
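To make the density point concrete, here is a minimal toy sketch of hierarchical minimizer reduction. It is not Peregrine's implementation: the function names, the CRC32 hash, and the parameters k, w, r and levels are all assumptions chosen for illustration, with r standing in for the role that shimmer-r plays. It only shows that a smaller reduction window keeps more minimizers; it says nothing about how mc is computed.

```python
import random
import zlib


def kmer_hashes(seq, k):
    """Return (position, hash) for every k-mer in seq (toy hash: CRC32)."""
    return [(i, zlib.crc32(seq[i:i + k].encode()))
            for i in range(len(seq) - k + 1)]


def window_minima(items, w):
    """Keep each item that is the leftmost minimum of some window of w items."""
    if not items:
        return []
    if len(items) <= w:
        return [min(items, key=lambda x: x[1])]
    kept = []
    for start in range(len(items) - w + 1):
        best = min(items[start:start + w], key=lambda x: x[1])
        # The minimum moves monotonically to the right, so repeats are adjacent.
        if not kept or kept[-1] != best:
            kept.append(best)
    return kept


def sparse_minimizers(seq, k=8, w=10, r=4, levels=2):
    """Level 1: ordinary window minimizers. Higher levels: minimizers of
    the previous level's minimizers, using a window of r items."""
    level = window_minima(kmer_hashes(seq, k), w)
    for _ in range(levels - 1):
        level = window_minima(level, r)
    return level


# Hypothetical random sequence, seeded so the run is repeatable.
rng = random.Random(0)
seq = "".join(rng.choice("ACGT") for _ in range(5000))

# Number of minimizers kept for several reduction windows.
counts = {r: len(sparse_minimizers(seq, r=r)) for r in (2, 4, 8)}
print(counts)
```

In this toy model the minimizers kept with a larger r are always a subset of those kept with a smaller r, so the index can only get sparser as r grows.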
Hi,
I am exploring using Peregrine with some Illumina-corrected single-molecule reads (>99% identity to the Illumina reference), sequenced to ~250x. I was wondering whether shimmer-r and mc are correlated and, if so, how. Explicitly, does the SHIMMER count increase as the reduction factor is increased? Or am I misinterpreting the documentation?
I am trying to assemble a heterozygous (~1%), highly repetitive (~70%), diploid genome and am obtaining an over-inflated (3 to 4x the expected size), highly fragmented output. At the moment I would be happy to obtain a consensus assembly. Any advice on parameters to tweak would be appreciated. Would increasing the reduction factor help remove redundancy?
Thanks, Kyle