tbrown91 closed this issue 4 years ago
k=5 is likely the problem. k=6 might work, k=7 should work.
Meryl hasn't seen much (if any) testing at small k sizes (k < 16). There's an implementation detail that probably assumes k > 6. The large suffixSize you're seeing indicates that some value went negative.
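To illustrate why a negative value can show up as an enormous number, here is a minimal sketch. It is not meryl's actual code: the function names and the 12-bit prefix are hypothetical, and it only assumes that a k-mer occupies 2 bits per base and that the size is stored in an unsigned 64-bit field.

```python
MASK64 = (1 << 64) - 1  # all ones in 64 bits


def as_uint64(value: int) -> int:
    """Reinterpret an integer as an unsigned 64-bit value (two's-complement wrap)."""
    return value & MASK64


def suffix_bits(k: int, prefix_bits: int) -> int:
    """Bits left for the suffix after a fixed prefix is removed.

    Hypothetical split: 2 bits per base, with prefix_bits reserved for the prefix.
    The result is wrapped to uint64 to mimic an unsigned field.
    """
    return as_uint64(2 * k - prefix_bits)


print(suffix_bits(21, 12))  # 30
print(suffix_bits(5, 12))   # 18446744073709551614
```

With k=21 the subtraction stays positive. With k=5 it gives -2, which an unsigned field reports as 2^64 - 2.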
Thank you, I will try it. I took k=5 from the best_k.sh output, which gave 5.05... How should this result be interpreted, or is there a better way to choose k? I'm currently working with a ~1 Gb genome.
A k of 5 is way too low for a 1 Gb genome; it would be too low even for a bacterial genome. If best_k.sh is outputting a value that low, then it is a bug in merqury, and I'd suggest opening an issue there. For a 1 Gb genome I'd use at least a 21-mer.
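As a rough sanity check on k, here is a sketch that assumes best_k.sh uses the formula k = log4(G(1 - p) / p), with G the genome size in bases and p a tolerable collision rate defaulting to 0.001. Both the formula and the default are my understanding of the script, not something confirmed in this thread, and the function name is made up for the example.

```python
import math


def best_k(genome_size: float, collision_rate: float = 0.001) -> float:
    """Smallest k at which a random k-mer is expected to collide at the given rate.

    Assumed formula: k = log4(G * (1 - p) / p). genome_size is in bases, not Gb.
    """
    if genome_size <= 0 or not 0 < collision_rate < 1:
        raise ValueError("genome_size must be > 0 and collision_rate in (0, 1)")
    return math.log(genome_size * (1 - collision_rate) / collision_rate) / math.log(4)


for label, size in [("5 Mb bacterial genome", 5e6), ("1 Gb genome", 1e9)]:
    print(f"{label}: k >= {best_k(size):.2f}")
```

Under these assumptions a 5 Mb genome needs k of roughly 16 and a 1 Gb genome needs k of roughly 20, in line with the 21-mer suggested above. The genome size has to be given in bases (1000000000 rather than 1).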
Hi, I'm having trouble setting up my meryl dbs for a set of 10x data. I ran _submit_build_10x.sh from merqury and am getting a segmentation fault in the union-sum step. I am only using k=5, but am getting crazy numbers for suffixsize. The log file is below:
asm_bApuApu_10x.union_sum.27859520.log
Here is an example of one of the 8 count log files:
asm_bApuApu_10x.count.27859519_1.log
I am hoping that a k of 5 is not the issue, and that you'll be able to point me to some parameters in the _submit_build.sh file that should be changed.
Meryl release: v1.0