nmflack closed this issue 1 year ago
Hello Nicole, Unfortunately, Merqury is not recommended for ONT-only assemblies. The k-mers found in ONT reads are likely to contain systematic errors, which will inflate the QV. I'd recommend obtaining Illumina reads, if possible, to further evaluate and polish the genome if the goal is to build a high-quality reference.
Best, Arang
Hi Arang, appreciate the response. That's too bad; ONT is seeing higher mean quality with their new flow cell chemistry, so hopefully things will be different in the future.
Hello! I'm looking for assistance interpreting an odd use case of Merqury.
We recently assembled a chromosome-level diploid mammal genome with nanopore data only. The assembly has a BUSCO score of 94.7%, N50 > 100 Mb, an L50 of 8, and it aligned well with a closely related species reference. Sequencing coverage was 63x.
However, the read set has a median QV of 14.51, which is well below the read accuracy Merqury expects and is likely to skew its quality estimate.
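For readers comparing the numbers in this thread, the relationship between a Phred-scaled QV and a per-base error probability can be sketched as follows. This is a minimal illustration of the standard Phred definition; the function names are hypothetical and not part of Merqury or Inspector.

```python
import math

def phred_to_error(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)

def error_to_phred(p: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    return -10 * math.log10(p)

# Median read QV reported in the post.
read_error = phred_to_error(14.51)
print(f"{read_error:.4f}")  # prints 0.0354
```

Under this definition, the median read QV of 14.51 corresponds to the read error rate of roughly 0.035 quoted in the post.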
With default settings, the combined Merqury QV for the assembly was 45.3 with 97.5% completeness. I've included one of the k-mer plots below.
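As background for the QV figures, here is a rough sketch of how a Merqury-style consensus QV is derived from k-mer counts, following the formula described in the Merqury paper: k-mers present in the assembly but absent from the reads are treated as errors. The function name and the counts below are hypothetical and chosen only for illustration; they are not taken from this assembly.

```python
import math

def kmer_based_qv(asm_only_kmers: int, asm_total_kmers: int, k: int) -> float:
    """Estimate consensus QV from k-mers found only in the assembly.

    asm_only_kmers: assembly k-mers with no support in the read set.
    asm_total_kmers: all k-mers in the assembly.
    """
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers, so the error estimate is zero
    shared_fraction = 1 - asm_only_kmers / asm_total_kmers
    base_accuracy = shared_fraction ** (1 / k)  # probability a single base is correct
    error_rate = 1 - base_accuracy
    return -10 * math.log10(error_rate)

# Hypothetical counts: the same assembly scored against two read sets.
total = 2_500_000_000
qv_independent_reads = kmer_based_qv(1_500_000, total, k=21)
# If some assembly errors also occur in the reads (systematic errors),
# those k-mers count as supported and fewer remain assembly-only.
qv_same_noisy_reads = kmer_based_qv(300_000, total, k=21)
print(qv_independent_reads < qv_same_noisy_reads)
```

The comparison shows the direction of the effect only: when erroneous assembly k-mers are also present in the read set, fewer k-mers are flagged as assembly-only and the estimated QV rises.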
I also ran best_k.sh with our diploid genome size (2.5 Gb x 2 = 5 Gb) and read error rate (0.035) and reran Merqury with the suggested k=19. The result was Q53.2. Here's the output of meryl statistics for that run; I can also grab the k=21 version if that'd be helpful:
Another tool built for long reads (Inspector) scored the assembly as 97.9% complete with QV 31.3, which I have an easier time believing. Still, I'd like to include an accurate interpretation of our Merqury run in our paper in case others are interested in doing the same.
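The k selection mentioned above can be sketched with the formula given in the Merqury paper, k = log4(G(1 - p) / p), where G is the genome size and p is a tolerable k-mer collision rate. This is an approximation of what best_k.sh computes, not the script itself, and the function name is hypothetical. Note that, as far as the Merqury documentation describes it, the optional second argument is a collision rate (default 0.001) rather than a sequencing error rate, so passing 0.035 changes the meaning of the result.

```python
import math

def best_k(genome_size: float, collision_rate: float = 0.001) -> float:
    """Smallest k (as a real number) keeping random k-mer collisions below the given rate."""
    return math.log(genome_size * (1 - collision_rate) / collision_rate) / math.log(4)

diploid_size = 5e9  # 2.5 Gb x 2, as in the post

k_with_read_error = best_k(diploid_size, 0.035)  # value used in the post
k_with_default = best_k(diploid_size)            # default collision rate of 0.001

print(f"{k_with_read_error:.2f} {k_with_default:.2f}")
```

With these inputs, the first value falls between 18 and 19, consistent with the k=19 used in the rerun, while the default collision rate gives a value just above 21.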
Would you be willing to share your thoughts on these results? There's a massive number of small k-mers, but it looks like they were largely excluded from the assembly. Homozygosity was high, which could explain the lack of single haplotype k-mers along with switch errors called by Inspector.
Many thanks, Nicole