marbl / merqury

k-mer based assembly evaluation

QV in some chr is +inf #78

Closed HaoWangLYL closed 9 months ago

HaoWangLYL commented 2 years ago

Hi @arangrhie, I don't know why the QV value is +inf for some chromosomes when evaluating with HiFi reads:

Chr2   22    37434433   75.5307  2.79854e-08
Chr6   164   31544894   66.063   2.47569e-07
Chr12  0     26187984   +inf     0
Chr1   14    43855265   78.1811  1.52015e-08
Chr4   2645  35245231   54.4688  3.57373e-06
Chr8   0     28478511   +inf     0
Chr11  4     31803187   82.2263  5.98922e-09
Chr10  18    25424349   74.722   3.37135e-08
Chr9   0     24383978   +inf     0
Chr7   843   30823984   58.8527  1.30234e-06
Chr5   0     31617896   +inf     0
Chr3   0     39382534   +inf     0
hifi   3710  386182246  63.3964  4.57472e-07

Merqury ran without any errors. Maybe the assembly quality is simply high enough?

arangrhie commented 2 years ago

Hello, you get +inf when no error k-mers are found. This can happen if you are using HiFi k-mers to evaluate a HiFi assembly. If you have Illumina reads, I'd recommend trying those as well. See the T2T-polishing paper, which describes this bias in more detail.
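To make the +inf case concrete, here is a minimal Python sketch of how a k-mer based QV of this kind can be computed. It rests on several assumptions that are not stated in the thread: that the second and third columns of the output above are the assembly-only ("error") k-mer count and the total assembly k-mer count, that k = 21 (this value reproduces the reported error rates, but it is inferred, not confirmed), and that the error rate is E = 1 - (1 - errors/total)^(1/k) with QV = -10 * log10(E). The function name `kmer_qv` is hypothetical and not part of Merqury.

```python
import math

def kmer_qv(error_kmers: int, total_kmers: int, k: int = 21) -> tuple[float, float]:
    """Return (QV, per-base error rate) from k-mer counts.

    Assumption: error_kmers are k-mers seen only in the assembly,
    total_kmers are all k-mers in the assembly, and k = 21 is inferred
    from the numbers in this thread rather than stated by the poster.
    """
    if total_kmers <= 0:
        raise ValueError("total_kmers must be positive")
    if error_kmers == 0:
        # No error k-mers: error rate is 0, so -10 * log10(0) diverges.
        return math.inf, 0.0
    frac = error_kmers / total_kmers
    # E = 1 - (1 - frac)^(1/k), written with log1p/expm1 for numerical stability.
    error_rate = -math.expm1(math.log1p(-frac) / k)
    return -10.0 * math.log10(error_rate), error_rate

chr2_qv, chr2_err = kmer_qv(22, 37434433)   # Chr2 row from the report
chr12_qv, chr12_err = kmer_qv(0, 26187984)  # Chr12 row: zero error k-mers
print(f"Chr2  QV={chr2_qv:.4f} error={chr2_err:.5e}")
print(f"Chr12 QV={chr12_qv} error={chr12_err}")
```

Under these assumptions the Chr2 row is reproduced closely, and any row with zero error k-mers yields an infinite QV. In other words, +inf means "no errors detectable with this read set" and not "proven error-free", which is why a second, independent read set is suggested.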

HaoWangLYL commented 2 years ago

Thanks for your reply. I'd also like to ask for some advice about QV evaluation. Should I filter out reads from mitochondria, chloroplasts, or bacterial contamination before running Merqury?

arangrhie commented 1 year ago

@HaoWangLYL, sorry I missed your next question! QV isn't affected by extra content in the read k-mer db. K-mers from bacterial contamination, mitochondria, or chloroplasts may be reported as 'missing' if those sequences don't exist in the assembly, which could (slightly) hurt your completeness measure. Those occurring at high frequency in the read k-mers (e.g. mitochondria or chloroplasts) wouldn't be seen in the spectra-cn plots, though. I usually don't filter out mitochondrial DNA, as it is part of the genome and gets assembled in most cases.
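The reasoning above can be illustrated with a toy Python sketch that uses plain sets of k-mers. This is only a simplified model, not Merqury's implementation: the k-mers are made up, and the helper names `asm_only_kmers` and `completeness` are hypothetical. Here, completeness is modelled as the fraction of read k-mers that are also present in the assembly.

```python
def asm_only_kmers(assembly: set[str], reads: set[str]) -> set[str]:
    """K-mers in the assembly that are absent from the reads (drive QV down)."""
    return assembly - reads

def completeness(assembly: set[str], reads: set[str]) -> float:
    """Fraction of read k-mers that are also found in the assembly."""
    return len(reads & assembly) / len(reads)

# Made-up 5-mers standing in for real k-mer databases.
genome = {"ACGTA", "CGTAC", "GTACG", "TACGT"}
contaminant = {"TTTTT"}  # e.g. a bacterial k-mer present only in the reads

assembly = set(genome)
reads_clean = set(genome)
reads_contaminated = genome | contaminant

# Extra read k-mers never create assembly-only k-mers, so QV is unchanged.
print(asm_only_kmers(assembly, reads_clean))
print(asm_only_kmers(assembly, reads_contaminated))

# They do enlarge the read k-mer set, so completeness drops.
print(completeness(assembly, reads_clean))
print(completeness(assembly, reads_contaminated))
```

In this toy model the contaminant k-mer leaves the assembly-only set empty in both cases, while completeness falls from 1.0 to 0.8 because the contaminant counts as 'missing' from the assembly.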