marbl / merqury

k-mer based assembly evaluation

Low or incomplete k-mer completeness? #103

Closed GuillaumeHolley closed 1 year ago

GuillaumeHolley commented 1 year ago

Hi,

I ran Merqury 1.3 on a dual assembly in trio mode. I am particularly interested in the QV and the k-mer completeness. Unfortunately, the job did not finish within the 24 hours I gave it (with 64 cores available). However, many of the results had already been computed by then.

The Quality Value is more or less what was expected, around 53. However, the file *.completeness.stats looks like this:

H2  mat.illumina    2231893994  12638158046 17.66
H2  pat.illumina    2217837019  11500812682 19.2842
H1  mat.illumina    2117548527  12638158046 16.7552
H1  pat.illumina    2141181620  11500812682 18.6177
both    mat.illumina    2266950371  12638158046 17.9373
both    pat.illumina    2274951564  11500812682 19.7808

I would like to get the k-mer completeness of both haplotype assemblies with respect to both read sets, so is a line missing at the end? Also, these numbers seem strange: they show a low k-mer completeness with respect to each haplotype.

Thank you for your time. Guillaume
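For readers following along, here is a minimal sketch of how the rows above can be read. It assumes the five columns are: assembly, k-mer set, k-mers of that set found in the assembly, total k-mers in the set, and completeness in percent. That column reading is an assumption, although the arithmetic on the quoted rows is consistent with it. The names `STATS` and `parse_completeness` are made up for this illustration and are not part of Merqury.

```python
# Rows copied from the *.completeness.stats output quoted above.
STATS = """\
H2 mat.illumina 2231893994 12638158046 17.66
H2 pat.illumina 2217837019 11500812682 19.2842
H1 mat.illumina 2117548527 12638158046 16.7552
H1 pat.illumina 2141181620 11500812682 18.6177
both mat.illumina 2266950371 12638158046 17.9373
both pat.illumina 2274951564 11500812682 19.7808
"""


def parse_completeness(text):
    """Parse whitespace-separated rows and recompute the percentage.

    Assumed column order (not confirmed by the thread):
    assembly, k-mer set, found, total, reported percentage.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        assembly, kmer_set, found, total, reported = fields
        found, total = int(found), int(total)
        rows.append({
            "assembly": assembly,
            "kmer_set": kmer_set,
            "found": found,
            "total": total,
            "reported": float(reported),
            "recomputed": 100.0 * found / total,
        })
    return rows


rows = parse_completeness(STATS)
for r in rows:
    print(f'{r["assembly"]:5s} {r["kmer_set"]:13s} '
          f'reported={r["reported"]:.4f} recomputed={r["recomputed"]:.4f}')
```

Under this reading, the last column is simply the third column divided by the fourth, so a very large denominator is enough to push completeness down even when the assembly itself is fine.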

arangrhie commented 1 year ago

Hello Guillaume,

Apologies for the delay, somehow I missed this.

Something is weird here. Have you run hapmers.sh on the parental meryl dbs? It looks like the entire parental read meryl db has been given to Merqury. Or is your genome 11-12 Gb in size?

-Arang
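The check suggested above can be sketched as a rough heuristic: if the total k-mer count of a supposedly haplotype-specific set is larger than the genome itself, the set is probably the full parental read database rather than hap-mers. The genome size below is a hypothetical placeholder, since the thread never states it, and `exceeds_genome_size` is a name invented for this illustration.

```python
# Hypothetical genome size in base pairs. The thread does not state the
# actual value, so replace this with the size of your own genome.
ASSUMED_GENOME_SIZE_BP = 3_100_000_000

# Totals (fourth column) from the completeness.stats rows in the question.
TOTALS = {
    "mat.illumina": 12_638_158_046,
    "pat.illumina": 11_500_812_682,
}


def exceeds_genome_size(total_kmers, genome_size_bp):
    """Return True when a k-mer set is larger than the genome.

    A haplotype-specific set is a subset of one parent's genomic k-mers,
    so it should be well below the genome size. A count above it hints
    that a full read database (including error k-mers) was supplied.
    """
    return total_kmers > genome_size_bp


flags = {
    name: exceeds_genome_size(total, ASSUMED_GENOME_SIZE_BP)
    for name, total in TOTALS.items()
}
for name, suspicious in flags.items():
    print(f"{name}: total={TOTALS[name]:,} suspicious={suspicious}")
```

With the placeholder genome size, both sets are flagged. They would pass only if the genome really were in the 11-12 Gb range, which is the alternative raised in the comment above.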

GuillaumeHolley commented 1 year ago

Hi @arangrhie,

I also apologize for the delay; it took me some time to get back to this issue. I am fairly certain I did not run hapmers.sh on the parental meryl DBs, and I wonder how I could have missed that. I am computing those as we speak. I'll close the issue in the meantime. Thank you for your feedback!

Guillaume