Open ASLeonard opened 3 years ago
Sorry for the late reply. Could you show the data at 9x coverage?
By "show the data at 9x coverage", do you mean posting some summary statistics for the data, or sharing the data with you via FTP?
Sharing the data via FTP might be better, so that we can do some debugging. I can't be sure of the exact reason for now.
I've shared the data with your listed email address
Got the email. Thanks a lot.
Just tried reassembling the same data with the latest version, but without much improvement. The histogram still has a strange distribution (same as in #66) and the job gets killed after using twice as much RAM as the higher-coverage datasets that were successful.
Please give me a few days. I'm debugging this problem on the dataset you sent me. I need to release v0.14 first since it has many small bugs. Sorry for the delay.
No worries, I'm in no rush. This was just testing the limits of low coverage rather than producing a primary assembly.
Was this resolved? I am seeing the same problem on a large genome (5 Gbp), though I have about 40x reads. Exactly the same strange histogram, and hifiasm either times out or runs out of memory.
Hi, To preface this, I know this is not a standard use case, so I'm not expecting there to be a happy resolution.
I've been testing how low the coverage can go on a 2.7 Gb mammal genome, to see which metrics (auN, asmgene, QV, etc.) start to break, and when.
I had an assembly complete fine down to 12x within 60 CPU hours and 60 GB peak RAM. The initial hist ended with these values
The next step was down to 9x coverage, which crashed after 4 CPU hours when it hit the 108 GB RAM I requested. The initial hist values ended with
Interestingly, the last line
[M::ha_pt_gen] count[4095] = 0 (for sanity check)
doesn't match the count of
[M::ha_hist_line] 4095: 91824
I would guess the low coverage is messing with the kmer counting, but I was surprised hifiasm worked smoothly from 40x to 12x and then totally broke at 9x. I could only find an option (-D) for dropping frequent kmers, but couldn't see anything for dropping infrequent ones (maybe fewer bits for the bloom filter via -f?).
Do you think this is just a (very reasonable) coverage limit that can't be crossed, or do you think there are some settings to adjust to force a result?
Thanks, Alex
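For anyone comparing their own logs against the mismatch described in this post, here is a minimal Python sketch that flags bins where the histogram count and the sanity-check count disagree. It is not part of hifiasm. The function name `compare_counts` is hypothetical, and the two regular expressions assume the log lines look exactly like the two quoted above; other hifiasm versions may format them differently.

```python
import re

# Assumed line shapes, based only on the two log lines quoted in this thread.
# The histogram count is taken to be the last integer on the line.
HIST_RE = re.compile(r"\[M::ha_hist_line\]\s*(\d+):.*?(\d+)\s*$")
SANITY_RE = re.compile(r"\[M::ha_pt_gen\]\s*count\[(\d+)\]\s*=\s*(\d+)")


def compare_counts(log_text):
    """Return {bin: (hist_count, sanity_count)} for bins where the two differ.

    If a bin is reported more than once in the log, the last report wins.
    """
    hist = {}
    sanity = {}
    for line in log_text.splitlines():
        m = HIST_RE.search(line)
        if m:
            hist[int(m.group(1))] = int(m.group(2))
            continue
        m = SANITY_RE.search(line)
        if m:
            sanity[int(m.group(1))] = int(m.group(2))
    # Only bins that appear in both kinds of line can be compared.
    return {
        b: (hist[b], sanity[b])
        for b in sanity
        if b in hist and hist[b] != sanity[b]
    }


if __name__ == "__main__":
    sample = (
        "[M::ha_hist_line] 4095: 91824\n"
        "[M::ha_pt_gen] count[4095] = 0 (for sanity check)\n"
    )
    for bin_id, (h, s) in compare_counts(sample).items():
        print(f"bin {bin_id}: histogram={h}, sanity check={s}")
```

Run against a saved log, this only reports the disagreement; it says nothing about why the two counts differ.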