chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

Hifi + hiC assembly stuck at max_n_chain to 100 #558

Open gushiro opened 12 months ago

gushiro commented 12 months ago

I have a large genome (~5GB) with N50 ~3Mb and 30X Hi-C coverage (uniquely mapped reads). The assembly has been stuck at <[M::ha_opt_update_cov] updated max_n_chain to 100> for two days. In the meantime, I have moved on to assembling the HiFi reads alone, without Hi-C, and am using the Hi-C data only for scaffolding.

Any thoughts on whether I should just let it run or if there is some internal error I can fix here?

P.S.: the input Hi-C reads are the raw reads (all read pairs).

Writing processed unitig GFA to disk... 
[M::purge_dups] homozygous read coverage threshold: 30
[M::purge_dups] purge duplication coverage threshold: 37
[M::mc_solve:: # edges: 6760]
[M::mc_solve_core_adv::0.757] ==> Partition
[M::adjust_utg_by_primary] primary contig coverage range: [25, infinity]
Writing olaqueousGenome_hicmode_homoPeak30.asm.hic.p_ctg.gfa to disk... 
[M::ha_opt_update_cov] updated max_n_chain to 100

chhylp123 commented 12 months ago

Is hifiasm still running, and does it have enough memory? I wonder whether hifiasm is running out of memory.
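
One way to answer the memory question on a Linux machine is to read the resident set size of the running process from /proc. The snippet below is only a sketch under that assumption; the helper names `parse_vm_rss_kib` and `resident_memory_gib` are made up for illustration and are not part of hifiasm.

```python
import re
from pathlib import Path
from typing import Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def resident_memory_gib(pid: int) -> Optional[float]:
    """Read the resident memory of a running process in GiB (Linux only)."""
    status_text = Path(f"/proc/{pid}/status").read_text()
    rss_kib = parse_vm_rss_kib(status_text)
    return None if rss_kib is None else rss_kib / 1024 ** 2


# Example: resident_memory_gib(12345), where 12345 is a placeholder for the hifiasm PID.
```

If the value sits close to the total RAM of the node, or the process has been killed by the scheduler, memory is the more likely culprit than a hang.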

gushiro commented 11 months ago

It finally finished after being stuck for ~4 days. Looks fine so far.

zilov commented 9 months ago

I got the same behaviour on a small plant genome (~200 Mbp): it is stuck at the same step and using ~500 GB of RAM. HiFi reads are at 90x coverage and Hi-C reads at 60x coverage. It has already been running for two days :(

chhylp123 commented 9 months ago

@zilov Sorry for the late reply; I have been too busy over the last few weeks. Could you please try '--s-base -1'? This option disables base-level homology detection, which might take a large amount of memory. By the way, could you share the bin files with me? I would like to see why it takes so much memory.
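
For anyone trying this suggestion, a rerun in Hi-C mode with the extra flag might be assembled roughly as below. This is a sketch, not the maintainer's command: the file names, output prefix, and thread count are placeholders, and `-o`, `-t`, `--h1`, and `--h2` are assumed to be the usual hifiasm options for the output prefix, threads, and the two Hi-C read files. Only '--s-base -1' comes from the comment above.

```python
import shlex
from typing import List


def hifiasm_hic_command(prefix: str, hifi_reads: str, hic_r1: str, hic_r2: str,
                        threads: int = 32,
                        disable_base_homology: bool = True) -> List[str]:
    """Build a hifiasm Hi-C command line as an argument list (illustrative helper)."""
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads),
           "--h1", hic_r1, "--h2", hic_r2]
    if disable_base_homology:
        # Suggested in this thread: turn off base-level homology detection.
        cmd += ["--s-base", "-1"]
    cmd.append(hifi_reads)
    return cmd


# Placeholder inputs; replace with real paths before running.
cmd = hifiasm_hic_command("asm", "hifi.fq.gz", "hic_R1.fq.gz", "hic_R2.fq.gz")
print(shlex.join(cmd))
```

The printed line can be pasted into a job script, or the list can be passed to `subprocess.run(cmd, check=True)` directly.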