chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate Hifi reads
MIT License

Difference between hap1 and hap2's N50 gets larger when reads are subsampled #314

Open LYC-vio opened 1 year ago

LYC-vio commented 1 year ago

Hi,

Thank you very much for developing this amazing tool!

Recently, I ran hifiasm (v0.16.1) on CCS reads that were subsampled to different coverages with rasusa, and noticed that the difference between the hap1 and hap2 N50 got larger when the reads were subsampled (assemblies evaluated with Quast). Is this expected behavior, or could it be caused by the subsampling tool (rasusa)?

Original reads (~56x NA24385 CCS) assembly N50: hap1: 67990447 hap2: 55015309

subsampled to 50x: hap1: 80331069 hap2: 48577680

40x: hap1: 61310474 hap2: 38985709

30x: hap1: 50351921 hap2: 33726358

20x: hap1: 11535826 hap2: 6595624

For 50x down to 20x, the hap2 N50 seems to be around 60% of the hap1 N50.
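
The ratio can be checked directly from the numbers listed above. The following is a minimal Python sketch that uses only those N50 values; the names `n50`, `hap_ratio`, and `ratios` are made up for illustration and are not part of hifiasm, rasusa, or Quast.

```python
# N50 values (bp) copied from the list above: coverage -> (hap1, hap2).
# The dictionary and function names are hypothetical, for illustration only.
n50 = {
    "56x": (67990447, 55015309),
    "50x": (80331069, 48577680),
    "40x": (61310474, 38985709),
    "30x": (50351921, 33726358),
    "20x": (11535826, 6595624),
}


def hap_ratio(hap1: int, hap2: int) -> float:
    """Return the hap2 N50 as a fraction of the hap1 N50."""
    return hap2 / hap1


# Compute the hap2/hap1 ratio for every coverage level.
ratios = {cov: hap_ratio(h1, h2) for cov, (h1, h2) in n50.items()}

for cov, ratio in ratios.items():
    print(f"{cov}: hap2/hap1 N50 = {ratio:.1%}")
```

With these inputs the subsampled runs come out at roughly 57% to 67%, compared with about 81% for the original ~56x reads.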

The command used to run hifiasm was: hifiasm -o NA24385.asm -t32 ${reads}

Thank you.

Best

chhylp123 commented 1 year ago

Thanks. The N50s at 56x and 50x sound reasonable to me. 50x probably already reaches the capacity of HiFi reads.