chhylp123 / hifiasm

Hifiasm: a haplotype-resolved assembler for accurate HiFi reads
MIT License

hifiasm issue with tetraploid genome assembly #607

Open RezwanCAAS opened 8 months ago

RezwanCAAS commented 8 months ago

Hi, I am assembling the genome of an allotetraploid plant species from three cells of PacBio HiFi reads. I ran the following command with hifiasm version 0.19.8:

hifiasm -o axm_assembly -t 32 --n-hap 4 axmcell*

This command produced only two files, hap1 and hap2, but I expected four hap files, given the --n-hap 4 option.

Later I ran hifiasm with the default settings. That run also produced hap1 and hap2, which is what I expected from the defaults:

hifiasm -o axm_assembly -t 32 axmcell*

Is something wrong here, or is there a different command I should use for a tetraploid? Any suggestions would be appreciated.

tallnuttrbgv commented 7 months ago

I have the same issue with --n-hap 3: I only get two haplotype assemblies.

RezwanCAAS commented 7 months ago

@tallnuttrbgv I have an idea based on previous publications. Run hifiasm with the default settings (i.e., --n-hap 2). Then take the primary assembly and scaffold it to chromosome level with Hi-C. Run EDTA to annotate the repetitive elements, then select a major repeat family that covers all of your chromosomes. Finally, align those repeat copies against each other; clustering them should separate the subgenomes.

tallnuttrbgv commented 7 months ago

Unfortunately, we will not have Hi-C for this data. Also, Hi-C will not scaffold correctly if homologous chromosomes have been concatenated into single contigs, as in our case. I need to break those contigs manually.
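
For anyone who has to do the same manual breaking, here is a minimal sketch of splitting one contig at a chosen position using only POSIX sh and awk. The function name `split_contig`, the file `assembly.fa`, the contig name `contigX`, and the breakpoint are all hypothetical; where to break has to come from your own evidence (for example, alignments or coverage), and this does not decide that for you.

```shell
# split_contig FASTA CONTIG POS
# Writes FASTA to stdout. CONTIG is split after base POS (1-based) into
# CONTIG_a and CONTIG_b; every other record is passed through.
# Note: sequences are written on a single line, and header text after the
# first whitespace is dropped.
split_contig() {
    awk -v target="$2" -v pos="$3" '
        function emit() {
            if (name == "") return
            if (name == target) {
                print ">" name "_a"; print substr(seq, 1, pos)
                print ">" name "_b"; print substr(seq, pos + 1)
            } else {
                print ">" name; print seq
            }
        }
        /^>/ { emit(); name = substr($1, 2); seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    ' "$1"
}

# Hypothetical usage (file, contig name, and position are placeholders):
# split_contig assembly.fa contigX 5000000 > assembly.split.fa
```

The whole sequence of each contig is held in memory while it is processed, which is simple but may be slow for very large assemblies.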

chhylp123 commented 7 months ago

Sorry for the late reply; I have been too busy over the last few weeks. Right now, only the Hi-C module can output more than two haplotypes. We would like to add this to the non-Hi-C module soon as well.
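
For readers who do have Hi-C data, a run through the Hi-C module might look like the sketch below. It reuses the options from the original command and adds hifiasm's `--h1` and `--h2` options for paired Hi-C reads. The Hi-C file names and the wrapper function `run_hifiasm_hic` are hypothetical, and this thread does not confirm that the run yields four hap files for any particular genome. The wrapper only prints the command unless `DRY_RUN=0` is set.

```shell
# run_hifiasm_hic HIC_R1 HIC_R2 HIFI_READS...
# Builds a Hi-C-integrated hifiasm command, prints it, and runs it only
# when DRY_RUN=0 is set in the environment.
run_hifiasm_hic() {
    h1=$1
    h2=$2
    shift 2
    # Same options as the original command, plus the paired Hi-C reads.
    set -- hifiasm -o axm_assembly -t 32 --n-hap 4 --h1 "$h1" --h2 "$h2" "$@"
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# hic_R1.fastq.gz and hic_R2.fastq.gz are placeholder names for the Hi-C reads.
run_hifiasm_hic hic_R1.fastq.gz hic_R2.fastq.gz axmcell*
```

Printing the command first makes it easy to check the file list that `axmcell*` expands to before committing to a long assembly run.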