Open Aannaw opened 2 years ago
NextDenovo is known to produce smaller assemblies. It gets a 2.9Gb human CHM13 assembly, but the real size should be 3.05Gb. Hifiasm gets a 3.04Gb assembly instead, much closer to the truth. NextDenovo will have more problems with more repetitive genomes.
What is the extra 800M?
Mostly centromeric repeats and some segmental duplications.
Hi ~ how should we deal with the extra, unanchored 800M to improve the anchoring rate (on HiFi)? Can we remove it for subsequent analysis?
Is the extra 800M generated because HiFi sequencing is more precise than Nanopore?
Given the nature of CCS sequencing, I suspect it will produce more repeats. But I haven't seen a comparison of how HiFi and Nanopore behave on the same genome.
Hello, Professor. I have HiFi + Hi-C and Nanopore + Hi-C data. The draft assembly of the HiFi data was produced by hifiasm, and the draft assembly of the Nanopore data was produced by NextDenovo. The resulting draft assemblies differ by about 800M (HiFi: 3.3G, Nanopore: 2.5G).
Then I used juicer + 3d-dna to anchor the two draft genomes into scaffolds. I got similar anchoring results for the two draft genomes: 32 pseudo-chromosomes were anchored, and a little more than 2.4G was anchored in each. But the anchoring rates of the two genomes are quite different: HiFi: 2.4G/3.3G = 72%; Nanopore: 2.4G/2.5G = 96%.
It seems that the low anchoring rate of the HiFi draft genome is because the hifiasm assembly contains more sequence than the NextDenovo assembly, while Hi-C anchors about the same amount in both. I have 8 genomes: 2 are HiFi and 6 are Nanopore. The results are consistent with the above. I can now determine that the total chromosome size of my species is 2.4~2.5G, because the results for all eight genomes are the same.
In the draft assembly, the extra 800M of HiFi is also the part that cannot be anchored to the chromosomes. In the heat map, this part shows no interaction signal.
What is the extra 800M?
[Image: HiFi + Hi-C heatmap]
[Image: Nanopore + Hi-C heatmap]
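For anyone checking the numbers quoted above, here is a minimal sketch of the anchoring-rate arithmetic. It uses only the rounded sizes from the post (3.3G and 2.5G draft assemblies, about 2.4G anchored in each). The function name `anchoring_summary` is made up for illustration and is not part of juicer, 3d-dna, or either assembler.

```python
def anchoring_summary(assembly_gb, anchored_gb):
    """Return (anchoring rate, unanchored size in Gb) for one draft assembly."""
    rate = anchored_gb / assembly_gb
    unanchored_gb = assembly_gb - anchored_gb
    return rate, unanchored_gb


# Rounded sizes taken from the post; real values would come from the assemblies.
hifi_rate, hifi_unanchored = anchoring_summary(3.3, 2.4)
ont_rate, ont_unanchored = anchoring_summary(2.5, 2.4)

print(f"HiFi:     {hifi_rate:.1%} anchored, {hifi_unanchored:.1f}G unanchored")
print(f"Nanopore: {ont_rate:.1%} anchored, {ont_unanchored:.1f}G unanchored")

# Difference in unanchored sequence between the two drafts.
extra_gb = hifi_unanchored - ont_unanchored
print(f"Extra unanchored sequence in the HiFi draft: {extra_gb:.1f}G")
```

With these rounded inputs the HiFi draft leaves about 0.9G unanchored and the Nanopore draft about 0.1G, so the 0.8G gap between the two drafts is essentially the same as the gap in unanchored sequence. That is consistent with the observation that the extra 800M is the part Hi-C could not place.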