Nextomics / NextPolish

Fast and accurately polish the genome generated by long reads.
GNU General Public License v3.0

Reduction in genome size after polishing #100

Shenu-Hudson closed 2 years ago

Shenu-Hudson commented 2 years ago

I noticed that the genome size and other assembly statistics decrease after every round of polishing. Could someone explain why this happens and how to handle it?

Initial genome size: 1092812176 bp
After round 1 polishing: 1092776966 bp (difference: 35210 bp)
After round 2 polishing: 1092776883 bp (round 1 - round 2: 83 bp)
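To put these numbers in proportion, here is a minimal Python sketch that recomputes the per-round loss from the sizes reported above and expresses it as a fraction of the previous size. The dictionary and the `size_changes` helper are illustrative names, not part of NextPolish.

```python
# Genome sizes (bp) as reported in this thread.
sizes = {
    "initial": 1092812176,
    "round 1": 1092776966,
    "round 2": 1092776883,
}


def size_changes(sizes):
    """Return (stage, bp lost vs. previous stage, loss as % of previous stage)."""
    labels = list(sizes)  # dicts keep insertion order
    changes = []
    for prev, curr in zip(labels, labels[1:]):
        lost = sizes[prev] - sizes[curr]
        changes.append((curr, lost, 100.0 * lost / sizes[prev]))
    return changes


for stage, lost, pct in size_changes(sizes):
    print(f"{stage}: {lost} bp shorter ({pct:.5f}% of the previous size)")
```

The first round removes roughly 0.003% of the assembly and the second round far less, so the reduction is very small relative to a genome of about 1.09 Gb.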

Any help would be appreciated.

moold commented 2 years ago

Check whether the raw genome and the polished genome have the same number of scaffolds. If they do, the raw genome contains more insertion errors than deletion errors: polishing removes the spuriously inserted bases, so the total length shrinks.
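The check suggested above could be sketched in plain Python as follows. This is only an illustration under stated assumptions: both assemblies are uncompressed FASTA files, and the polished file keeps the same sequence IDs as the raw one (verify this for your own output). The function names are hypothetical and not part of NextPolish.

```python
def fasta_lengths(path):
    """Map each sequence ID (first word of the header line) to its length in bp."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                header = line[1:].split()
                name = header[0] if header else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def length_deltas(raw_path, polished_path):
    """Print scaffold counts and totals; return per-scaffold length change (polished - raw)."""
    raw = fasta_lengths(raw_path)
    polished = fasta_lengths(polished_path)
    print(f"scaffolds: raw={len(raw)} polished={len(polished)}")
    print(f"total bp:  raw={sum(raw.values())} polished={sum(polished.values())}")
    # Assumption: IDs are unchanged by polishing; IDs missing from either file are skipped.
    return {sid: polished[sid] - raw[sid] for sid in raw if sid in polished}
```

Calling `length_deltas("raw.fasta", "polished.fasta")` (placeholder file names) returns a negative value for every scaffold that became shorter. If the scaffold counts match and the loss is spread as small negative changes across many scaffolds, that fits the explanation of net insertion errors being corrected rather than sequences being dropped.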

Shenu-Hudson commented 2 years ago

Thank you for your reply.

Scaffold numbers are the same for raw and polished genomes.

This assembly is a hybrid of PacBio + optical mapping + Hi-C data. Can we trust these insertion errors?

moold commented 2 years ago

You can run some assessments, such as comparing BUSCO scores or mapping RNA-seq data to the raw and polished genomes.