Nextomics / NextPolish

Fast and accurately polish the genome generated by long reads.
GNU General Public License v3.0

The process of polishing causes a significant decrease in genome size. #130

Open xujialupaoli opened 7 months ago

xujialupaoli commented 7 months ago

Hello, and thank you for providing such useful assembly software. I have a question and would appreciate your help. After NextPolish polished my assembly, the genome size dropped considerably. Why does polishing cause such a large decrease in genome size? The attached picture lists the genome size as assembled by NextDenovo, the size after purge_dups, and the size after NextPolish. (image)
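For readers who want to reproduce this kind of stage-by-stage comparison, the following is a minimal sketch that sums the bases in each FASTA file and reports the change relative to the previous stage. It is not part of NextPolish; the function names `fasta_stats` and `compare_stages` are hypothetical, and it assumes plain-text (uncompressed) FASTA input.

```python
def fasta_stats(path):
    """Return (number_of_sequences, total_bases) for a plain-text FASTA file."""
    n_seqs = 0
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_seqs += 1  # header line starts a new sequence
            else:
                total += len(line)  # sequence line: count its bases
    return n_seqs, total


def compare_stages(stages):
    """Compare assembly size across pipeline stages.

    stages: list of (label, fasta_path) in pipeline order.
    Returns a list of (label, n_seqs, total_bases, delta_vs_previous),
    where delta_vs_previous is None for the first stage.
    """
    rows = []
    previous_total = None
    for label, path in stages:
        n_seqs, total = fasta_stats(path)
        delta = None if previous_total is None else total - previous_total
        rows.append((label, n_seqs, total, delta))
        previous_total = total
    return rows
```

A call such as `compare_stages([("nextdenovo", "asm.fa"), ("purge_dups", "purged.fa"), ("nextpolish", "polished.fa")])` (file names are placeholders) would show at which stage the bases are lost and whether the number of sequences changes as well.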

moold commented 7 months ago

This may be caused by NextPolish not finishing its run. Could you paste its running log here?
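Alongside the log, a per-contig comparison can hint at whether a run stopped early: contigs that are absent from the output point to an incomplete run, while contigs that are present but much shorter point to something else. The sketch below is an illustration only, with hypothetical function names. It assumes that contig names are identical in the input and output files, which may not hold if the polisher renames sequences; in that case the names would need to be normalized first.

```python
def contig_lengths(path):
    """Map each contig name (first word of the header) to its length."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def diff_contigs(before, after, min_loss=0.05):
    """Compare two name -> length dicts.

    Returns (missing, shrunk):
      missing: sorted names present in `before` but absent from `after`.
      shrunk:  (name, old_len, new_len) for contigs that lost at least
               `min_loss` of their length, largest absolute loss first.
    """
    missing = sorted(name for name in before if name not in after)
    shrunk = []
    for name, old_len in before.items():
        new_len = after.get(name)
        if new_len is None or old_len == 0:
            continue
        if (old_len - new_len) / old_len >= min_loss:
            shrunk.append((name, old_len, new_len))
    shrunk.sort(key=lambda row: row[1] - row[2], reverse=True)
    return missing, shrunk
```

Calling `diff_contigs(contig_lengths("input.fa"), contig_lengths("polished.fa"))` (placeholder file names) returns the two lists; the 5% threshold is an arbitrary default chosen for this example.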

xujialupaoli commented 7 months ago

Thank you very much for your reply! Here are the logs: first_polish_log.txt, second_polish_log.txt. I first polished the assembly for three rounds with quality-filtered ultralong ONT reads and regular ONT reads, setting 'task = best', 'rewrite = yes', and 'rerun = 3' in the parameter config file. The ONT-polished assembly was then polished for a further three rounds with PacBio HiFi reads, using the same settings ('task = best', 'rewrite = yes', 'rerun = 3') in the parameter config file.

moold commented 7 months ago

It would be better to use NextPolish2 to polish your genome (the assembly already polished with ONT data) with the HiFi reads.

xujialupaoli commented 7 months ago

Thanks for your reply and help! The assembly I produced with hifiasm from the HiFi data is not very good, so I am currently using the NextDenovo assembly, which is better. My data is a bit unusual, and the NextDenovo assembly probably contains more errors. I have read carefully about NextPolish2, which you developed, and it appears to be intended for polishing high-quality assemblies. Is NextPolish more suitable for my situation? In addition, because the HiFi data is difficult to assemble, I would like to use it in the polishing step instead. Finally, why does polishing result in a significant decrease in genome size?

moold commented 7 months ago

  1. NextPolish2 is suitable for your assembly, so you can give it a try.
  2. There may be an unknown bug, but I cannot fix it unless I can reproduce it.

xujialupaoli commented 7 months ago

Thank you for your reply and help! I will try NextPolish2 to see whether it resolves the issue. Could it be that my HiFi data is somewhat unusual and triggers an unknown problem in NextPolish? I ask because I did not see any significant reduction in genome size when polishing with ONT data. (image)

moold commented 7 months ago

I'm not sure; I can't tell unless I can reproduce the error.

xujialupaoli commented 7 months ago

I polished the assembly with NextPolish2 and found that the genome size still dropped significantly, by about 30 Mb. Do you know what causes this? I very much look forward to your reply. (image)

lijphd168866 commented 1 month ago

Hello, has your problem been solved? I'm having the same problem and look forward to hearing from you. Thank you!

xujialupaoli commented 1 month ago

Not yet.

lijphd168866 commented 1 month ago

Thank you so much for replying so quickly. May I ask how you polished your genome in the end?