BGI-Qingdao / TGS-GapCloser

A gap-closing software tool that uses long reads to enhance genome assembly.
GNU General Public License v3.0

some questions about TGS-GapCloser #16

Closed longzhangnation closed 4 years ago

longzhangnation commented 4 years ago

Hi, I have used TGS-GapCloser for gap filling on my genome, and I have some questions. First, I used about 15X sequencing depth of data for gap filling and then checked BUSCO; the BUSCO result improved. I actually have about 100X Nanopore data for genome assembly. Do you think the result would be much better if I used all of the sequencing data rather than 15X? Or would about 40X depth give the best result? Have you tested this? Second, I noticed that the software can use pilon for NGS correction. I have NGS data too, but I want to run pilon myself. Can I polish my genome with Illumina data using pilon myself after TGS-GapCloser, rather than having the software do it automatically? Or does the software correct the TGS data with NGS sequences? I hope you can give me some advice. Thank you.

adonis316 commented 4 years ago

Hi,

  1. We have seen greater improvement in BUSCO results after gap filling with more ONT or PacBio long reads (up to 40X), because the additional complementary information improves assembly completeness. However, the performance of gap filling depends on both the sequencing coverage of the long reads and the quality of the input scaffolds. We cannot guarantee a linear improvement with more ONT reads, since misassembled gaps cannot be closed.

Another problem is the computing time required for such high coverage of long reads. I would suggest splitting your data into 3 to 5 groups and closing the gaps iteratively, one group at a time. This is necessary if your genome is large and the built-in error correction is required.

  2. Of course, you can polish your assembly after gap filling. We provide three modes: polish the long reads with NGS data using pilon, polish the long reads against themselves using racon, or do not polish at all. The parameter “--ne” disables error correction. Note that the built-in error correction only corrects the segments of long reads that are mapped to the gap regions.
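The splitting suggested in point 1 could be done in many ways; the following is only a minimal Python sketch, assuming the long reads are in FASTA format and that a simple round-robin split is acceptable. The function names `read_fasta` and `split_reads` are hypothetical helpers for illustration and are not part of TGS-GapCloser.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_reads(records, n_groups):
    """Distribute records round-robin into n_groups lists (hypothetical helper)."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    groups = [[] for _ in range(n_groups)]
    for i, record in enumerate(records):
        groups[i % n_groups].append(record)
    return groups


if __name__ == "__main__":
    demo = [">r1", "ACGT", "AC", ">r2", "GG", ">r3", "TT", ">r4", "A"]
    for idx, group in enumerate(split_reads(read_fasta(demo), 3)):
        print(idx, [name for name, _ in group])
```

Each group would then be written to its own file and used for one round of gap closing, with the scaffolds produced by one round presumably serving as the input to the next.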

Hope this helps.

Thanks, Mengyang

longzhangnation commented 4 years ago

Regarding the second reply: the software will polish the long reads with NGS data using pilon, rather than polish the gap-filled genome. Am I right?

adonis316 commented 4 years ago

The software first selects several long reads that could be mapped to the gap region, and then only polishes the mapped segments of candidate long reads. It will not correct or polish the input assembly.
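To make the idea concrete, here is a rough Python sketch of selecting candidate reads for one gap. It is not the tool's actual code: the `Alignment` type, the overlap test, the ranking by overlap length, and the `max_reads` cutoff are all assumptions made for illustration. The point it shows is that only read segments are selected for polishing, while the scaffold itself is never modified.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record of one long read aligned to a scaffold."""
    read_id: str
    ref_start: int  # 0-based, inclusive
    ref_end: int    # exclusive


def overlap_length(aln, gap_start, gap_end):
    """Number of scaffold positions shared by the alignment and the gap."""
    return max(0, min(aln.ref_end, gap_end) - max(aln.ref_start, gap_start))


def select_candidates(alignments, gap_start, gap_end, max_reads):
    """Pick up to max_reads alignments that overlap the gap.

    Ranking by overlap length is an assumption for this sketch,
    not a documented selection criterion of the software.
    """
    hits = [a for a in alignments if overlap_length(a, gap_start, gap_end) > 0]
    hits.sort(key=lambda a: overlap_length(a, gap_start, gap_end), reverse=True)
    return hits[:max_reads]


if __name__ == "__main__":
    alns = [
        Alignment("a", 0, 100),
        Alignment("b", 90, 210),
        Alignment("c", 300, 400),
        Alignment("d", 150, 160),
    ]
    # Only the selected reads would go on to polishing; the scaffold is untouched.
    print([a.read_id for a in select_candidates(alns, 100, 200, 2)])
```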

longzhangnation commented 4 years ago

OK. Thanks for your reply. I will continue my analysis.