BGI-Qingdao / TGS-GapCloser

A gap-closing software tool that uses long reads to enhance genome assembly.
GNU General Public License v3.0

Scaffold becomes truncated after TGSGapCandidate #81

Open smallfishcui opened 7 months ago

smallfishcui commented 7 months ago

Hi,

I have the same issue as #80. I am trying to close the gaps in a chromosome whose total length is around 30 Mbp. However, the resulting PREFIX.ont.fasta after TGSGapCandidate contains only telomere sequences, and the size drops to 23 Kbp. Here is my command:

tgsgapcloser --scaff PaChr16.fa --racon /opt/racon/build/bin/racon --thread 40 --output PPaChr16.gapclose.fa --tgstype pb --chunk 40 --min_match 1kb --minmap_arg '-x ava-pb' --reads combined.fasta >pipe.log 2>pipe.err

I am running the same command on the other chromosomes, and those seem to be fine.

Here is part of the log in PREFIX.cand.log:

TGSGapCandidate INFO EET 2024/4/9 1:4:23 : TGSGapCandidata start now ...
TGSGapCandidate INFO EET 2024/4/9 1:4:23 : LoadONTReads start now ...
TGSGapCandidate INFO EET 2024/4/9 1:6:51 : >total load ONT reads : 3381271
TGSGapCandidate INFO EET 2024/4/9 1:6:51 : LoadONTReads finish. used wall clock : 148 seconds, cpu time : 136.201447 seconds
TGSGapCandidate INFO EET 2024/4/9 1:6:51 : LoadPAF start now ...
TGSGapCandidate INFO EET 2024/4/9 1:9:33 : >the read2contig freq is
1 677720
2 1504439
3 327

TGSGapCandidate INFO EET 2024/4/9 1:9:33 : >the contig2read_num_freq freq is
327 1
1798079 1
1889173 1

TGSGapCandidate INFO EET 2024/4/9 1:9:33 : >the contig2a_read_freq freq is
1 1076124
2 709410
3 484785
4 358703
5 260575
6 188220
7 143513
8 116317
9 91090
10 67811
11 49338
12 35427

Do you know why?

thanks, Cui

adonis316 commented 7 months ago

It seems that only very few long reads can be mapped to the gap regions of the input scaffolds. Please check, using minimap2 or another long-read mapper, whether the input long reads actually map to the gap regions.
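One way to run that check is sketched below in Python. It is only an illustration under stated assumptions, not part of TGS-GapCloser: it assumes gaps are runs of N in the scaffold FASTA, and that a PAF file was produced separately by mapping the reads (query) against the scaffold (target), for example with something like minimap2 -x map-pb PaChr16.fa combined.fasta > reads_vs_scaffold.paf. The file name reads_vs_scaffold.paf, the helper names find_gaps and count_bridging_reads, and the flank size are all hypothetical choices.

```python
import re
from typing import Dict, Iterable, List, Tuple

# (scaffold name, gap start, gap end), 0-based and half-open like PAF/BED
Gap = Tuple[str, int, int]


def find_gaps(fasta_lines: Iterable[str], min_len: int = 1) -> List[Gap]:
    """Return every run of N/n of at least min_len bases in a FASTA file."""
    gaps: List[Gap] = []
    name = None
    chunks: List[str] = []

    def flush() -> None:
        # Scan the record collected so far for runs of N.
        if name is None:
            return
        seq = "".join(chunks)
        for m in re.finditer(r"[Nn]+", seq):
            if m.end() - m.start() >= min_len:
                gaps.append((name, m.start(), m.end()))

    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            flush()
            name = (line[1:].split() or [""])[0]
            chunks = []
        elif line:
            chunks.append(line)
    flush()
    return gaps


def count_bridging_reads(
    paf_lines: Iterable[str], gaps: List[Gap], flank: int = 1000
) -> Dict[Gap, int]:
    """Count reads with alignments touching both flanks of each gap.

    Assumes PAF columns 1, 6, 8, 9 are read name, target name,
    target start, and target end (reads mapped against the scaffold).
    A read counts as bridging if it has one alignment spanning the gap
    or separate alignments in the left and right flank windows.
    """
    left = {gap: set() for gap in gaps}
    right = {gap: set() for gap in gaps}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        read, tname = fields[0], fields[5]
        tstart, tend = int(fields[7]), int(fields[8])
        for gap in gaps:
            gname, gstart, gend = gap
            if tname != gname:
                continue
            # Overlaps the window [gstart - flank, gstart)
            if tstart < gstart and tend > gstart - flank:
                left[gap].add(read)
            # Overlaps the window [gend, gend + flank)
            if tstart < gend + flank and tend > gend:
                right[gap].add(read)
    return {gap: len(left[gap] & right[gap]) for gap in gaps}


# Hypothetical usage:
# with open("PaChr16.fa") as fa, open("reads_vs_scaffold.paf") as paf:
#     gaps = find_gaps(fa)
#     for gap, n in count_bridging_reads(paf, gaps).items():
#         print(gap, n)
```

Gaps that report zero or very few bridging reads are the ones for which no candidate sequence can be expected. The inner loop compares every alignment with every gap, which is acceptable for a single chromosome but would need an interval index for a whole genome.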

You might also change the minimap2 setting, for example by passing --minmap_arg '-x ava-ont' instead of '-x ava-pb', to allow more long reads to be mapped to this gap region.

Thanks, Mengyang