mrmckain / Fast-Plast

Automated de novo assembly of whole chloroplast genomes.
MIT License

issue with generating single contig chloroplast complete genome #64

Open dasn588 opened 2 months ago

dasn588 commented 2 months ago

Hi,

I have ~37.8 million Illumina paired-end reads (plant whole-genome sequencing, with the chloroplast and mitochondrial genomes included). I tried assembling the chloroplast genome; according to the plastome summary file, around 4.39 million reads were assigned to the chloroplast. However, I am not able to generate a single contig.

After afin, there are 2 contigs with a maximum size of 112032 and a minimum size of 15191.
Fri Aug 23 12:13:12 2024 Removing nested contigs.
Fri Aug 23 12:13:16 2024 Checking chloroplast gene recovery in contigs.
Fri Aug 23 12:13:20 2024 Checking chloroplast gene recovery in contigs.
Fri Aug 23 12:13:25 2024 Checking chloroplast gene recovery in contigs.
Checking coverage of afin output with 2 contigs after contamination removal.
83.9506172839506% of known angiosperm chloroplast genes were recovered in afin_iter0.fa.
Fri Aug 23 12:13:30 2024 Starting scaffolding with SSPACE.
Using command /data/ngs/programs/Fast-Plast/afin/afin -c AL-1_new.final.scaffolds.fasta -r ../2_BowtieMapping/map* -l 50 -f .1 -d 100 -x 1215 -p 15 -i 2 -o afin --no_fusion.
After afin, there are 2 contigs with a maximum size of 116968 and a minimum size of 21583.
Fri Aug 23 12:20:37 2024 Removing nested contigs.
Fri Aug 23 12:20:40 2024 Checking chloroplast gene recovery in contigs.
Fri Aug 23 12:20:45 2024 Checking chloroplast gene recovery in contigs.
Fri Aug 23 12:20:50 2024 Checking chloroplast gene recovery in contigs.
83.9506172839506% of known angiosperm chloroplast genes were recovered in afin_iter0.fa.
Fri Aug 23 12:20:55 2024 Starting scaffolding with SSPACE.
Cannot scaffold contigs into a single piece. Coverage is too low or poorly distributed across plastome.

How can I resolve this issue and get the complete chloroplast genome as a single contig?

xhx-1 commented 1 month ago

Have you solved the problem yet?

dasn588 commented 1 month ago

No, it is still not solved. When I use the check coverage parameter for the assembly, it generates two contigs, and the log file shows that coverage is not uniform across the plastome, so scaffolding fails. Without the check coverage parameter, the tool generates a single contig, but the log file warns that the assembly may not be reliable because of the coverage issue.


mrmckain commented 1 month ago

Hi--

Sorry for the delayed response. I think that this is similar to another issue. Try subsampling your reads from 37 million to 1-2 million. That should help.

Best, Michael
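
For anyone landing here, the subsampling suggested above could be done with a small script like the one below. This is only a sketch and not part of Fast-Plast: it assumes two paired FASTQ files with mates in the same order and exactly four lines per record, and the function names and file names are hypothetical. It uses reservoir sampling, so the total number of read pairs does not need to be known in advance, and the same pairs are kept from both files so the mates stay in sync.

```python
import gzip
import random


def open_text(path, mode="r"):
    """Open plain or gzip-compressed text, chosen by the .gz suffix."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def fastq_records(handle):
    """Yield FASTQ records as 4-line tuples (assumes unwrapped records)."""
    while True:
        record = tuple(handle.readline() for _ in range(4))
        if not record[0]:
            return
        yield record


def subsample_pairs(r1_path, r2_path, out1_path, out2_path, n_pairs, seed=42):
    """Keep a uniform random sample of n_pairs read pairs.

    Returns (total pairs seen, pairs written). The sampled pairs are held
    in memory, so a few million pairs can need several GB of RAM.
    """
    rng = random.Random(seed)  # fixed seed makes the sample reproducible
    reservoir = []
    total = 0
    with open_text(r1_path) as r1, open_text(r2_path) as r2:
        for pair in zip(fastq_records(r1), fastq_records(r2)):
            total += 1
            if len(reservoir) < n_pairs:
                reservoir.append(pair)
            else:
                # Replace a stored pair with probability n_pairs / total.
                j = rng.randrange(total)
                if j < n_pairs:
                    reservoir[j] = pair
    with open_text(out1_path, "w") as o1, open_text(out2_path, "w") as o2:
        for rec1, rec2 in reservoir:
            o1.writelines(rec1)
            o2.writelines(rec2)
    return total, len(reservoir)
```

A call such as `subsample_pairs("reads_R1.fastq.gz", "reads_R2.fastq.gz", "sub_R1.fastq", "sub_R2.fastq", 2_000_000)` (file names are placeholders) would write about 2 million pairs, which can then be passed to Fast-Plast in place of the full read set.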