Open NeMeh21 opened 5 years ago
Sorry for the delay on this; I missed the email notification. 1) Do you think you have enough data for a complete assembly? 2) What does the current assembly look like (length, total contigs)? 3) What lineage are you working in?
Hi, sorry for jumping in. How can I check if I have enough data for assembly?
It depends on the sample, but if your coverage is at least 20X, you can usually get a plastome. Issues can arise if you are using by-catch from sequence capture or if you have too much data. There can also be an issue if the assembly breaks in the middle of a single-copy region or if the assembly loops around the whole plastome more than once. If you suspect the former, we will have to chat about how to overcome it. If you have the latter, see the troubleshooting section I am adding.
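For anyone wanting a quick check against the 20X figure, here is a minimal sketch of the usual depth estimate (total plastid bases divided by plastome length). The function name and the example numbers are hypothetical and only for illustration; the number of plastid reads would have to come from your own mapping results.

```python
def estimate_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Approximate mean depth: total sequenced bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return num_reads * read_length / genome_length


# Hypothetical numbers, not taken from any sample in this thread.
plastome_reads = 40_000      # reads that map to a plastome reference
read_length = 150            # bp per read (assumed)
plastome_length = 150_000    # expected plastome size in bp (assumed)

depth = estimate_coverage(plastome_reads, read_length, plastome_length)
print(f"Estimated depth: {depth:.1f}X")
```

This is only a rough estimate: it assumes every counted read is plastid-derived and full length after trimming.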
Hi, I encountered the same problem. The NAME_afin_iter2.fa file in question contains a single contig with a length of 115,119 bp. Here is some information that may help.
16667700 reads; of these:
  16492352 (98.95%) were paired; of these:
    15994654 (96.98%) aligned concordantly 0 times
    321885 (1.95%) aligned concordantly exactly 1 time
    175813 (1.07%) aligned concordantly >1 times
    15994654 pairs aligned concordantly 0 times; of these:
      85036 (0.53%) aligned discordantly 1 time
    ----
    15909618 pairs aligned 0 times concordantly or discordantly; of these:
      31819236 mates make up the pairs; of these:
        31640149 (99.44%) aligned 0 times
        27500 (0.09%) aligned exactly 1 time
        151587 (0.48%) aligned >1 times
  175348 (1.05%) were unpaired; of these:
    168512 (96.10%) aligned 0 times
    3746 (2.14%) aligned exactly 1 time
    3090 (1.76%) aligned >1 times
4.08% overall alignment rate
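To make the summary above easier to interpret, the counts can be added up into a single number of aligned reads. This is a sketch that only re-derives the reported overall alignment rate from the figures in the log; the variable names are mine, and it assumes the summary follows the layout that Bowtie 2 prints (pairs counted once, mates and unpaired reads counted individually).

```python
# Counts copied from the alignment summary in this comment.
paired = 16_492_352            # read pairs
unpaired = 175_348             # single reads
concordant_once = 321_885      # pairs aligned concordantly exactly 1 time
concordant_multi = 175_813     # pairs aligned concordantly >1 times
discordant_once = 85_036       # pairs aligned discordantly 1 time
mates_once = 27_500            # individual mates aligned exactly 1 time
mates_multi = 151_587          # individual mates aligned >1 times
unpaired_once = 3_746          # unpaired reads aligned exactly 1 time
unpaired_multi = 3_090         # unpaired reads aligned >1 times

# Each pair contributes two reads; mates and unpaired reads count once each.
total_reads = 2 * paired + unpaired
aligned_reads = (
    2 * (concordant_once + concordant_multi + discordant_once)
    + mates_once + mates_multi
    + unpaired_once + unpaired_multi
)

rate = 100 * aligned_reads / total_reads
print(f"{aligned_reads} of {total_reads} reads aligned ({rate:.2f}%)")
# prints "1351391 of 33160052 reads aligned (4.08%)"
```

The aligned-read count could then be combined with the read length (not stated in this thread) and the plastome length to get a rough depth estimate.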
Use of uninitialized value $final_start in hash element at /data/00/user/user103/software/05.genomic/Fast-Plast/bin/sequence_based_ir_id.pl line 120, <$file> line 2.
Use of uninitialized value $final_end in hash element at /data/00/user/user103/software/05.genomic/Fast-Plast/bin/sequence_based_ir_id.pl line 120, <$file> line 2.
Argument "" isn't numeric in sort at /data/00/user/user103/software/05.genomic/Fast-Plast/bin/sequence_based_ir_id.pl line 123, <$file> line 2.
Argument "" isn't numeric in subtraction (-) at /data/00/user/user103/software/05.genomic/Fast-Plast/bin/sequence_based_ir_id.pl line 125, <$file> line 2.
I'm going to construct a phylogenetic tree from chloroplast genomes. Could the incompleteness of this assembly affect the phylogenetic result, or can I just use the afin result? I hope the information above helps. Thanks a lot!
Best, Shangzhe
Hi Shangzhe,
Based on what you sent me, my guess is that the scripts that filter out mitochondrial contamination overfiltered. I say this because your contig of 115,119 bp is about 15 kb too short for what I would expect based on your close relative. This can happen with high coverage. Check the SPAdes contig file for a missing piece, probably the small single copy (~15-20 kb), that has coverage similar to a piece that I suspect is ~80 kb. You can pull that contig into the filtered contigs file and rerun afin (see tutorial). Let me know if this doesn't help.
Best, Michael
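One way to screen the SPAdes contig file for such a piece is to read the length and coverage that SPAdes writes into each header (NODE_<n>_length_<bp>_cov_<depth>). The sketch below is not part of Fast-Plast; the function names, the 2-fold coverage tolerance, and the example headers are all hypothetical.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple


class ContigInfo(NamedTuple):
    name: str
    length: int
    coverage: float


# SPAdes-style header: >NODE_<n>_length_<bp>_cov_<depth>
HEADER_RE = re.compile(r"^>(NODE_\d+_length_(\d+)_cov_(\d+(?:\.\d+)?))")


def parse_headers(lines: Iterable[str]) -> Iterator[ContigInfo]:
    """Yield name, length and coverage for every SPAdes-style FASTA header."""
    for line in lines:
        match = HEADER_RE.match(line.strip())
        if match:
            yield ContigInfo(match.group(1), int(match.group(2)), float(match.group(3)))


def find_candidates(contigs: Iterable[ContigInfo], ref_cov: float,
                    min_len: int = 15_000, max_len: int = 20_000,
                    tolerance: float = 2.0) -> List[ContigInfo]:
    """Keep contigs in the length window whose coverage is within
    `tolerance`-fold of the reference coverage (an assumed threshold)."""
    return [
        c for c in contigs
        if min_len <= c.length <= max_len
        and ref_cov / tolerance <= c.coverage <= ref_cov * tolerance
    ]


# Hypothetical headers, for illustration only.
headers = [
    ">NODE_1_length_81234_cov_512.3",
    ">NODE_2_length_25710_cov_1030.8",
    ">NODE_7_length_17950_cov_498.6",
    ">NODE_9_length_16020_cov_35.2",
]
contigs = list(parse_headers(headers))
hits = find_candidates(contigs, ref_cov=contigs[0].coverage)
print([c.name for c in hits])
```

With a real assembly, the headers would be read from the SPAdes contig file (for example by passing an open file handle to `parse_headers`), and the reference coverage would be taken from the large piece suspected to be ~80 kb.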
Hi Michael,
Thanks for your reply. Sorry, I couldn't work out how to rerun just afin and the steps that follow it. Could you give me more details? By the way, I checked the discarded contigs and found a contig named NODE_3_length_18798_cov_18.7286. Should I add it to the filtered contigs file?
Best, Shangzhe
Hi Shangzhe,
Check out this page for running afin: https://github.com/afinit/afin. You will want to use the filtered SPAdes contigs file with the other contig you found added to it. For reads, use all the trimmed reads that Fast-Plast produced.
After you do that, look at the troubleshooting.md file on the Fast-Plast page to see what to do next.
Let me know if you still have trouble.
Best, Michael
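For the step of adding the recovered contig to the filtered contigs, a small script avoids copy-and-paste mistakes. This is only a sketch using plain Python; the function names and the file names in the usage note are hypothetical, so substitute the paths from your own Fast-Plast run. It writes a new file rather than editing the filtered contigs in place.

```python
from typing import Dict


def read_fasta(path: str) -> Dict[str, str]:
    """Read a FASTA file into {header (without '>'): sequence}."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                name = line[1:]
                chunks = []
            elif name is not None and line:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def add_contig(all_contigs: str, filtered_contigs: str,
               contig_name: str, out_path: str) -> int:
    """Copy `contig_name` from the full contig file into a new file that
    also holds every filtered contig. Returns the number of records written."""
    source = read_fasta(all_contigs)
    if contig_name not in source:
        raise KeyError(f"{contig_name} not found in {all_contigs}")
    merged = read_fasta(filtered_contigs)
    merged[contig_name] = source[contig_name]
    with open(out_path, "w") as out:
        for name, seq in merged.items():
            out.write(f">{name}\n{seq}\n")
    return len(merged)
```

A call might look like `add_contig("contigs.fasta", "filtered_contigs.fsa", "NODE_3_length_18798_cov_18.7286", "contigs_for_afin.fsa")`, where the two input file names are placeholders. The resulting file is what you would then pass to afin as the contigs input.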
Hi there! I have been getting an error during completion of the final assembly on two separate datasets. The error is as follows: "Checking coverage of final assembly. Final assembly is the last afin iteration. 79.0123456790123% of known angiosperm chloroplast genes were recovered in FP-L5_afin_iter2.fa. Could not properly orientate the plastome. Either your plastome does not have an IR or there was an issue with the assembly." How can I proceed, or what changes do I need to make to complete the final assembly? Thanks.