mrmckain / Fast-Plast

Automated de novo assembly of whole chloroplast genomes.
MIT License

file chloroplast_gene_composition_of_afin_contigs.txt is 0 bytes #30

Closed sagrikachugh closed 5 years ago

sagrikachugh commented 5 years ago

Hello! I am trying to assemble the chloroplast genome of an alga using Fast-Plast. The data are paired-end. After a complete run, I get six directories in the output:

  1. Trimmed_Reads
  2. Bowtie_Mapping
  3. Spades_Assembly
  4. Afin_Assembly
  5. Plastome_finishing
  6. Final_Assembly

My question is about the files in Final_Assembly: "Chloroplast_gene_composition_of_afin_contigs.txt" and "pk8_fastplast_afin_iter2.fa_positional_genes.blastn" are 0 bytes, whereas "name_fastplast_afin_iter2" is 160.9 kb.

The last few lines of error log are:

sh: line 1: 10440 Segmentation fault /home/sagrika/tools/Fast-Plast-master/bin/ncbi-blast-2.6.0+/bin/blastn -query pk11_fastplast_regions_split3.fsa -db pk11_fastplast_regions_split3.fsa -evalue 1e-40 -outfmt 6 -max_target_seqs 1000000 > pk11_fastplast_regions_split3.fsa.blastn
sh: line 1: 10443 Segmentation fault /home/sagrika/tools/Fast-Plast-master/bin/ncbi-blast-2.6.0+/bin/blastn -query Final_Assembly/pk11_fastplast_afin_iter2.fa -db Angiosperm_Chloroplast_Genes -evalue 1e-40 -outfmt 6 -max_target_seqs 1000000 > Final_Assembly/pk11_fastplast_afin_iter2.fa_positional_genes.blastn
Died at fast-plast.pl line 876.

Please guide me on how to proceed.

sagrikachugh commented 5 years ago

I reinstalled the tool and tried again; it is still dying at line 876. This time, "Chloroplast_gene_composition_of_afin_contigs.txt" gave information about 6 genes and "pk_fastplast_afin_iter2.fa_positional_genes.blastn" is 94.9 kb, whereas "name_fastplast_afin_iter2" is 160.9 kb (same as earlier).

progress log last lines:

Fri Mar 29 14:03:14 2019 Checking chloroplast gene recovery in contigs.
Checking coverage of final assembly.
Final assembly is the last afin iteration.
7.40740740740741% of known angiosperm chloroplast genes were recovered in pk12_fastplast_afin_iter2.fa.
Could not properly orientate the plastome. Either your plastome does not have an IR or there was an issue with the assembly. Best contigs are in /home/sagrika/Fast-Plast/pk12_fastplast/Final_Assembly/pk12_fastplast_afin_iter2.fa. A list of genes in each contig can be found in "Chloroplast_gene_composition_of_final_contigs.txt".

error log:

3.40% overall alignment rate
Died at fast-plast.pl line 876.

mrmckain commented 5 years ago

Sorry for the delay. I missed the emails for these issues.

Fast-Plast was built around a lot of angiosperm expectations, including the angiosperm genes used to help orientate the regions and identify them. Depending on the algal lineage you are working with, some of these expectations might fail. Most algae (that we have cp genomes for) do not have IRs, and an IR is an expectation in the finishing steps. Based on the above, Fast-Plast ran until it reached the message: "Could not properly orientate the plastome. Either your plastome does not have an IR or there was an issue with the assembly. Best contigs are in /home/sagrika/Fast-Plast/pk12_fastplast/Final_Assembly/pk12_fastplast_afin_iter2.fa. A list of genes in each contig can be found in "Chloroplast_gene_composition_of_final_contigs.txt"." It then dies, hence "Died at fast-plast.pl line 876".
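If the gene composition file is empty or missing, the gene hits can also be read straight from the "..._positional_genes.blastn" file. The blastn command in the error log writes it with -outfmt 6, which is BLAST's standard 12-column tab-separated layout (query ID first, subject ID second). The following is only a rough sketch, not part of Fast-Plast: the function name genes_per_contig is made up for illustration, and it assumes the query column holds the contig name and the subject column holds an entry from the Angiosperm_Chloroplast_Genes database, whose naming scheme is not shown in this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def genes_per_contig(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Group distinct subject IDs (assumed to be gene entries) by query ID.

    Expects BLAST tabular output (-outfmt 6): 12 tab-separated columns,
    with the query ID in column 1 and the subject ID in column 2.
    """
    hits: Dict[str, Set[str]] = defaultdict(set)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular record
        contig, subject = fields[0], fields[1]
        hits[contig].add(subject)
    return dict(hits)
```

To use it, open the .blastn file and pass the file handle, for example `genes_per_contig(open(path))`, then print the sorted subject IDs for each contig. A file of 0 bytes yields an empty dictionary, which matches the situation after the blastn segmentation fault.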

I would look in name_fastplast_afin_iter2 or the other progress output to see how many contigs were made. Based on the potential genomic structure of your taxon, you can figure out whether you have a complete or incomplete sequence.
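Counting the contigs can be done with a few lines of Python. This is a minimal sketch that assumes the afin output is an ordinary FASTA file; the function name contig_lengths is hypothetical and not part of Fast-Plast.

```python
from typing import Dict, Iterable


def contig_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {contig_name: sequence_length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current contig.
            lengths[name] += len(line)
    return lengths
```

Passing an open file handle for the afin FASTA returns one entry per contig, so `len(...)` of the result is the contig count and the values are the contig lengths. A single contig close to the plastome size expected for the lineage would suggest a complete sequence, while several shorter contigs would suggest an incomplete one.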

Best, Michael