PalMuc / TransPi

TransPi – a comprehensive TRanscriptome ANalysiS PIpeline for de novo transcriptome assembly

Error executing process > 'transdecoder_predict' #54

Closed by AlexGaithuma 1 year ago

AlexGaithuma commented 1 year ago

Hi, I can't complete the pipeline. It stops at transdecoder_predict with the error below.

executor >  local (27)
[15/32ba1b] process > fasqc (reads_R)                [100%] 1 of 1 ✔
[c5/32f96c] process > fastp (reads_R)                [100%] 1 of 1 ✔
[66/7a33a2] process > fastp_stats (reads_R)          [100%] 1 of 1 ✔
[e8/129b34] process > skip_rrna_removal (reads_R)    [100%] 1 of 1 ✔
[f5/a101c4] process > normalize_reads (reads_R)      [100%] 1 of 1 ✔
[87/72b21c] process > trinity_assembly (reads_R)     [100%] 1 of 1 ✔
[f9/8c4bb6] process > soap_assembly (reads_R)        [100%] 1 of 1 ✔
[df/5bc06b] process > velvet_oases_assembly (read... [100%] 1 of 1 ✔
[af/f7eb44] process > rna_spades_assembly (reads_R)  [100%] 1 of 1 ✔
[29/3cd2a4] process > transabyss_assembly (reads_R)  [100%] 1 of 1 ✔
[2d/50f97d] process > evigene (reads_R)              [100%] 1 of 1 ✔
[a2/351831] process > rna_quast (reads_R)            [100%] 1 of 1 ✔
[c2/20f289] process > mapping_evigene (reads_R)      [100%] 1 of 1 ✔
[0e/7dc7d3] process > busco4 (reads_R)               [100%] 1 of 1 ✔
[79/5b9fb2] process > mapping_trinity (reads_R)      [100%] 1 of 1 ✔
[d2/e106b8] process > summary_evigene_individual ... [100%] 1 of 1 ✔
[76/2e3e91] process > busco4_tri (reads_R)           [100%] 1 of 1 ✔
[93/ed6f05] process > skip_busco_dist (reads_R)      [100%] 1 of 1 ✔
[04/eea804] process > summary_busco4_individual (... [100%] 1 of 1 ✔
[4f/bc6544] process > get_busco4_comparison (read... [100%] 1 of 1 ✔
[45/016140] process > transdecoder_longorf (reads_R) [100%] 1 of 1 ✔
[f7/00b444] process > transdecoder_diamond (reads_R) [100%] 1 of 1 ✔
[1c/a4f2d9] process > transdecoder_hmmer (reads_R)   [100%] 1 of 1 ✔
[bf/38e7f7] process > transdecoder_predict (reads_R) [  0%] 0 of 1
[-        ] process > swiss_diamond_trinotate        -
[-        ] process > custom_diamond_trinotate       -
[-        ] process > hmmer_trinotate                -
[-        ] process > skip_signalP                   -
[-        ] process > skip_tmhmm                     -
[04/68d329] process > skip_rnammer (reads_R)         [100%] 1 of 1 ✔
[-        ] process > trinotate                      -
[-        ] process > get_GO_comparison              -
[-        ] process > summary_custom_uniprot         -
[-        ] process > skip_kegg                      -
[0e/d7d872] process > get_transcript_dist (reads_R)  [100%] 1 of 1 ✔
[-        ] process > summary_transdecoder_indivi... -
[-        ] process > summary_trinotate_individual   -
[-        ] process > get_report                     -
[f9/cefe7e] process > get_run_info                   [100%] 1 of 1 ✔
Error executing process > 'transdecoder_predict (reads_R)'

Caused by:
  Process `transdecoder_predict (reads_R)` terminated with an error exit status (1)

Command executed:

  ass=$( echo reads_R.combined.okay.fa reads_R.pfam.domtblout reads_R.diamond_blastp.outfmt6 | tr " " "\n" | grep -v ".diamond_blastp.outfmt6" | grep -v ".pfam.domtblout" | grep ".fa" )
  dia=$( echo reads_R.combined.okay.fa reads_R.pfam.domtblout reads_R.diamond_blastp.outfmt6 | tr " " "\n" | grep ".diamond_blastp.outfmt6" )
  pfa=$( echo reads_R.combined.okay.fa reads_R.pfam.domtblout reads_R.diamond_blastp.outfmt6 | tr " " "\n" | grep ".pfam.domtblout" )

  echo -e "\n-- TransDecoder.LongOrfs... --\n"

  TransDecoder.LongOrfs -t ${ass} --output_dir reads_R.transdecoder_dir -G Universal

  echo -e "\n-- Done with TransDecoder.LongOrfs --\n"

  echo -e "\n-- TransDecoder.Predict... --\n"

  TransDecoder.Predict -t ${ass} --retain_pfam_hits ${pfa} --retain_blastp_hits ${dia} --output_dir reads_R.transdecoder_dir -G Universal

  echo -e "\n-- Done with TransDecoder.Predict --\n"

  echo -e "\n-- Calculating statistics... --\n"
  #Calculate statistics of Transdecoder
  echo "- Transdecoder (long, with homology) stats for reads_R" >reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep -c ">" )
  echo -e "Total number of ORFs: $orfnum \n" >>reads_R_transdecoder.stats
  echo -e "\t Of these ORFs" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep ">" | grep -c "|" )
  echo -e "\t\t with annotations: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep ">" | grep -v "|" | grep -c ">" )
  echo -e "\t\t no annotation: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep -c "ORF type:complete" )
  echo -e "\t ORFs type=complete: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep "ORF type:complete" | grep -c "|" )
  echo -e "\t\t with annotations: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep -c "ORF type:5prime_partial" )
  echo -e "\t ORFs type=5prime_partial: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep "ORF type:5prime_partial" | grep -c "|" )
  echo -e "\t\t with annotations: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep -c "ORF type:3prime_partial" )
  echo -e "\t ORFs type=3prime_partial: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep "ORF type:3prime_partial" | grep -c "|" )
  echo -e "\t\t with annotations: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep -c "ORF type:internal" )
  echo -e "\t ORFs type=internal: $orfnum \n" >>reads_R_transdecoder.stats
  orfnum=$( cat reads_R*.transdecoder.pep | grep "ORF type:internal" | grep -c "|" )
  echo -e "\t\t with annotations: $orfnum \n" >>reads_R_transdecoder.stats
  # csv for report
  echo "Sample,Total_orf,orf_complete,orf_5prime_partial,orf_3prime_partial,orf_internal" >reads_R_transdecoder.csv
  total=$( cat reads_R*.transdecoder.pep  | grep -c ">" )
  complete=$( cat reads_R*.transdecoder.pep  | grep -c "ORF type:complete" )
  n5prime=$( cat reads_R*.transdecoder.pep  | grep -c "ORF type:5prime_partial" )
  n3prime=$( cat reads_R*.transdecoder.pep  | grep -c "ORF type:3prime_partial" )
  internal=$( cat reads_R*.transdecoder.pep  | grep -c "ORF type:internal" )
  echo "reads_R,${total},${complete},${n5prime},${n3prime},${internal}" >>reads_R_transdecoder.csv
  echo -e "\n-- Done with statistics --\n"

  mv ${ass} reads_R_assembly.fasta

  echo -e "\n-- DONE with TransDecoder --\n"

Command exit status:
  1

Command output:

  -- TransDecoder.LongOrfs... --

  CMD: touch reads_R.transdecoder_dir.__checkpoints_longorfs/TD.longorfs.ok

  -- Done with TransDecoder.LongOrfs --

  -- TransDecoder.Predict... --

  -- Done with TransDecoder.Predict --

  -- Calculating statistics... --

Command error:
  PFAM output found (reads_R.pfam.domtblout) and processing...

  * Running CMD: /usr/local/opt/transdecoder/util/train_start_PWM.pl --transcripts reads_R.combined.okay.fa --selected_orfs reads_R.transdecoder_dir/longest_orfs.cds.top_500_longest --out_prefix reads_R.transdecoder_dir/start_refinement
  Training start codon pattern recognition* Running CMD: /usr/local/opt/transdecoder/util/PWM/build_atgPWM_+-.pl  --transcripts reads_R.combined.okay.fa  --selected_orfs reads_R.transdecoder_dir/longest_orfs.cds.top_500_longest  --out_prefix reads_R.transdecoder_dir/start_refinement --pwm_left 20 --pwm_right 10 
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/feature_scoring.+-.pl  --features_plus reads_R.transdecoder_dir/start_refinement.+.features  --features_minus reads_R.transdecoder_dir/start_refinement.-.features  --atg_position 20  > reads_R.transdecoder_dir/start_refinement.feature.scores
  -round: 1
  -round: 2
  -round: 3
  -round: 4
  -round: 5
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/feature_scores_to_ROC.pl reads_R.transdecoder_dir/start_refinement.feature.scores > reads_R.transdecoder_dir/start_refinement.feature.scores.roc
  -parsing scores
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/plot_ROC.Rscript reads_R.transdecoder_dir/start_refinement.feature.scores.roc || :
  env: can't execute 'Rscript': No such file or directory
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/compute_AUC.pl reads_R.transdecoder_dir/start_refinement.feature.scores.roc
  Can't exec "Rscript": No such file or directory at /usr/local/opt/transdecoder/util/PWM/compute_AUC.pl line 82.
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/make_seqLogo.Rscript reads_R.transdecoder_dir/start_refinement.+.pwm || :
  env: can't execute 'Rscript': No such file or directory
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/make_seqLogo.Rscript reads_R.transdecoder_dir/start_refinement.-.pwm || :
  env: can't execute 'Rscript': No such file or directory
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/deplete_feature_noise.pl  --features_plus reads_R.transdecoder_dir/start_refinement.+.features  --pwm_minus reads_R.transdecoder_dir/start_refinement.-.pwm  --out_prefix reads_R.transdecoder_dir/start_refinement.enhanced
  num features: 25  num_incorporate: 7
  -num feature swaps: 0
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/feature_scoring.+-.pl  --features_plus reads_R.transdecoder_dir/start_refinement.enhanced.+.features  --features_minus reads_R.transdecoder_dir/start_refinement.-.features  --atg_position 20  > reads_R.transdecoder_dir/start_refinement.enhanced.feature.scores
  -round: 1
  -round: 2
  -round: 3
  -round: 4
  -round: 5
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/feature_scores_to_ROC.pl reads_R.transdecoder_dir/start_refinement.enhanced.feature.scores > reads_R.transdecoder_dir/start_refinement.enhanced.feature.scores.roc
  -parsing scores
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/plot_ROC.Rscript reads_R.transdecoder_dir/start_refinement.enhanced.feature.scores.roc || :
  env: can't execute 'Rscript': No such file or directory
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/compute_AUC.pl reads_R.transdecoder_dir/start_refinement.enhanced.feature.scores.roc
  Can't exec "Rscript": No such file or directory at /usr/local/opt/transdecoder/util/PWM/compute_AUC.pl line 82.
  * Running CMD: /usr/local/opt/transdecoder/util/PWM/make_seqLogo.Rscript reads_R.transdecoder_dir/start_refinement.enhanced.+.pwm || :
  env: can't execute 'Rscript': No such file or directory
  * Running CMD: /usr/local/opt/transdecoder/util/start_codon_refinement.pl --transcripts reads_R.combined.okay.fa --gff3_file reads_R.transdecoder_dir/longest_orfs.cds.best_candidates.gff3 --workdir reads_R.transdecoder_dir > reads_R.transdecoder_dir/longest_orfs.cds.best_candidates.gff3.revised_starts.gff3
  Refining start codon selections.
  -number of revised start positions: 1
  * Running CMD: cp reads_R.transdecoder_dir/longest_orfs.cds.best_candidates.gff3.revised_starts.gff3 reads_R.combined.okay.fa.transdecoder.gff3
  copying output to final output file: reads_R.combined.okay.fa.transdecoder.gff3* Running CMD: /usr/local/opt/transdecoder/util/gff3_file_to_bed.pl reads_R.combined.okay.fa.transdecoder.gff3 > reads_R.combined.okay.fa.transdecoder.bed
  Making bed file: reads_R.combined.okay.fa.transdecoder.bed
  * Running CMD: /usr/local/opt/transdecoder/util/gff3_file_to_proteins.pl --gff3 reads_R.combined.okay.fa.transdecoder.gff3 --fasta reads_R.combined.okay.fa  --genetic_code Universal > reads_R.combined.okay.fa.transdecoder.pep
  Making pep file: reads_R.combined.okay.fa.transdecoder.pep
  * Running CMD: /usr/local/opt/transdecoder/util/gff3_file_to_proteins.pl --gff3 reads_R.combined.okay.fa.transdecoder.gff3 --fasta reads_R.combined.okay.fa --seqType CDS  --genetic_code Universal > reads_R.combined.okay.fa.transdecoder.cds
  Making cds file: reads_R.combined.okay.fa.transdecoder.cds
  transdecoder is finished.  See output files reads_R.combined.okay.fa.transdecoder.*

Work dir:
  /scratch/user/akiarieg/work/bf/38e7f71b375f70383a303e6bcddb27

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

AlexGaithuma commented 1 year ago

This error was caused by a very small input dataset. I re-ran the pipeline with a larger dataset:

wget http://genomedata.org/rnaseq-tutorial/fasta/GRCh38/chr22_with_ERCC92.fa

and it worked fine.
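
One plausible reading of why a tiny dataset triggers exit status 1, not confirmed anywhere in this thread: the log shows TransDecoder itself finishing ("transdecoder is finished"), and the last line printed is "-- Calculating statistics... --". The statistics block uses `grep -c`, which prints 0 but exits with status 1 when nothing matches. If the process script runs with bash's `-e` option (an assumption about the Nextflow setup here), a zero count for any ORF type would abort the task. The sketch below reproduces that behaviour on a throwaway file; `demo.pep` and the variable names are hypothetical stand-ins, not part of TransPi.

```shell
#!/usr/bin/env bash
# Sketch: how `grep -c` behaves when there are zero matches.
# demo.pep is a hypothetical stand-in for reads_R*.transdecoder.pep.
tmpdir=$(mktemp -d)
pep="$tmpdir/demo.pep"
printf '>orf1 ORF type:complete\nMKT\n' > "$pep"

# grep -c prints the count (0) but returns exit status 1 on no match.
# Under `set -e`, an unguarded assignment like this would stop the script.
status=0
count=$(grep -c "ORF type:internal" "$pep") || status=$?
echo "count=$count status=$status"      # prints: count=0 status=1

# Guarded form: keep the printed count and ignore the non-zero status.
safe_count=$(grep -c "ORF type:internal" "$pep" || true)
echo "safe_count=$safe_count"           # prints: safe_count=0

rm -rf "$tmpdir"
```

With a larger dataset every ORF category is likely to have at least one hit, so each `grep -c` returns 0 as its exit status and the task completes, which would match the behaviour reported above.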