harta55 / EnTAP

EnTAP is moving to GitLab for future changes: https://gitlab.com/PlantGenomicsLab/EnTAP
https://entap.readthedocs.io/en/latest/
GNU General Public License v3.0

When adding a database for another run through, EnTAP doesn't like the headers it created #12

Closed by selveyad 3 years ago

selveyad commented 4 years ago

Howdy,

I am trying to add another database to a previously completed run, but EnTAP will not read the results from that run. It errors out after trying one transcript, with this message:

Error code: 140
Ensure the similarity searching finished properly and your output files are not empty.
Unable to find sequence in transcriptome: TRINITY_DN100586_c0_g1_i1.p1 from file: Vc_trinity//similarity_search/DIAMOND/blastp_Trinity_final_APD3_2020.out

Any suggestions on how I can keep my old results for the next run? I'm not sure whether the --no-trim flag would be appropriate here, seeing as EnTAP isn't using a grep-like match but wants the real deal, not a "trim" version.

command:

./EnTAP --runP -i Trinity.fasta.transdecoder.predict.all.cdhit3.pep -d entap_outfiles/bin/APD3_2020.dmnd -d entap_outfiles/bin/uniprot_trembl.dmnd -d entap_outfiles/bin/uniref90.dmnd -d entap_outfiles/bin/nr.dmnd --contam fungi --contam bacteria --contam viruses --taxon lepidoptera --out-dir Vc_trinity/ -t 10

grep searches for the long-lost hit:

grep "TRINITY_DN100586_c0_g1_i1.p1" Vc_trinity/transcriptomes/Trinity.fasta.transdecoder.predict.all.cdhit3.pep 
>**TRINITY_DN100586_c0_g1_i1.p1**TRINITY_DN100586_c0_g1~~**TRINITY_DN100586_c0_g1_i1.p1**ORFtype:completelen:119(+),score=-2.81,UniRef90_A0A2G9HV35|69.3|1.8e-37,Tryp_alpha_amyl|PF00234.23|7.6e-07,LTP_2|PF14368.7|0.0031,DUF1970|PF09301.11|0.038TRINITY_DN100586_c0_g1_i1:68-424(+)

grep "TRINITY_DN100586_c0_g1_i1.p1" Vc_trinity/transcriptomes/Trinity_final.fasta 
>**TRINITY_DN100586_c0_g1_i1.p1**TRINITY_DN100586_c0_g1~~**TRINITY_DN100586_c0_g1_i1.p1**ORFtype:completelen:119(+),score=-2.81,UniRef90_A0A2G9HV35|69.3|1.8e-37,Tryp_alpha_amyl|PF00234.23|7.6e-07,LTP_2|PF14368.7|0.0031,DUF1970|PF09301.11|0.038TRINITY_DN100586_c0_g1_i1:68-424(+)
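The grep hits above only show that the ID occurs somewhere in the header line. A rough way to check for an exact ID match instead is sketched below in Python. It rests on assumptions that are not confirmed anywhere in this thread: that the similarity search output is tab-separated with the query ID in the first column, and that the ID being looked up is the first whitespace-delimited token of each FASTA header. The function names and the sample IDs (`seqA.p1`, `geneA`, `subj1`) are made up for illustration, and this is not EnTAP's actual lookup code.

```python
import io


def fasta_ids(handle, first_token_only=True):
    """Collect sequence IDs from the '>' header lines of a FASTA file.

    With first_token_only=True the ID is the text up to the first
    whitespace (an assumed convention); otherwise the whole header is used.
    """
    ids = set()
    for line in handle:
        if line.startswith(">"):
            header = line[1:].strip()
            if first_token_only:
                parts = header.split(None, 1)
                header = parts[0] if parts else ""
            ids.add(header)
    return ids


def query_ids(handle):
    """Collect query IDs, assuming tab-separated hits with the query in column 1."""
    ids = set()
    for line in handle:
        line = line.rstrip("\n")
        if line and not line.startswith("#"):
            ids.add(line.split("\t", 1)[0])
    return ids


def missing_queries(fasta_handle, hits_handle, first_token_only=True):
    """Return the query IDs from the hits that have no exact match in the FASTA."""
    found = fasta_ids(fasta_handle, first_token_only)
    return sorted(query_ids(hits_handle) - found)


if __name__ == "__main__":
    # Hypothetical data: the same record with and without spaces in the header.
    spaced = ">seqA.p1 geneA~~seqA.p1 ORF type:complete\nMKV\n"
    collapsed = ">seqA.p1geneA~~seqA.p1ORFtype:complete\nMKV\n"
    hits = "seqA.p1\tsubj1\t69.3\n"

    print(missing_queries(io.StringIO(spaced), io.StringIO(hits)))
    print(missing_queries(io.StringIO(collapsed), io.StringIO(hits)))
```

With real files, the in-memory strings would be replaced by `open(...)` handles on the transcriptome and the similarity search output. If a header has no whitespace after the ID, as in the grep output above, a substring search still finds it while an exact comparison of this kind does not.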

Thanks for the help!

harta55 commented 4 years ago

Hey! Sorry for the delay. The newest version (0.10.2) will resolve this.

What does that sequence look like in your original transcriptome?