nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

AntiSMASH v6 Parsing Issue #549

Closed: hutchinsonmiri closed this issue 3 years ago

hutchinsonmiri commented 3 years ago

Are you using the latest release?
I am running funannotate in Conda and have updated to 1.8.0, but I see that 1.8.3 is the most recent version. If there is a way to force the update to 1.8.3, please let me know. Updating to 1.8.0 did not solve the issue. antiSMASH parsing was working fine prior to the antiSMASH v6 update.

Describe the bug
When running 'funannotate annotate', the antiSMASH gbk is not being recognized properly and no clusters, enzymes, or smCOGs are generated (the log reports "Found 0 clusters, 0 biosynthetic enyzmes, and 0 smCOGs predicted by antiSMASH"). I read other issues suggesting that this may have to do with partial gene annotations, so I removed the '<' and '>' symbols and re-ran, but this didn't help. This is the error that pops up:

Traceback (most recent call last):
  File "/home/miriamh/anaconda3/envs/funannotate_env/bin/funannotate", line 660, in <module>
    main()
  File "/home/miriamh/anaconda3/envs/funannotate_env/bin/funannotate", line 650, in main
    mod.main(arguments)
  File "/home/miriamh/anaconda3/envs/funannotate_env/lib/python2.7/site-packages/funannotate/annotate.py", line 1205, in main
    with open(mibig_blast, 'r') as input:
IOError: [Errno 2] No such file or directory: 'predictions_folder/annotate_misc/antismash/smcluster.MIBiG.blast.txt'
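For anyone reading the traceback above: the crash itself is just an attempt to open a BLAST result file that was never written. One plausible reading of the log (an assumption, not confirmed in this thread) is that with 0 clusters parsed there was nothing for the diamond step to write. The sketch below shows a defensive way to read such a file; `read_blast_hits` is a hypothetical helper written for illustration and is not part of funannotate.

```python
import os


def read_blast_hits(path):
    """Return the rows of a tab-separated BLAST outfmt 6 file.

    Hypothetical helper (not funannotate code): if the file does not
    exist, return an empty list instead of raising IOError.
    """
    if not os.path.isfile(path):
        return []
    hits = []
    with open(path, "r") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:  # skip blank lines
                hits.append(line.split("\t"))
    return hits
```

With this pattern a missing smcluster.MIBiG.blast.txt would show up as "no hits" rather than a traceback. The real problem of 0 clusters being parsed would still need fixing.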

What command did you issue?
funannotate annotate -i predictions_folder --cpus 12 --sbt ~/fungal_genomes/template.sbt

Logfiles
[02/10/21 12:08:09]: /home/miriamh/anaconda3/envs/funannotate_env/bin/funannotate annotate -i predictions_folder --cpus 12 --sbt /home/miriamh/fungal_genomes/template.sbt
[02/10/21 12:08:09]: OS: linux2, 32 cores, ~ 132 GB RAM. Python: 2.7.17
[02/10/21 12:08:09]: Running 1.8.0
[02/10/21 12:08:09]: Found existing output directory predictions_folder. Warning, will re-use any intermediate files found.
[02/10/21 12:08:09]: Parsing input files
[02/10/21 12:08:09]: Existing tbl found: predictions_folder/predict_results/Aspergillus_nidulans_C_geo.tbl
[02/10/21 12:08:21]: Adding Functional Annotation to Aspergillus nidulans, NCBI accession: None
[02/10/21 12:08:21]: Annotation consists of: 12,278 gene models
[02/10/21 12:08:21]: 12,232 protein records loaded
[02/10/21 12:08:22]: Existing Pfam-A results found: predictions_folder/annotate_misc/annotations.pfam.txt
[02/10/21 12:08:22]: 12,490 annotations added
[02/10/21 12:08:22]: Running Diamond blastp search of UniProt DB version 2020_03
[02/10/21 12:08:24]: 697 valid gene/product annotations from 1,005 total
[02/10/21 12:08:24]: Existing Eggnog-mapper results found: predictions_folder/annotate_misc/eggnog.emapper.annotations
[02/10/21 12:08:24]: Parsing EggNog Annotations
[02/10/21 12:08:24]: 6,193 COG and EggNog annotations added
[02/10/21 12:08:24]: Combining UniProt/EggNog gene and product names using Gene2Product version 1.62
[02/10/21 12:08:24]: 697 gene name and product description annotations added
[02/10/21 12:08:24]: Existing MEROPS results found: predictions_folder/annotate_misc/annotations.merops.txt
[02/10/21 12:08:24]: 304 annotations added
[02/10/21 12:08:24]: Existing CAZYme results found: predictions_folder/annotate_misc/annotations.dbCAN.txt
[02/10/21 12:08:24]: 466 annotations added
[02/10/21 12:08:24]: Existing BUSCO2 results found: predictions_folder/annotate_misc/annotations.busco.txt
[02/10/21 12:08:24]: 1,253 annotations added
[02/10/21 12:08:24]: Existing Phobius results found: predictions_folder/annotate_misc/phobius.results.txt
[02/10/21 12:08:24]: Existing SignalP results found: predictions_folder/annotate_misc/signalp.results.txt
[02/10/21 12:08:25]: 642 secretome and 2,366 transmembane annotations added
[02/10/21 12:08:25]: Now parsing antiSMASH v6 results, finding SM clusters
[02/10/21 12:08:31]: Found 0 clusters, 0 biosynthetic enyzmes, and 0 smCOGs predicted by antiSMASH
[02/10/21 12:08:31]: bedtools intersect -wo -a predictions_folder/annotate_misc/antismash/clusters.bed -b /home/miriamh/fungal_genomes/C_geophilum_careful/predictions_folder/predict_results/Aspergillus_nidulans_C_geo.gff3
[02/10/21 12:08:31]: Found 0 duplicated annotations, adding 66,668 valid annotations
[02/10/21 12:08:31]: Parsing tbl file: /home/miriamh/fungal_genomes/C_geophilum_careful/predictions_folder/annotate_misc/genome.tbl
[02/10/21 12:08:32]: Converting to final Genbank format, good luck!
[02/10/21 12:08:32]: /home/miriamh/anaconda3/envs/funannotate_env/bin/python /home/miriamh/anaconda3/envs/funannotate_env/lib/python2.7/site-packages/funannotate/aux_scripts/tbl2asn_parallel.py -i predictions_folder/annotate_misc/tbl2asn/genome.tbl -f predictions_folder/annotate_misc/tbl2asn/genome.fsa -o predictions_folder/annotate_misc/tbl2asn --sbt /home/miriamh/fungal_genomes/template.sbt -d discrepency.report.txt -s Aspergillus nidulans -t -l paired-ends -v 1 -c 12
[02/10/21 12:10:12]: Creating AGP file and corresponding contigs file
[02/10/21 12:10:12]: perl /home/miriamh/anaconda3/envs/funannotate_env/lib/python2.7/site-packages/funannotate/aux_scripts/fasta2agp.pl Aspergillus_nidulans_C_geo.scaffolds.fa
[02/10/21 12:10:14]: Cross referencing SM cluster hits with MIBiG database version 1.4
[02/10/21 12:10:14]: diamond blastp --sensitive --query predictions_folder/annotate_misc/antismash/smcluster.proteins.fasta --threads 12 --out predictions_folder/annotate_misc/antismash/smcluster.MIBiG.blast.txt --db /home/miriamh/funannotate_db/mibig.dmnd --max-hsps 1 --evalue 0.001 --max-target-seqs 1 --outfmt 6

OS/Install Information
YAML: 1.29
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/miriamh/funannotate_db
$PASAHOME=/home/miriamh/anaconda3/envs/funannotate_env/opt/pasa-2.4.1
$TRINITY_HOME=/home/miriamh/anaconda3/envs/funannotate_env/opt/trinity-2.8.5
$EVM_HOME=/home/miriamh/anaconda3/envs/funannotate_env/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/home/miriamh/anaconda3/envs/funannotate_env/config/
$GENEMARK_PATH=/home/miriamh/gmes_linux_64/
All 6 environmental variables are set

Checking external dependencies...
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.2
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v36
diamond: 0.9.24
emapper.py: 2.0.1b-2-g816e190
ete3: 3.1.1
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.0
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 1.8.0_152-release
kallisto: 0.46.0
mafft: v7.475 (2020/Nov/23)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.17-r941
proteinortho: 6.0.15
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.7
signalp: 4.1
snap: 2006-07-28
stringtie: 2.1.4
tRNAscan-SE: 2.0.7 (Oct 2020)
tantan: tantan 26
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: gmes_petap.pl not installed

hutchinsonmiri commented 3 years ago

Also, thank you in advance for creating such a great package!

hutchinsonmiri commented 3 years ago

I was able to fix the issue by updating directly from the GitHub repository:

python -m pip install git+https://github.com/nextgenusfs/funannotate.git

Apologies!!
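For anyone landing here with the same symptom, a quick sanity check is to compare the version printed in the log ("Running 1.8.0") against the release you expect. The sketch below is a minimal illustration that assumes plain dotted numeric versions such as 1.8.0 or 1.8.3. `version_tuple` and `needs_update` are hypothetical helper names, not part of funannotate or pip.

```python
def version_tuple(text):
    """Convert a dotted numeric version such as '1.8.3' into (1, 8, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, required="1.8.3"):
    """Return True if the installed version is older than the required one.

    Assumes both strings are purely numeric and dot-separated; suffixes
    such as 'rc1' or 'dev' are not handled in this sketch.
    """
    return version_tuple(installed) < version_tuple(required)
```

For example, `needs_update("1.8.0")` returns True, which matches the situation described in this issue, while `needs_update("1.8.3")` returns False.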