Open Rcperez opened 2 years ago
What version of InterProScan? And how did you run it? The error suggests the parser is failing on the very first line. Can you run head on the InterProScan XML file?
Thanks so much for your prompt reply. I opted to try funannotate annotate -m docker; it is still running.
version: InterProScan-5.53-87.0
iprscan.xml has one line, even though the below command returned no errors.
~/funannotate-docker iprscan -i fun/ -m local -c 6 --iprscan_path ~/my_interproscan/interproscan-5.53-87.0/interproscan.sh
iprscan.xml is located in annotate_misc.
head iprscan.xml returns:
< / protein-matches> (spaces inserted artificially to make text visible)
command using local Interproscan:
~/funannotate-docker iprscan -i fun/ -m local -c 6 --iprscan_path ~/my_interproscan/interproscan-5.53-87.0/interproscan.sh
output after ~6 hours using 8 core 32 GB, Debian 4.19.208-1 x86_64:
Running InterProScan5 on 17478 proteins
Important: you need to manually configure your interproscan.properties file for embedded workers.
Will try to launch 6 interproscan processes, adjust -c,--cpus for your system
InterProScan5 search has completed successfully!
Results are here: fun/annotate_misc/iprscan.xml
I'm not sure the iprscan script will work in the docker image as it is not installed in the container and won't have access to the rest of your system. You likely need to run it manually.
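As a rough illustration of what running it manually could look like, here is a minimal sketch that assembles an interproscan.sh command outside the container. It assumes the standard InterProScan 5 options -i (input FASTA), -f (output format), -o (output file) and -cpu. The protein FASTA path and the helper names are hypothetical and need to be adapted to your own run.

```python
import os
import shlex
import subprocess


def build_iprscan_cmd(interproscan_sh, proteins_fasta, out_xml, cpus=6):
    """Assemble an InterProScan 5 command line that writes XML output."""
    return [
        os.path.expanduser(interproscan_sh),
        "-i", proteins_fasta,   # protein FASTA to scan
        "-f", "XML",            # XML is the format funannotate annotate parses
        "-o", out_xml,          # explicit output file
        "-cpu", str(cpus),
    ]


def run_iprscan(cmd):
    """Run the command on the host; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_iprscan_cmd(
        "~/my_interproscan/interproscan-5.53-87.0/interproscan.sh",
        "fun/predict_results/fungus.proteins.fa",  # hypothetical path
        "fun/annotate_misc/iprscan.xml",
    )
    # Print the command for inspection; call run_iprscan(cmd) to execute it.
    print(shlex.join(cmd))
```

The resulting XML can then be handed to funannotate annotate with --iprscan, as in the second command in the original report.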
understood, thank you.
I have the same issue.
I have installed a local copy of InterProScan and have set the $INTERPROSH shell variable to the local interproscan.sh path.
If I understand you correctly, the local InterProScan still won't work despite these settings?
Are you using the latest release?
Using the latest docker image.
Describe the bug
Error parsing iprscan.xml while running the funannotate annotate command. The XML was generated by a local run of InterProScan5.
What command did you issue?
First I used:
~/funannotate-docker annotate -i fun/ --cpus 8
Then I used:
~/funannotate-docker annotate -i ~/fun/ --iprscan ~/fun/annotate_misc/iprscan.xml
Below was reported for both.
Logfiles
[Dec 30 11:46 PM]: OS: Debian GNU/Linux 10, 8 cores, ~ 33 GB RAM. Python: 3.8.12
[Dec 30 11:46 PM]: Running 1.8.10
[Dec 30 11:47 PM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Dec 30 11:47 PM]: Found existing output directory fun. Warning, will re-use any intermediate files found.
[Dec 30 11:47 PM]: Parsing input files
[Dec 30 11:47 PM]: Existing tbl found: fun/predict_results/fungus.tbl
[Dec 30 11:47 PM]: Adding Functional Annotation to fungus, NCBI accession: None
[Dec 30 11:47 PM]: Annotation consists of: 17,581 gene models
[Dec 30 11:47 PM]: 17,478 protein records loaded
[Dec 30 11:47 PM]: Existing Pfam-A results found: fun/annotate_misc/annotations.pfam.txt
[Dec 30 11:47 PM]: 16,481 annotations added
[Dec 30 11:47 PM]: Running Diamond blastp search of UniProt DB version 2021_04
[Dec 30 11:47 PM]: 1,112 valid gene/product annotations from 1,526 total
[Dec 30 11:47 PM]: Install eggnog-mapper or use webserver to improve functional annotation: https://github.com/jhcepas/eggnog-mapper
[Dec 30 11:47 PM]: No Eggnog-mapper results found.
[Dec 30 11:47 PM]: Combining UniProt/EggNog gene and product names using Gene2Product version 1.72
[Dec 30 11:47 PM]: 1,112 gene name and product description annotations added
[Dec 30 11:47 PM]: Existing MEROPS results found: fun/annotate_misc/annotations.merops.txt
[Dec 30 11:47 PM]: 489 annotations added
[Dec 30 11:47 PM]: Existing CAZYme results found: fun/annotate_misc/annotations.dbCAN.txt
[Dec 30 11:47 PM]: 373 annotations added
[Dec 30 11:47 PM]: Existing BUSCO2 results found: fun/annotate_misc/annotations.busco.txt
[Dec 30 11:47 PM]: 1,382 annotations added
[Dec 30 11:47 PM]: Existing Phobius results found: fun/annotate_misc/phobius.results.txt
[Dec 30 11:47 PM]: SignalP not installed, secretome prediction less accurate using only Phobius
[Dec 30 11:47 PM]: 1,050 secretome and 2,848 transmembane annotations added
[Dec 30 11:47 PM]: Parsing InterProScan5 XML file
[Dec 30 11:47 PM]: CMD ERROR: /venv/bin/python /venv/lib/python3.8/site-packages/funannotate/aux_scripts/iprscan2annotations.py fun/annotate_misc/iprscan.xml fun/annotate_misc/annotations.iprscan.txt
[Dec 30 11:47 PM]: Traceback (most recent call last):
  File "/venv/lib/python3.8/site-packages/funannotate/aux_scripts/iprscan2annotations.py", line 32, in <module>
    for _, elem in tree:
  File "/venv/lib/python3.8/xml/etree/ElementTree.py", line 1227, in iterator
    yield from pullparser.read_events()
  File "/venv/lib/python3.8/xml/etree/ElementTree.py", line 1302, in read_events
    raise event
  File "/venv/lib/python3.8/xml/etree/ElementTree.py", line 1274, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 1
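A ParseError at line 1, column 1 is what one would expect when the file contains nothing but a closing tag, which matches the head output reported in this thread. The following is a minimal sketch for checking an InterProScan XML file before passing it to funannotate annotate. It assumes result entries are elements whose local name is protein. The function names are hypothetical and not part of funannotate.

```python
import xml.etree.ElementTree as ET


def count_proteins(xml_path):
    """Stream through the XML and count <protein> elements.

    Raises xml.etree.ElementTree.ParseError if the file is not well-formed.
    """
    n = 0
    for _, elem in ET.iterparse(xml_path):
        # Drop any "{namespace}" prefix so only the local tag name is compared.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "protein":
            n += 1
            elem.clear()  # free memory on large result files
    return n


def is_usable_iprscan_xml(xml_path):
    """Return True only if the file parses and holds at least one protein."""
    try:
        return count_proteins(xml_path) > 0
    except ET.ParseError:
        return False


if __name__ == "__main__":
    import sys

    path = sys.argv[1] if len(sys.argv) > 1 else "fun/annotate_misc/iprscan.xml"
    print(path, "usable" if is_usable_iprscan_xml(path) else "NOT usable")
```

A file that fails this check most likely means the InterProScan run itself produced no results, regardless of the "completed successfully" message.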
OS/Install Information
Checking dependencies for 1.8.10
You are running Python v 3.8.12. Now checking python packages...
biopython: 1.77
goatools: 1.1.6
matplotlib: 3.5.1
natsort: 8.0.1
numpy: 1.21.4
pandas: 1.3.5
psutil: 5.8.0
requests: 2.26.0
scikit-learn: 1.0.1
scipy: 1.5.3
seaborn: 0.11.2
All 11 python packages installed
You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
ERROR: Bio::Perl not installed, install with cpanm Bio::Perl
ERROR: local::lib not installed, install with cpanm local::lib
Checking Environmental Variables...
$FUNANNOTATE_DB=/opt/databases
$PASAHOME=/venv/opt/pasa-2.4.1
$TRINITYHOME=/venv/opt/trinity-2.8.5
$EVM_HOME=/venv/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/venv/config
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
Checking external dependencies...
Traceback (most recent call last):
  File "/venv/bin/ete3", line 6, in <module>
    from ete3.tools.ete import main
  File "/venv/lib/python3.8/site-packages/ete3/tools/ete.py", line 55, in <module>
    from . import (ete_split, ete_expand, ete_annotate, ete_ncbiquery, ete_view,
  File "/venv/lib/python3.8/site-packages/ete3/tools/ete_view.py", line 48, in <module>
    from .. import (Tree, PhyloTree, TextFace, RectFace, faces, TreeStyle, CircleFace, AttrFace,
ImportError: cannot import name 'TextFace' from 'ete3' (/venv/lib/python3.8/site-packages/ete3/__init__.py)
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v36
diamond: 2.0.13
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.9.1-internal
kallisto: 0.46.1
mafft: v7.490 (2021/Oct/30)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.23-r1111
proteinortho: 6.0.16
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.12
snap: 2006-07-28
stringtie: 2.1.7
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 26
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: ete3 not installed
ERROR: gmes_petap.pl not installed
ERROR: pigz not installed
ERROR: signalp not installed