Closed athulmenon closed 3 years ago
Sorry -- that was a dumb mistake. Image is rebuilding now, it should be ready in ~ 1 hour.
Okay, it should be up on Docker Hub now; pull to get the update.
Hi,
Thanks for the fix. The test ran successfully. But when I used my own data, an antiSMASH error popped up. I am not sure whether it is an issue with the update or with my antiSMASH results. The antiSMASH results were generated using the funannotate "remote" module. Can you please check?
[Jan 29 04:37 PM]: OS: Debian GNU/Linux 10, 12 cores, ~ 74 GB RAM. Python: 3.7.9
[Jan 29 04:37 PM]: Running 1.8.4
[Jan 29 04:37 PM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Jan 29 04:37 PM]: Checking GenBank file for annotation
Skipped 11 annotations: 11 pseudo genes; 0 no CDS; 0 duplicated features
[Jan 29 04:38 PM]: Adding Functional Annotation to Fusarium graminearum PH-1, NCBI accession: WGS:DS23
[Jan 29 04:38 PM]: Annotation consists of: 13,726 gene models
[Jan 29 04:38 PM]: 13,313 protein records loaded
[Jan 29 04:38 PM]: Running HMMer search of PFAM version 33.1
[Jan 29 04:46 PM]: 13,595 annotations added
[Jan 29 04:46 PM]: Running Diamond blastp search of UniProt DB version 2020_06
[Jan 29 04:48 PM]: 1,002 valid gene/product annotations from 1,377 total
[Jan 29 04:48 PM]: Existing Eggnog-mapper results found: fusarium_graminearum_annotation/annotate_misc/eggnog.emapper.annotations
[Jan 29 04:48 PM]: Parsing EggNog Annotations
[Jan 29 04:48 PM]: 7,569 COG and EggNog annotations added
[Jan 29 04:48 PM]: Combining UniProt/EggNog gene and product names using Gene2Product version 1.65
[Jan 29 04:48 PM]: 1,002 gene name and product description annotations added
[Jan 29 04:48 PM]: Running Diamond blastp search of MEROPS version 12.0
[Jan 29 04:48 PM]: 404 annotations added
[Jan 29 04:48 PM]: Annotating CAZYmes using HMMer search of dbCAN version 9.0
[Jan 29 04:50 PM]: 523 annotations added
[Jan 29 04:50 PM]: Annotating proteins with BUSCO dikarya models
[Jan 29 04:52 PM]: 1,272 annotations added
[Jan 29 04:52 PM]: Skipping phobius predictions, try funannotate remote -m phobius
[Jan 29 04:52 PM]: Skipping secretome: neither SignalP nor Phobius searches were run
[Jan 29 04:52 PM]: 0 secretome and 0 transmembane annotations added
[Jan 29 04:53 PM]: Parsing InterProScan5 XML file
[Jan 29 04:56 PM]: Now parsing antiSMASH v6 results, finding SM clusters
Traceback (most recent call last):
File "/venv/bin/funannotate", line 713, in
Thank you. Athul
Would you be able to send me a link to the antiSMASH results or the antiSMASH GBK file? This seems related to how it expects the FASTA headers to be named. What is your header naming scheme?
Please find the .gbk file from the link. https://drive.google.com/file/d/1XKLk5Q615Iptk5EeJFhg0aUH9djy3bk0/view?usp=sharing
The fasta header naming inside antismash results folder is ">FGSG_00001-T1 FGSG_00001"
The fasta header of protein sequence downloaded from NCBI is ">XP_011315562.1 hypothetical protein FGSG_11579 [Fusarium graminearum PH-1]" Hope this helps.
Thanks -- it's due to the NCBI scaffold names. Did you also get a link to the web results from antiSMASH? I'd like to see how they are enumerating the clusters; it might have changed since this is now a new version (6). Previously, if your contig was named `chr1`, then the first cluster would have been named `1.1`, the second `1.2`, etc. Since Biopython is apparently parsing the record.id for this scaffold as `NC_026474.1`, it's choking on the `.1`. So if I can see how antiSMASH is presenting this, I can do the same.
So if I simply strip off the `.1`, which in these old assemblies corresponds to a version number, you'd get a result like this:
NC_026474.1 160289 217648 Cluster_26474.1 0 +
NC_026474.1 1436186 1451304 Cluster_26474.2 0 +
NC_026474.1 5548489 5588101 Cluster_26474.3 0 +
NC_026474.1 5573182 5609251 Cluster_26474.4 0 +
NC_026474.1 5856809 5873522 Cluster_26474.5 0 +
NC_026474.1 5861438 5902658 Cluster_26474.6 0 +
NC_026474.1 7285342 7327399 Cluster_26474.7 0 +
NC_026474.1 7452834 7514009 Cluster_26474.8 0 +
NC_026474.1 7503245 7544494 Cluster_26474.9 0 +
NC_026474.1 7683951 7728756 Cluster_26474.10 0 +
NC_026474.1 7695583 7746953 Cluster_26474.11 0 +
NC_026474.1 9783754 9803449 Cluster_26474.12 0 +
NC_026474.1 10707528 10728727 Cluster_26474.13 0 +
NC_026474.1 10863841 10906818 Cluster_26474.14 0 +
NC_026474.1 10875350 10920861 Cluster_26474.15 0 +
NC_026474.1 11089281 11130019 Cluster_26474.16 0 +
NC_026474.1 11159537 11224335 Cluster_26474.17 0 +
NC_026474.1 11196940 11244849 Cluster_26474.18 0 +
NC_026474.1 11406232 11449541 Cluster_26474.19 0 +
NC_026475.1 573352 613635 Cluster_26475.1 0 +
NC_026475.1 2059829 2101313 Cluster_26475.2 0 +
NC_026475.1 2245996 2280024 Cluster_26475.3 0 +
NC_026475.1 2561027 2612669 Cluster_26475.4 0 +
NC_026475.1 2575165 2621917 Cluster_26475.5 0 +
NC_026475.1 2644785 2686385 Cluster_26475.6 0 +
NC_026475.1 2669744 2686385 Cluster_26475.7 0 +
NC_026475.1 3484779 3505484 Cluster_26475.8 0 +
NC_026475.1 3538428 3582546 Cluster_26475.9 0 +
NC_026475.1 3542871 3555980 Cluster_26475.10 0 +
NC_026475.1 5508387 5553529 Cluster_26475.11 0 +
NC_026475.1 6060982 6107254 Cluster_26475.12 0 +
NC_026475.1 6097187 6140276 Cluster_26475.13 0 +
NC_026475.1 6638479 6658431 Cluster_26475.14 0 +
NC_026475.1 6753340 6773689 Cluster_26475.15 0 +
NC_026475.1 7152432 7200957 Cluster_26475.16 0 +
NC_026475.1 7380657 7423933 Cluster_26475.17 0 +
NC_026475.1 7915743 7937760 Cluster_26475.18 0 +
NC_026476.1 9801 45452 Cluster_26476.1 0 +
NC_026476.1 2024300 2076412 Cluster_26476.2 0 +
NC_026476.1 3384770 3434831 Cluster_26476.3 0 +
NC_026476.1 4098102 4136177 Cluster_26476.4 0 +
NC_026476.1 5906380 5956888 Cluster_26476.5 0 +
NC_026476.1 6019166 6099885 Cluster_26476.6 0 +
NC_026476.1 7255232 7302481 Cluster_26476.7 0 +
NC_026476.1 7456514 7478171 Cluster_26476.8 0 +
NC_026476.1 7682802 7694587 Cluster_26476.9 0 +
NC_026477.1 49741 70044 Cluster_26477.1 0 +
NC_026477.1 86520 130134 Cluster_26477.2 0 +
NC_026477.1 260243 300046 Cluster_26477.3 0 +
NC_026477.1 630873 663475 Cluster_26477.4 0 +
NC_026477.1 4151999 4171516 Cluster_26477.5 0 +
NC_026477.1 4497434 4549293 Cluster_26477.6 0 +
NC_026477.1 4498286 4546990 Cluster_26477.7 0 +
NC_026477.1 6668266 6689813 Cluster_26477.8 0 +
NC_026477.1 7275310 7321978 Cluster_26477.9 0 +
But I'm not sure whether this is how the cluster names appear in the antiSMASH v6 HTML output, i.e. is `Cluster_26477.9` what the last cluster is called?
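The naming described above (drop the trailing version suffix from the scaffold ID, then number the clusters per scaffold) can be sketched in a few lines. This is only an illustration, not the funannotate implementation: the function name `cluster_name` is hypothetical, and the rule of taking the trailing digits of the accession is an assumption inferred from the example output in this thread.

```python
import re

def cluster_name(record_id: str, index: int) -> str:
    """Build a label like 'Cluster_26474.3' from a scaffold ID (hypothetical helper)."""
    # Drop an NCBI-style version suffix if present: 'NC_026474.1' -> 'NC_026474'.
    if re.search(r"\.\d+$", record_id):
        accession = record_id.rsplit(".", 1)[0]
    else:
        accession = record_id
    # Assumption: the scaffold number is the run of digits at the end of the accession.
    match = re.search(r"(\d+)$", accession)
    if match is None:
        raise ValueError(f"no trailing number in scaffold ID: {record_id!r}")
    scaffold_number = int(match.group(1))  # int() drops leading zeros
    return f"Cluster_{scaffold_number}.{index}"

print(cluster_name("NC_026474.1", 1))
print(cluster_name("chr1", 2))
```

A scaffold ID without any trailing digits raises a `ValueError` here rather than guessing a name, which makes unexpected header schemes visible early.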
Hi Jon, Thanks for looking into the issue. Please find the link to antismash html output. https://fungismash.secondarymetabolites.org/upload/fungi-69ba57f6-ccf0-4f8f-b3d6-d306e2ac70a7/index.html
Hope this helps. Athul
Hi Jon, Please let me know when the issue is fixed and I can pull the updated image. I will run it and let you know.
I have one more query: I want to add SignalP to my annotation. How can I include it in the present Docker wrapper? Thanks for the support. Athul
It should be up now. Because antiSMASH has changed how results are displayed on the website (regions that are composed of multiple clusters), I'm no longer going to try to match the names to that result. But I think the parsing error is fixed.
You will need to create a new Docker image and install SignalP in it; I can't include it for licensing reasons.
Hi Jon,
Thanks for the fix. It worked without any errors. I then ran the compare module with the .gbk files. It completed without errors, but there are some warnings that I would like to bring to your attention.
[Feb 03 06:41 PM]: OS: Debian GNU/Linux 10, 12 cores, ~ 74 GB RAM. Python: 3.7.9
[Feb 03 06:41 PM]: Running 1.8.4
[Feb 03 06:41 PM]: Now parsing 2 genomes
[Feb 03 06:41 PM]: working on Fusarium equiseti
[Feb 03 06:42 PM]: working on Fusarium oxysporum f. sp. lycopersici 4287
[Feb 03 06:42 PM]: Summarizing secondary metabolism gene clusters
[Feb 03 06:43 PM]: Summarizing PFAM domain results
[Feb 03 06:43 PM]: Summarizing InterProScan results
[Feb 03 06:43 PM]: Loading InterPro descriptions
[Feb 03 06:43 PM]: Summarizing MEROPS protease results
[Feb 03 06:43 PM]: found 41/96 MEROPS familes with stdev >= 1.000000
/venv/lib/python3.7/site-packages/funannotate/library.py:7865: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
cbar_ax = fig.add_axes(shrink=0.4)
[Feb 03 06:43 PM]: Summarizing CAZyme results
[Feb 03 06:43 PM]: found 59/144 CAZy familes with stdev >= 1.000000
/venv/lib/python3.7/site-packages/funannotate/library.py:7865: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
cbar_ax = fig.add_axes(shrink=0.4)
[Feb 03 06:43 PM]: No COG annotations found
[Feb 03 06:43 PM]: No SignalP annotations found
[Feb 03 06:43 PM]: Summarizing fungal transcription factors
/venv/lib/python3.7/site-packages/funannotate/library.py:7865: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead.
cbar_ax = fig.add_axes(shrink=0.4)
[Feb 03 06:43 PM]: Running GO enrichment for each genome
WARNING: skipping Fusarium_oxysporum_f._sp._lycopersici_4287.txt as no GO terms
/venv/lib/python3.7/site-packages/funannotate/compare.py:803: FutureWarning: The default value of regex will change from True to False in a future version.
df.columns = df.columns.str.replace(r'^# ', '')
[Feb 03 06:45 PM]: Running orthologous clustering tool, ProteinOrtho. This may take awhile...
[Feb 03 06:51 PM]: Compiling all annotations for each genome
[Feb 03 06:51 PM]: Skipping RAxML phylogeny as at least 4 taxa are required
[Feb 03 06:51 PM]: Compressing results to output file: compare_out.tar.gz
[Feb 03 06:52 PM]: Funannotate compare completed successfully!
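The `FutureWarning` in the log above comes from a line that strips a leading `# ` from column names using a regular expression without saying so explicitly. As a rough illustration of what that line is meant to do, here is a standard-library equivalent. The helper name `strip_hash_prefix` and the sample column names are made up for this sketch; it is not the funannotate code.

```python
import re

def strip_hash_prefix(columns):
    """Remove a leading '# ' from each column name (hypothetical helper)."""
    # The pattern is anchored, so only a '# ' at the very start is removed.
    return [re.sub(r"^# ", "", name) for name in columns]

# In pandas, the same intent can be stated explicitly, which avoids the
# warning: df.columns.str.replace(r'^# ', '', regex=True)
print(strip_hash_prefix(["# GO_term", "count", "a # b"]))
```

Because the pattern relies on the `^` anchor, it has to be treated as a regular expression; with a literal (non-regex) replacement the prefix would no longer be stripped, so passing `regex=True` keeps the current behavior.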
Thanks for the support. Regards, Athul
Hi,
I am running the latest Docker version of funannotate. I ran the annotate module some days back without any error. Since then I have pulled the latest Docker image, after one of your fixes. Now when I try to run the annotate module, the error below appears. I also tried the test module and the same error persists. Can you please let me know how to fix it?
`./funannotate-docker test -t annotate --cpus 6
#########################################################
Running funannotate annotate unit testing
Downloading: https://osf.io/97pyn/download?version=1
Bytes: 341476
CMD: funannotate annotate --genbank Genome_one.gbk -o annotate --cpus 6 --iprscan genome_one.iprscan.xml --eggnog genome_one.emapper.annotations
#########################################################
Traceback (most recent call last):
File "/venv/bin/funannotate", line 713, in
main()
File "/venv/bin/funannotate", line 703, in main
mod.main(arguments)
File "/venv/lib/python3.7/site-packages/funannotate/annotate.py", line 316, in main
help='Annotated if genome not masked and skip bad contigs')
File "/venv/lib/python3.7/argparse.py", line 1373, in add_argument
return self._add_action(action)
File "/venv/lib/python3.7/argparse.py", line 1736, in _add_action
self._optionals._add_action(action)
File "/venv/lib/python3.7/argparse.py", line 1577, in _add_action
action = super(_ArgumentGroup, self)._add_action(action)
File "/venv/lib/python3.7/argparse.py", line 1387, in _add_action
self._check_conflict(action)
File "/venv/lib/python3.7/argparse.py", line 1526, in _check_conflict
conflict_handler(action, confl_optionals)
File "/venv/lib/python3.7/argparse.py", line 1535, in _handle_conflict_error
raise ArgumentError(action, message % conflict_string)
argparse.ArgumentError: argument --force: conflicting option string: --force
#########################################################
ERROR: funannotate annotate test failed - check logfiles
#########################################################`
OS
Checking dependencies for 1.8.4
You are running Python v 3.7.9. Now checking python packages...
biopython: 1.78
goatools: 1.0.15
matplotlib: 3.3.3
natsort: 7.1.0
numpy: 1.19.5
pandas: 1.2.1
psutil: 5.8.0
requests: 2.25.1
scikit-learn: 0.24.1
scipy: 1.5.3
seaborn: 0.11.1
All 11 python packages installed
You are running Perl v b'5.026002'. Now checking perl modules...
Bio::Perl: 1.007002
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed
Checking Environmental Variables...
$FUNANNOTATE_DB=/opt/databases
$PASAHOME=/venv/opt/pasa-2.4.1
$TRINITYHOME=/venv/opt/trinity-2.8.5
$EVM_HOME=/venv/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/venv/config
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
Checking external dependencies...
Traceback (most recent call last):
File "/venv/bin/ete3", line 6, in
from ete3.tools.ete import main
File "/venv/lib/python3.7/site-packages/ete3/tools/ete.py", line 55, in
from . import (ete_split, ete_expand, ete_annotate, ete_ncbiquery, ete_view,
File "/venv/lib/python3.7/site-packages/ete3/tools/ete_view.py", line 48, in
from .. import (Tree, PhyloTree, TextFace, RectFace, faces, TreeStyle, CircleFace, AttrFace,
ImportError: cannot import name 'TextFace' from 'ete3' (/venv/lib/python3.7/site-packages/ete3/__init__.py)
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v36
diamond: 2.0.6
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
hisat2: 2.2.1
hmmscan: HMMER 3.3.1 (Jul 2020)
hmmsearch: HMMER 3.3.1 (Jul 2020)
java: 11.0.8-internal
kallisto: 0.46.1
mafft: v7.475 (2020/Nov/23)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.17-r941
proteinortho: 6.0.16
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.10
snap: 2006-07-28
stringtie: 2.1.4
tRNAscan-SE: 2.0.7 (Oct 2020)
tantan: tantan 13
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: ete3 not installed
ERROR: gmes_petap.pl not installed
ERROR: signalp not installed
Thanks for the tool. Athul
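The `argparse.ArgumentError` in the traceback above is what Python raises when the same option string is registered twice on one parser, which suggests that `--force` ended up defined more than once in the annotate argument list. A minimal reproduction using only the standard library is sketched below. It is not the funannotate source: the `build_parser` helper, the `demo` program name, and the help texts are hypothetical.

```python
import argparse

def build_parser(register_force_twice: bool) -> argparse.ArgumentParser:
    """Build a small parser, optionally registering --force a second time."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--force", action="store_true", help="first definition")
    if register_force_twice:
        # Adding the same option string again triggers argparse's conflict check.
        parser.add_argument("--force", action="store_true", help="second definition")
    return parser

try:
    build_parser(register_force_twice=True)
except argparse.ArgumentError as err:
    print(err)

# With a single definition, the parser is built and the flag parses normally.
args = build_parser(register_force_twice=False).parse_args(["--force"])
print(args.force)
```

The error is raised while the parser is being built, before any command-line arguments are read, which would explain why both the test module and a real annotate run fail in the same way.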