Open EmilieSmeets22 opened 7 months ago
The tool/pipeline you ran does not originate from the https://github.com/NBISweden/GAAS repository but from this repository (https://github.com/NBISweden/pipelines-nextflow/). I transferred your issue here so that it can be better monitored.
Looking at your command for running InterProScan manually, it is not the same as the command run in the workflow. The command run in the workflow should look like:
interproscan.sh \
-cpu 20 \
-i ${fasta_name} \
-f tsv \
-dp \
--iprlookup --goterms -t p -dra -appl TIGRFAM,FunFam,SFLD,PANTHER,Gene3D,Hamap,Coils,SMART,CDD,PRINTS,PIRSR,AntiFam,Pfam \
-o ${prefix}.tsv
The differences in output are likely due to the differences in options.
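One way to see exactly which hits differ between two runs is to compare the TSV outputs directly. The following is a minimal sketch, assuming the standard InterProScan TSV layout (column 1 = protein ID, column 4 = analysis, column 5 = signature accession); the function names `ipr_hits` and `ipr_diff` are hypothetical helpers, not part of InterProScan or the workflow.

```shell
# Sketch: list hits that appear in only one of two InterProScan TSV files.
# Assumption: standard TSV layout (col 1 = protein ID, col 4 = analysis,
# col 5 = signature accession). Function names are illustrative only.

# Print the unique (protein, analysis, signature) triples of one TSV, sorted.
ipr_hits() {
    cut -f1,4,5 "$1" | LC_ALL=C sort -u
}

# Print triples found only in the first file (marked "<") or only in the
# second file (marked ">"), one tab-separated triple per line.
ipr_diff() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    ipr_hits "$1" > "$tmp_a"
    ipr_hits "$2" > "$tmp_b"
    # comm -3 drops lines common to both files; lines unique to the second
    # file are emitted with one leading tab, i.e. an empty first field.
    LC_ALL=C comm -3 "$tmp_a" "$tmp_b" |
        awk -F'\t' -v OFS='\t' '
            $1 == "" { print ">", $2, $3, $4; next }
                     { print "<", $1, $2, $3 }'
    rm -f "$tmp_a" "$tmp_b"
}
```

Called as `ipr_diff run_a.tsv run_b.tsv` (placeholder file names), it prints nothing when both runs report the same set of signatures per protein.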
I apologize for the delay; I had an issue with my local installation of InterProScan, but it is fixed now.
As you advised, I re-ran InterProScan manually with the additional parameters. As expected, we can now see these additional annotations in the GFF gene entries. However, this does not answer my original question regarding the missing InterProScan hits.
## Query with usual parameters:
# Run Interproscan
[doutree@plop] ~ $ interproscan.sh -i input/Daucus_carota.gene_chr_AGAT_chr01_proteins.fasta -f TSV -o output/Daucus_carota.gene_chr_prot.fasta_interpro.tsv
# Add annotations to GFF file
[doutree@plop] ~ $ ipr_update_gff input/Daucus_carota.gene_chr_AGAT_chr01.gff output/Daucus_carota.gene_chr_prot.fasta_interpro.tsv > output/Daucus_carota.gene_chr_prot.fasta_IPS.gff
# Output GFF file
[doutree@plop] ~ $ head output/Daucus_carota.gene_chr_prot.fasta_IPS.gff
##gff-version 3
chr01 maker gene 24795 31012 . - . ID=DcarChr1G00000010;Dbxref=InterPro:IPR007271,PANTHER:PTHR10231,PFAM:PF04142;
chr01 maker mRNA 24795 31012 . - . ID=DcarChr1G00000010.1;Parent=DcarChr1G00000010;Dbxref=InterPro:IPR007271,PANTHER:PTHR10231,PFAM:PF04142;
chr01 maker exon 24795 24945 . - . ID=nbis-exon-1;Parent=DcarChr1G00000010.1
chr01 maker exon 26435 26604 . - . ID=nbis-exon-2;Parent=DcarChr1G00000010.1
chr01 maker exon 27851 27929 . - . ID=nbis-exon-3;Parent=DcarChr1G00000010.1
chr01 maker exon 28302 28423 . - . ID=nbis-exon-4;Parent=DcarChr1G00000010.1
chr01 maker exon 30953 31012 . - . ID=nbis-exon-5;Parent=DcarChr1G00000010.1
chr01 maker CDS 24795 24945 . - 1 ID=cds-5;Parent=DcarChr1G00000010.1
chr01 maker CDS 26435 26604 . - 0 ID=cds-4;Parent=DcarChr1G00000010.1
## With additional parameters
# Run Interproscan
[doutree@plop] ~ $ interproscan.sh -cpu 20 -i input/Daucus_carota.gene_chr_AGAT_chr01_proteins.fasta -f TSV -dp --iprlookup --goterms -t p -dra -appl TIGRFAM,FunFam,SFLD,PANTHER,Gene3D,Hamap,Coils,SMART,CDD,PRINTS,PIRSR,AntiFam,Pfam -o output/Daucus_carota.gene_chr_prot.fasta_interpro_updateParam.tsv
# Add annotations to GFF file
[doutree@plop] ~ $ ipr_update_gff input/Daucus_carota.gene_chr_AGAT_chr01.gff output/Daucus_carota.gene_chr_prot.fasta_interpro_updateParam.tsv > output/Daucus_carota.gene_chr_prot.fasta_IPS_updateParam.gff
# Output GFF file
[doutree@plop] ~ $ head output/Daucus_carota.gene_chr_prot.fasta_IPS_updateParam.gff
##gff-version 3
chr01 maker gene 24795 31012 . - . ID=DcarChr1G00000010;Dbxref=InterPro:IPR007271,PANTHER:PTHR10231,PFAM:PF04142;Ontology_term=GO:0000139,GO:0015136,GO:0015165,GO:0016020,GO:0030173,GO:0090481;
chr01 maker mRNA 24795 31012 . - . ID=DcarChr1G00000010.1;Parent=DcarChr1G00000010;Dbxref=InterPro:IPR007271,PANTHER:PTHR10231,PFAM:PF04142;Ontology_term=GO:0000139,GO:0015136,GO:0015165,GO:0016020,GO:0030173,GO:0090481;
chr01 maker exon 24795 24945 . - . ID=nbis-exon-1;Parent=DcarChr1G00000010.1
chr01 maker exon 26435 26604 . - . ID=nbis-exon-2;Parent=DcarChr1G00000010.1
chr01 maker exon 27851 27929 . - . ID=nbis-exon-3;Parent=DcarChr1G00000010.1
chr01 maker exon 28302 28423 . - . ID=nbis-exon-4;Parent=DcarChr1G00000010.1
chr01 maker exon 30953 31012 . - . ID=nbis-exon-5;Parent=DcarChr1G00000010.1
chr01 maker CDS 24795 24945 . - 1 ID=cds-5;Parent=DcarChr1G00000010.1
chr01 maker CDS 26435 26604 . - 0 ID=cds-4;Parent=DcarChr1G00000010.1
My concern is regarding the potential differences in sensitivity or quality threshold between the Interproscan query run within GAAS and manual queries. My worry is that these differences may result in missing functional annotations. For instance, in this particular example, the manual queries of Interproscan return hits on the first gene:
chr01 maker gene 24795 31012 . - . ID=DcarChr1G00000010;Dbxref=InterPro:IPR007271
Which are not returned by the Interproscan query of GAAS:
chr01 maker gene 24795 31012 . - . ID=DcarChr1G00000000001;Name=CSTLP1;makerName=DcarChr1G00000010
chr01 maker mRNA 24795 31012 . - . ID=DcarChr1M00000000001;Parent=DcarChr1G00000000001;Name=CSTLP1;makerName=DcarChr1G00000010.1;product=CMP-sialic acid transporter 1;uniprot_id=Q654D9
While I appreciate the advantages of using GAAS as an automated tool, I am also cautious about compromising the quality of the annotation. If there is a parameter that can be adjusted to ensure consistent annotations, I would greatly appreciate your guidance in that regard.
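To check systematically which genes are missing Dbxref entries, the gene-level attributes of the two GFF files can be tabulated and compared. This is only a sketch: it assumes tab-separated GFF3 with attributes in column 9, and that the workflow output keeps the original identifier in a `makerName` attribute, as in the example lines above. The function name `gene_dbxref` is hypothetical.

```shell
# Sketch: print "gene key<TAB>Dbxref" for every gene feature in a GFF3 file.
# Assumptions: tab-separated GFF3; attributes in column 9; when a makerName
# attribute is present it holds the original gene ID and is used as the key.
gene_dbxref() {
    awk -F'\t' '$3 == "gene" {
        id = ""; maker = ""; dbxref = "-"      # "-" marks a missing Dbxref
        n = split($9, attrs, ";")
        for (i = 1; i <= n; i++) {
            if (attrs[i] ~ /^ID=/)        id     = substr(attrs[i], 4)
            if (attrs[i] ~ /^makerName=/) maker  = substr(attrs[i], 11)
            if (attrs[i] ~ /^Dbxref=/)    dbxref = substr(attrs[i], 8)
        }
        key = (maker != "" ? maker : id)
        print key "\t" dbxref
    }' "$1"
}
```

Running it on both GFF files and passing the two outputs to `diff` lists the genes whose Dbxref values differ between the manual run and the workflow run.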
Thank you in advance Emilie
Perhaps @Juke34 has a better understanding, but I'm confused here. The functional annotation workflow doesn't use GAAS. I'm not familiar with how you've reached this step, so I'm not sure where GAAS has come into it.
I am sorry if I was not clear: my issue concerns the way InterProScan is run via GAAS. I ran InterProScan manually with seemingly the same parameters, but I get different results/hits. As an example, I also ran the same query for one gene sequence on the InterProScan website; those results also differ from the hits returned by GAAS. The results are in the first comment. Thank you in advance.
GAAS doesn't run InterProScan, though. You mean this Nextflow workflow, right (which isn't GAAS, although some subworkflows use scripts from the GAAS package)?
Since you're using conda, are you sure the database versions in your local run are the same as the databases in the conda package?
I apologize, you are right. I ran the functional annotation step of the Nextflow workflow: https://github.com/NBISweden/pipelines-nextflow/blob/master/subworkflows/functional_annotation/README.md
I can see which version of InterProScan is being run, but I am not sure how to find the database versions, either for my local installation or for the Nextflow run. Both instances are run via conda. I checked the InterProScan website, and there again I cannot see any database version information: https://www.ebi.ac.uk/interpro/search/sequence/
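One possible way to inspect the member database versions of a local installation is to list the InterProScan data directory. The sketch below assumes that the data directory is organised as `<data_dir>/<database>/<version>/`, which should be verified against the actual installation; the function name `list_db_versions` is hypothetical.

```shell
# Sketch: print "database<TAB>version" for each <data_dir>/<database>/<version>/.
# Assumption: the InterProScan data directory contains one sub-directory per
# member database, each holding one sub-directory per installed release.
list_db_versions() {
    data_dir=${1%/}                        # drop a trailing slash, if any
    for dir in "$data_dir"/*/*/; do
        [ -d "$dir" ] || continue          # glob matched nothing
        version=$(basename "$dir")
        db=$(basename "$(dirname "$dir")")
        printf '%s\t%s\n' "$db" "$version"
    done
}
```

The argument is the directory that the `data.directory` setting in `interproscan.properties` points to. Running the function against the local installation and against the conda environment used by the workflow gives two listings that can be compared with `diff`. The InterProScan release itself should be reported by `interproscan.sh -version`.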
I just tested creating an InterProScan installation using conda to see what the databases are. It seems the version is tied to the build. However, something I didn't know is that the databases are not shipped with the conda package. The installer output instructs you to download the databases manually, which our pipeline doesn't do, as I didn't know about this from the nf-core module.
$ conda create -n interproscan-env interproscan
Retrieving notices: ...working... done
Channels:
- conda-forge
- bioconda
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /opt/conda/envs/interproscan-env
added / updated specs:
- interproscan
The following packages will be downloaded:
package | build
---------------------------|-----------------
_libgcc_mutex-0.1 | conda_forge 3 KB conda-forge
_openmp_mutex-4.5 | 2_gnu 23 KB conda-forge
alsa-lib-1.2.11 | hd590300_1 542 KB conda-forge
blast-2.15.0 | pl5321h6f7f691_1 146.0 MB bioconda
bzip2-1.0.8 | hd590300_5 248 KB conda-forge
c-ares-1.28.1 | hd590300_0 165 KB conda-forge
ca-certificates-2024.2.2 | hbcca054_0 152 KB conda-forge
cairo-1.18.0 | h3faef2a_0 959 KB conda-forge
cath-tools-0.16.5 | h78a066a_0 11.5 MB bioconda
curl-8.8.0 | he654da7_0 163 KB conda-forge
emboss-6.6.0 | h6debe1e_0 94.5 MB bioconda
entrez-direct-21.6 | he881be0_0 14.0 MB bioconda
expat-2.6.2 | h59595ed_0 134 KB conda-forge
font-ttf-dejavu-sans-mono-2.37| hab24e00_0 388 KB conda-forge
font-ttf-inconsolata-3.000 | h77eed37_0 94 KB conda-forge
font-ttf-source-code-pro-2.038| h77eed37_0 684 KB conda-forge
font-ttf-ubuntu-0.83 | h77eed37_2 1.5 MB conda-forge
fontconfig-2.14.2 | h14ed4e7_0 266 KB conda-forge
fonts-conda-ecosystem-1 | 0 4 KB conda-forge
fonts-conda-forge-1 | 0 4 KB conda-forge
freetype-2.12.1 | h267a509_2 620 KB conda-forge
gettext-0.22.5 | h59595ed_2 464 KB conda-forge
gettext-tools-0.22.5 | h59595ed_2 2.6 MB conda-forge
giflib-5.2.2 | hd590300_0 75 KB conda-forge
graphite2-1.3.13 | h59595ed_1003 95 KB conda-forge
harfbuzz-8.5.0 | hfac3d4d_0 1.5 MB conda-forge
hmmer-3.4 | hdbdd923_1 11.1 MB bioconda
hmmer2-2.3.2 | h031d066_9 392 KB bioconda
icu-73.2 | h59595ed_0 11.5 MB conda-forge
interproscan-5.59_91.0 | hec16e2b_1 167.0 MB bioconda
keyutils-1.6.1 | h166bdaf_0 115 KB conda-forge
krb5-1.21.2 | h659d440_0 1.3 MB conda-forge
lcms2-2.16 | hb7c19ff_0 239 KB conda-forge
ld_impl_linux-64-2.40 | h55db66e_0 697 KB conda-forge
lerc-4.0.0 | h27087fc_0 275 KB conda-forge
libasprintf-0.22.5 | h661eb56_2 42 KB conda-forge
libasprintf-devel-0.22.5 | h661eb56_2 33 KB conda-forge
libcups-2.3.3 | h4637d8d_4 4.3 MB conda-forge
libcurl-8.8.0 | hca28451_0 396 KB conda-forge
libdeflate-1.20 | hd590300_0 70 KB conda-forge
libedit-3.1.20191231 | he28a2e2_2 121 KB conda-forge
libev-4.33 | hd590300_2 110 KB conda-forge
libexpat-2.6.2 | h59595ed_0 72 KB conda-forge
libffi-3.4.2 | h7f98852_5 57 KB conda-forge
libgcc-ng-13.2.0 | h77fa898_7 758 KB conda-forge
libgd-2.3.3 | h119a65a_9 219 KB conda-forge
libgettextpo-0.22.5 | h59595ed_2 167 KB conda-forge
libgettextpo-devel-0.22.5 | h59595ed_2 36 KB conda-forge
libgfortran-ng-7.5.0 | h14aa051_20 23 KB conda-forge
libgfortran4-7.5.0 | h14aa051_20 1.2 MB conda-forge
libglib-2.80.2 | hf974151_0 3.7 MB conda-forge
libgomp-13.2.0 | h77fa898_7 412 KB conda-forge
libiconv-1.17 | hd590300_2 689 KB conda-forge
libidn2-2.3.7 | hd590300_0 124 KB conda-forge
libjpeg-turbo-3.0.0 | hd590300_1 604 KB conda-forge
libnghttp2-1.58.0 | h47da74e_1 617 KB conda-forge
libnsl-2.0.1 | hd590300_0 33 KB conda-forge
libpng-1.6.43 | h2797004_0 281 KB conda-forge
libsqlite-3.45.3 | h2797004_0 840 KB conda-forge
libssh2-1.11.0 | h0841786_0 265 KB conda-forge
libstdcxx-ng-13.2.0 | hc0a3c3a_7 3.7 MB conda-forge
libtiff-4.6.0 | h1dd3fc0_3 276 KB conda-forge
libunistring-0.9.10 | h7f98852_0 1.4 MB conda-forge
libuuid-2.38.1 | h0b41bf4_0 33 KB conda-forge
libwebp-1.4.0 | h2c329e2_0 90 KB conda-forge
libwebp-base-1.4.0 | hd590300_0 429 KB conda-forge
libxcb-1.15 | h0b41bf4_0 375 KB conda-forge
libxcrypt-4.4.36 | hd590300_1 98 KB conda-forge
libzlib-1.2.13 | hd590300_5 60 KB conda-forge
ncbi-vdb-3.1.1 | h4ac6f70_0 10.7 MB bioconda
ncurses-6.5 | h59595ed_0 867 KB conda-forge
openjdk-11.0.23 | h24d6bf4_0 164.0 MB conda-forge
openssl-3.3.0 | h4ab18f5_3 2.8 MB conda-forge
pcre-8.45 | h9c3ff4c_0 253 KB conda-forge
pcre2-10.43 | hcad00b1_0 929 KB conda-forge
perl-5.32.1 | 7_hd590300_perl5 12.7 MB conda-forge
perl-archive-tar-2.40 | pl5321hdfd78af_0 33 KB bioconda
perl-carp-1.50 | pl5321hd8ed1ab_0 22 KB conda-forge
perl-common-sense-3.75 | pl5321hd8ed1ab_0 20 KB conda-forge
perl-compress-raw-bzip2-2.201| pl5321h166bdaf_0 54 KB conda-forge
perl-compress-raw-zlib-2.202| pl5321h166bdaf_0 83 KB conda-forge
perl-encode-3.21 | pl5321hd590300_0 1.7 MB conda-forge
perl-exporter-5.74 | pl5321hd8ed1ab_0 19 KB conda-forge
perl-exporter-tiny-1.002002| pl5321hd8ed1ab_0 28 KB conda-forge
perl-extutils-makemaker-7.70| pl5321hd8ed1ab_0 154 KB conda-forge
perl-io-compress-2.201 | pl5321hdbdd923_2 84 KB bioconda
perl-io-zlib-1.14 | pl5321hdfd78af_0 12 KB bioconda
perl-json-4.10 | pl5321hdfd78af_0 56 KB bioconda
perl-json-xs-2.34 | pl5321h4ac6f70_6 66 KB bioconda
perl-list-moreutils-0.430 | pl5321hdfd78af_0 32 KB bioconda
perl-list-moreutils-xs-0.430| pl5321h031d066_2 50 KB bioconda
perl-parent-0.241 | pl5321hd8ed1ab_0 13 KB conda-forge
perl-pathtools-3.75 | pl5321h166bdaf_0 49 KB conda-forge
perl-scalar-list-utils-1.63| pl5321h166bdaf_0 50 KB conda-forge
perl-storable-3.15 | pl5321h166bdaf_0 70 KB conda-forge
perl-types-serialiser-1.01 | pl5321hdfd78af_0 13 KB bioconda
pftools-2.3.5 | h4333106_0 263 KB bioconda
pip-24.0 | pyhd8ed1ab_0 1.3 MB conda-forge
pixman-0.43.2 | h59595ed_0 378 KB conda-forge
pthread-stubs-0.4 | h36c2ea0_1001 5 KB conda-forge
python-3.12.3 |hab00c5b_0_cpython 30.5 MB conda-forge
readline-8.2 | h8228510_1 275 KB conda-forge
setuptools-70.0.0 | pyhd8ed1ab_0 472 KB conda-forge
sfld-1.1 | h031d066_3 196 KB bioconda
tk-8.6.13 |noxft_h4845f30_101 3.2 MB conda-forge
tzdata-2024a | h0c530f3_0 117 KB conda-forge
wget-1.21.4 | hda4d442_0 752 KB conda-forge
wheel-0.43.0 | pyhd8ed1ab_1 57 KB conda-forge
xorg-fixesproto-5.0 | h7f98852_1002 9 KB conda-forge
xorg-inputproto-2.3.2 | h7f98852_1002 19 KB conda-forge
xorg-kbproto-1.0.7 | h7f98852_1002 27 KB conda-forge
xorg-libice-1.1.1 | hd590300_0 57 KB conda-forge
xorg-libsm-1.2.4 | h7391055_0 27 KB conda-forge
xorg-libx11-1.8.9 | h8ee46fc_0 809 KB conda-forge
xorg-libxau-1.0.11 | hd590300_0 14 KB conda-forge
xorg-libxdmcp-1.1.3 | h7f98852_0 19 KB conda-forge
xorg-libxext-1.3.4 | h0b41bf4_2 49 KB conda-forge
xorg-libxfixes-5.0.3 | h7f98852_1004 18 KB conda-forge
xorg-libxi-1.7.10 | h7f98852_0 46 KB conda-forge
xorg-libxrender-0.9.11 | hd590300_0 37 KB conda-forge
xorg-libxt-1.3.0 | hd590300_1 370 KB conda-forge
xorg-libxtst-1.2.3 | h7f98852_1002 31 KB conda-forge
xorg-recordproto-1.14.2 | h7f98852_1002 8 KB conda-forge
xorg-renderproto-0.11.1 | h7f98852_1002 9 KB conda-forge
xorg-xextproto-7.3.0 | h0b41bf4_1003 30 KB conda-forge
xorg-xproto-7.0.31 | h7f98852_1007 73 KB conda-forge
xz-5.2.6 | h166bdaf_0 409 KB conda-forge
zlib-1.2.13 | hd590300_5 91 KB conda-forge
zstd-1.5.6 | ha6fb4c9_0 542 KB conda-forge
------------------------------------------------------------
Total: 725.5 MB
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
alsa-lib conda-forge/linux-64::alsa-lib-1.2.11-hd590300_1
blast bioconda/linux-64::blast-2.15.0-pl5321h6f7f691_1
bzip2 conda-forge/linux-64::bzip2-1.0.8-hd590300_5
c-ares conda-forge/linux-64::c-ares-1.28.1-hd590300_0
ca-certificates conda-forge/linux-64::ca-certificates-2024.2.2-hbcca054_0
cairo conda-forge/linux-64::cairo-1.18.0-h3faef2a_0
cath-tools bioconda/linux-64::cath-tools-0.16.5-h78a066a_0
curl conda-forge/linux-64::curl-8.8.0-he654da7_0
emboss bioconda/linux-64::emboss-6.6.0-h6debe1e_0
entrez-direct bioconda/linux-64::entrez-direct-21.6-he881be0_0
expat conda-forge/linux-64::expat-2.6.2-h59595ed_0
font-ttf-dejavu-s~ conda-forge/noarch::font-ttf-dejavu-sans-mono-2.37-hab24e00_0
font-ttf-inconsol~ conda-forge/noarch::font-ttf-inconsolata-3.000-h77eed37_0
font-ttf-source-c~ conda-forge/noarch::font-ttf-source-code-pro-2.038-h77eed37_0
font-ttf-ubuntu conda-forge/noarch::font-ttf-ubuntu-0.83-h77eed37_2
fontconfig conda-forge/linux-64::fontconfig-2.14.2-h14ed4e7_0
fonts-conda-ecosy~ conda-forge/noarch::fonts-conda-ecosystem-1-0
fonts-conda-forge conda-forge/noarch::fonts-conda-forge-1-0
freetype conda-forge/linux-64::freetype-2.12.1-h267a509_2
gettext conda-forge/linux-64::gettext-0.22.5-h59595ed_2
gettext-tools conda-forge/linux-64::gettext-tools-0.22.5-h59595ed_2
giflib conda-forge/linux-64::giflib-5.2.2-hd590300_0
graphite2 conda-forge/linux-64::graphite2-1.3.13-h59595ed_1003
harfbuzz conda-forge/linux-64::harfbuzz-8.5.0-hfac3d4d_0
hmmer bioconda/linux-64::hmmer-3.4-hdbdd923_1
hmmer2 bioconda/linux-64::hmmer2-2.3.2-h031d066_9
icu conda-forge/linux-64::icu-73.2-h59595ed_0
interproscan bioconda/linux-64::interproscan-5.59_91.0-hec16e2b_1
keyutils conda-forge/linux-64::keyutils-1.6.1-h166bdaf_0
krb5 conda-forge/linux-64::krb5-1.21.2-h659d440_0
lcms2 conda-forge/linux-64::lcms2-2.16-hb7c19ff_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.40-h55db66e_0
lerc conda-forge/linux-64::lerc-4.0.0-h27087fc_0
libasprintf conda-forge/linux-64::libasprintf-0.22.5-h661eb56_2
libasprintf-devel conda-forge/linux-64::libasprintf-devel-0.22.5-h661eb56_2
libcups conda-forge/linux-64::libcups-2.3.3-h4637d8d_4
libcurl conda-forge/linux-64::libcurl-8.8.0-hca28451_0
libdeflate conda-forge/linux-64::libdeflate-1.20-hd590300_0
libedit conda-forge/linux-64::libedit-3.1.20191231-he28a2e2_2
libev conda-forge/linux-64::libev-4.33-hd590300_2
libexpat conda-forge/linux-64::libexpat-2.6.2-h59595ed_0
libffi conda-forge/linux-64::libffi-3.4.2-h7f98852_5
libgcc-ng conda-forge/linux-64::libgcc-ng-13.2.0-h77fa898_7
libgd conda-forge/linux-64::libgd-2.3.3-h119a65a_9
libgettextpo conda-forge/linux-64::libgettextpo-0.22.5-h59595ed_2
libgettextpo-devel conda-forge/linux-64::libgettextpo-devel-0.22.5-h59595ed_2
libgfortran-ng conda-forge/linux-64::libgfortran-ng-7.5.0-h14aa051_20
libgfortran4 conda-forge/linux-64::libgfortran4-7.5.0-h14aa051_20
libglib conda-forge/linux-64::libglib-2.80.2-hf974151_0
libgomp conda-forge/linux-64::libgomp-13.2.0-h77fa898_7
libiconv conda-forge/linux-64::libiconv-1.17-hd590300_2
libidn2 conda-forge/linux-64::libidn2-2.3.7-hd590300_0
libjpeg-turbo conda-forge/linux-64::libjpeg-turbo-3.0.0-hd590300_1
libnghttp2 conda-forge/linux-64::libnghttp2-1.58.0-h47da74e_1
libnsl conda-forge/linux-64::libnsl-2.0.1-hd590300_0
libpng conda-forge/linux-64::libpng-1.6.43-h2797004_0
libsqlite conda-forge/linux-64::libsqlite-3.45.3-h2797004_0
libssh2 conda-forge/linux-64::libssh2-1.11.0-h0841786_0
libstdcxx-ng conda-forge/linux-64::libstdcxx-ng-13.2.0-hc0a3c3a_7
libtiff conda-forge/linux-64::libtiff-4.6.0-h1dd3fc0_3
libunistring conda-forge/linux-64::libunistring-0.9.10-h7f98852_0
libuuid conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0
libwebp conda-forge/linux-64::libwebp-1.4.0-h2c329e2_0
libwebp-base conda-forge/linux-64::libwebp-base-1.4.0-hd590300_0
libxcb conda-forge/linux-64::libxcb-1.15-h0b41bf4_0
libxcrypt conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
libzlib conda-forge/linux-64::libzlib-1.2.13-hd590300_5
ncbi-vdb bioconda/linux-64::ncbi-vdb-3.1.1-h4ac6f70_0
ncurses conda-forge/linux-64::ncurses-6.5-h59595ed_0
openjdk conda-forge/linux-64::openjdk-11.0.23-h24d6bf4_0
openssl conda-forge/linux-64::openssl-3.3.0-h4ab18f5_3
pcre conda-forge/linux-64::pcre-8.45-h9c3ff4c_0
pcre2 conda-forge/linux-64::pcre2-10.43-hcad00b1_0
perl conda-forge/linux-64::perl-5.32.1-7_hd590300_perl5
perl-archive-tar bioconda/noarch::perl-archive-tar-2.40-pl5321hdfd78af_0
perl-carp conda-forge/noarch::perl-carp-1.50-pl5321hd8ed1ab_0
perl-common-sense conda-forge/noarch::perl-common-sense-3.75-pl5321hd8ed1ab_0
perl-compress-raw~ conda-forge/linux-64::perl-compress-raw-bzip2-2.201-pl5321h166bdaf_0
perl-compress-raw~ conda-forge/linux-64::perl-compress-raw-zlib-2.202-pl5321h166bdaf_0
perl-encode conda-forge/linux-64::perl-encode-3.21-pl5321hd590300_0
perl-exporter conda-forge/noarch::perl-exporter-5.74-pl5321hd8ed1ab_0
perl-exporter-tiny conda-forge/noarch::perl-exporter-tiny-1.002002-pl5321hd8ed1ab_0
perl-extutils-mak~ conda-forge/noarch::perl-extutils-makemaker-7.70-pl5321hd8ed1ab_0
perl-io-compress bioconda/linux-64::perl-io-compress-2.201-pl5321hdbdd923_2
perl-io-zlib bioconda/noarch::perl-io-zlib-1.14-pl5321hdfd78af_0
perl-json bioconda/noarch::perl-json-4.10-pl5321hdfd78af_0
perl-json-xs bioconda/linux-64::perl-json-xs-2.34-pl5321h4ac6f70_6
perl-list-moreuti~ bioconda/noarch::perl-list-moreutils-0.430-pl5321hdfd78af_0
perl-list-moreuti~ bioconda/linux-64::perl-list-moreutils-xs-0.430-pl5321h031d066_2
perl-parent conda-forge/noarch::perl-parent-0.241-pl5321hd8ed1ab_0
perl-pathtools conda-forge/linux-64::perl-pathtools-3.75-pl5321h166bdaf_0
perl-scalar-list-~ conda-forge/linux-64::perl-scalar-list-utils-1.63-pl5321h166bdaf_0
perl-storable conda-forge/linux-64::perl-storable-3.15-pl5321h166bdaf_0
perl-types-serial~ bioconda/noarch::perl-types-serialiser-1.01-pl5321hdfd78af_0
pftools bioconda/linux-64::pftools-2.3.5-h4333106_0
pip conda-forge/noarch::pip-24.0-pyhd8ed1ab_0
pixman conda-forge/linux-64::pixman-0.43.2-h59595ed_0
pthread-stubs conda-forge/linux-64::pthread-stubs-0.4-h36c2ea0_1001
python conda-forge/linux-64::python-3.12.3-hab00c5b_0_cpython
readline conda-forge/linux-64::readline-8.2-h8228510_1
setuptools conda-forge/noarch::setuptools-70.0.0-pyhd8ed1ab_0
sfld bioconda/linux-64::sfld-1.1-h031d066_3
tk conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101
tzdata conda-forge/noarch::tzdata-2024a-h0c530f3_0
wget conda-forge/linux-64::wget-1.21.4-hda4d442_0
wheel conda-forge/noarch::wheel-0.43.0-pyhd8ed1ab_1
xorg-fixesproto conda-forge/linux-64::xorg-fixesproto-5.0-h7f98852_1002
xorg-inputproto conda-forge/linux-64::xorg-inputproto-2.3.2-h7f98852_1002
xorg-kbproto conda-forge/linux-64::xorg-kbproto-1.0.7-h7f98852_1002
xorg-libice conda-forge/linux-64::xorg-libice-1.1.1-hd590300_0
xorg-libsm conda-forge/linux-64::xorg-libsm-1.2.4-h7391055_0
xorg-libx11 conda-forge/linux-64::xorg-libx11-1.8.9-h8ee46fc_0
xorg-libxau conda-forge/linux-64::xorg-libxau-1.0.11-hd590300_0
xorg-libxdmcp conda-forge/linux-64::xorg-libxdmcp-1.1.3-h7f98852_0
xorg-libxext conda-forge/linux-64::xorg-libxext-1.3.4-h0b41bf4_2
xorg-libxfixes conda-forge/linux-64::xorg-libxfixes-5.0.3-h7f98852_1004
xorg-libxi conda-forge/linux-64::xorg-libxi-1.7.10-h7f98852_0
xorg-libxrender conda-forge/linux-64::xorg-libxrender-0.9.11-hd590300_0
xorg-libxt conda-forge/linux-64::xorg-libxt-1.3.0-hd590300_1
xorg-libxtst conda-forge/linux-64::xorg-libxtst-1.2.3-h7f98852_1002
xorg-recordproto conda-forge/linux-64::xorg-recordproto-1.14.2-h7f98852_1002
xorg-renderproto conda-forge/linux-64::xorg-renderproto-0.11.1-h7f98852_1002
xorg-xextproto conda-forge/linux-64::xorg-xextproto-7.3.0-h0b41bf4_1003
xorg-xproto conda-forge/linux-64::xorg-xproto-7.0.31-h7f98852_1007
xz conda-forge/linux-64::xz-5.2.6-h166bdaf_0
zlib conda-forge/linux-64::zlib-1.2.13-hd590300_5
zstd conda-forge/linux-64::zstd-1.5.6-ha6fb4c9_0
Proceed ([y]/n)? y
Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: |
######################################
# First time usage please README !!! #
######################################
The databases are huge and consequently not shipped within this installation.
Please download and install the Databases manually by following the commands below:
!!! /!\ Edit the 2 first lines to match the wished version of the DB /!\ !!!
Commands:
=========
# See here for latest db available: https://github.com/ebi-pf-team/interproscan or http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/
# Set versions
version_major=5.59
version_minor=91.0
CONDA_PREFIX=/the/path/to/your/interproscan/conda/env/
# get the md5 of the databases
wget http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/${version_major}-${version_minor}/interproscan-${version_major}-${version_minor}-64-bit.tar.gz.md5
# get the databases (with core because much faster to download)
wget http://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/${version_major}-${version_minor}/interproscan-${version_major}-${version_minor}-64-bit.tar.gz
# checksum
md5sum -c interproscan-${version_major}-${version_minor}-64-bit.tar.gz.md5
# untar gz
tar xvzf interproscan-${version_major}-${version_minor}-64-bit.tar.gz
# remove the sample DB bundled by default
rm -rf $CONDA_PREFIX/share/InterProScan/data/
# copy the new db
cp -r interproscan-${version_major}-${version_minor}/data $CONDA_PREFIX/share/InterProScan/
INFO:
====
Phobius (licensed software), SignalP, SMART (licensed components) and TMHMM use
licensed code and data provided by third parties. If you wish to run these
analyses it will be necessary for you to obtain a licence from the vendor and
configure your local InterProScan installation to use them.
(see more information in $CONDA_PREFIX/share/InterProScan/data/<db>)
done
#
# To activate this environment, use
#
# $ conda activate interproscan-env
#
# To deactivate an active environment, use
#
# $ conda deactivate
Did you also follow the extra instructions when making the local interproscan conda installation?
I've made a pull request to update the Interproscan module on nf-core to fix the missing database issue and once that's in, I can include it here.
Thank you for looking into this.
I run the Nextflow workflow using conda, but my local InterProScan installation does not use conda. It is, however, similar to what you described above:
We download the following tarballs from http://ftp.ebi.ac.uk/pub/software/unix/iprscan:
interproscan-core-5.67-99.0.tar.gz
interproscan-data-5.67-99.0.tar.gz
We unpack interproscan-core-5.67-99.0.tar.gz to $INSTALLDIR.
We unpack interproscan-data-5.67-99.0.tar.gz to another location: /data/prod/Tools/InterProScan/5.67-99.0/data
Then we run the following commands:
sed -i "s@EASEL_DIR=@EASEL_DIR=$INSTALLDIRHMMER_interproscan/3.1b2/easel@" $INSTALLDIR/src/sfld/1.1/Makefile
cd $INSTALLDIR/src/sfld/1.1/ && make
cp -f $INSTALLDIR/src/sfld/1.1/sfld_postprocess $INSTALLDIR/bin/sfld/
cp -f $INSTALLDIR/src/sfld/1.1/sfld_preprocess.py $INSTALLDIR/bin/sfld/
We adapt the following line in $INSTALLDIR/interproscan.properties so that it uses the data from interproscan-data-5.67-99.0.tar.gz:
data.directory=/data/prod/Tools/InterProScan/5.67-99.0/data
I understand that I am using a newer version of InterProScan than your pipeline, so that is indeed a nice bug catch. However, I do not think this is the reason why different InterProScan hits are found on the same gene sequence. I will be curious to test it when it is ready.
Thank you
Hi,
I get different InterProScan results when running GAAS and when running InterProScan manually with the same input files. I do not see any difference in the input arguments. As you know the backend very well, maybe you could help me identify what causes these differences.
What questions are:
Running InterProScan within GAAS:
Run InterProScan manually:
I am using one chromosome as a test (chr01) from a public source, a carrot reference genome. I can provide the input files if that helps to identify the discordance.
I have run the first gene sequence through the web version of InterProScan, and here are the results:
Sequence used:
Thank you for your cooperation.
Kind regards, Emilie