Closed ShailNair closed 3 years ago
The error you get is usually observed when you are running InterProScan v5.47-82.0 for the first time. It's likely that if you try again, it will now run successfully. If you are still having problems, I would suggest running the following command from the InterProScan installation directory:
python3 initial_setup.py
Afterwards, InterProScan should run as expected.
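Since the setup script prints nothing on success, one quick sanity check is to look for zero-byte index files afterwards. A minimal sketch of that check, using a throwaway `demo_data` tree as a stand-in for the real `data/` directory of an InterProScan install (paths here are illustrative, not the real layout):

```shell
# Demo: spot zero-byte HMM index files, the usual symptom when index
# generation fails. "demo_data" is a stand-in for the real data/ tree;
# in a real install, run the find against data/ instead.
mkdir -p demo_data/pirsf/3.10
printf 'x' > demo_data/pirsf/3.10/sf_hmm_all.h3m   # non-empty: fine
: > demo_data/pirsf/3.10/sf_hmm_all.h3i            # zero bytes: broken
# Any file this prints needs its indices regenerated with hmmpress:
broken=$(find demo_data -name '*.h3?' -size 0)
echo "$broken"
rm -rf demo_data
```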
Hi, I tried 3-4 times with the test data and twice with my own data, and I am getting the same error each time. The python3 initial_setup.py command does not return anything (not even an error or success message).
I am surprised you are still getting the errors after running python3 initial_setup.py. The script doesn't return any message if it was successful.
What is the output of the following command?
ls -l data/pirsf/3.10/sf_hmm_all*
ls -l data/pirsf/3.10/sf_hmm_all*
-rw-rw-r-- 1 mcs mcs 608224141 10月  8 17:39 data/pirsf/3.10/sf_hmm_all
-rw-rw-r-- 1 mcs mcs   1032192 11月 25 08:16 data/pirsf/3.10/sf_hmm_all.h3f
-rw-rw-r-- 1 mcs mcs         0 11月 25 08:16 data/pirsf/3.10/sf_hmm_all.h3i
-rw-rw-r-- 1 mcs mcs   2244608 11月 25 08:16 data/pirsf/3.10/sf_hmm_all.h3m
-rw-rw-r-- 1 mcs mcs   2633728 11月 25 08:16 data/pirsf/3.10/sf_hmm_all.h3p
Ignore the 月 symbol; it is the Chinese character for month.
When I run InterProScan on my own samples, I get a similar error. Here is the run output:
for SET in $(cat list.txt)
do
  /home/mcs/soft/interproscan-5.47-82.0/interproscan.sh \
    -i /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/protein_seq/$SET-protein-sequences.fa \
    -f tsv \
    -o /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/protein_seq/$SET-interpro-output.tsv \
    -cpu 80 \
    -iprlookup \
    -goterms \
    -pa
done
25/11/2020 20:02:49:958 Welcome to InterProScan-5.47-82.0
25/11/2020 20:02:49:960 Running InterProScan v5 in STANDALONE mode... on Linux
25/11/2020 20:03:03:751 Loading file /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/protein_seq/01.100_days-protein-sequences.fa
25/11/2020 20:03:03:753 Running the following analyses: [CDD-3.17,Coils-2.2.1,Gene3D-4.2.0,Hamap-2020_01,MobiDBLite-2.0,PANTHER-15.0,Pfam-33.1,PIRSF-3.10,PRINTS-42.0,ProSitePatterns-2019_11,ProSiteProfiles-2019_11,SFLD-4,SMART-7.1,SUPERFAMILY-1.75,TIGRFAM-15.0]
Available matches will be retrieved from the pre-calculated match lookup service.
Matches for any sequences that are not represented in the lookup service will be calculated locally.
25/11/2020 20:03:32:040 Uploaded 112359 unique sequences for analysis
25/11/2020 20:19:36:404 37% completed
25/11/2020 20:19:51:846 62% completed
25/11/2020 20:19:53:266 87% completed
2020-11-25 20:21:44,204 [amqEmbeddedWorkerJmsContainer-5] [uk.ac.ebi.interpro.scan.management.model.implementations.RunBinaryStep:199] ERROR - Command line failed with exit code: 1
Command: bin/hmmer/hmmer3/3.1b1/hmmsearch -Z 65000000 -E 0.001 --domE 0.00000001 --incdomE 0.00000001 --cpu 10 -o /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/fixeed_contig/temp/mcs1_20201125_200254341_1sp7//jobPanther/000000102001_000000102600.raw.out --domtblout /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/fixeed_contig/temp/mcs1_20201125_200254341_1sp7//jobPanther/000000102001_000000102600.raw.domtblout.out data/panther/15.0/panther.hmm /home/mcs/gene/shail/metagenomics/CLEANED_FILES/sunbeam_decontaminated/leptolyngbya_cleaned/sunbeam_output/qc/squeezemeta_all_contig/fixeed_contig/temp/mcs1_20201125_200254341_1sp7//jobPanther/000000102001_000000102600.fasta
Error output from binary:
Error: File format problem in trying to open HMM file data/panther/15.0/panther.hmm.
Opened data/panther/15.0/panther.hmm.h3m, a pressed HMM file; but format of its .h3i file unrecognized
there is a problem with the indices in your data, as I can see here:
-rw-rw-r-- 1 mcs mcs 0 11月 25 08:16 data/pirsf/3.10/sf_hmm_all.h3i
you might have to investigate why you are unable to generate the data index files.
Try regenerating the indices for one database, like PIRSF, by running:
bin/hmmer/hmmer3/3.3/hmmpress -f data/pirsf/3.10/sf_hmm_all
then let me know what you get when you run this command:
ls -l data/pirsf/3.10/sf_hmm_all*
Here is the output
(base) [mcs@mcs1 interproscan-5.47-82.0]$ bin/hmmer/hmmer3/3.3/hmmpress -f data/pirsf/3.10/sf_hmm_all
Working...    done.
Pressed and indexed 3283 HMMs (3283 names and 3283 accessions).
Models pressed into binary file:   data/pirsf/3.10/sf_hmm_all.h3m
SSI index for binary model file:   data/pirsf/3.10/sf_hmm_all.h3i
Profiles (MSV part) pressed into:  data/pirsf/3.10/sf_hmm_all.h3f
Profiles (remainder) pressed into: data/pirsf/3.10/sf_hmm_all.h3p
(base) [mcs@mcs1 interproscan-5.47-82.0]$ ls -l data/pirsf/3.10/sf_hmm_all*
-rw-rw-r-- 1 mcs mcs 608224141 10月  8 17:39 data/pirsf/3.10/sf_hmm_all
-rw-rw-r-- 1 mcs mcs 103345491 11月 25 22:11 data/pirsf/3.10/sf_hmm_all.h3f
-rw-rw-r-- 1 mcs mcs    328421 11月 25 22:11 data/pirsf/3.10/sf_hmm_all.h3i
-rw-rw-r-- 1 mcs mcs 252147360 11月 25 22:11 data/pirsf/3.10/sf_hmm_all.h3m
-rw-rw-r-- 1 mcs mcs 296057244 11月 25 22:11 data/pirsf/3.10/sf_hmm_all.h3p
the data looks OK now for PIRSF, but you need to do the same for the other HMM-based analyses. I will give you a list of commands.
OK. Thanks for the help
the following commands should do:
bin/hmmer/hmmer3/3.3/hmmpress -f data/gene3d/4.2.0/gene3d_main.hmm
bin/hmmer/hmmer3/3.3/hmmpress -f data/hamap/2020_01/hamap.hmm.lib
bin/hmmer/hmmer3/3.3/hmmpress -f data/panther/15.0/panther.hmm
bin/hmmer/hmmer3/3.3/hmmpress -f data/pfam/33.1/pfam_a.hmm
bin/hmmer/hmmer3/3.1b1/hmmpress -f data/sfld/4/sfld.hmm
bin/hmmer/hmmer3/3.1b1/hmmpress -f data/superfamily/1.75/hmmlib_1.75
bin/hmmer/hmmer3/3.3/hmmpress -f data/tigrfam/15.0/TIGRFAMs_HMM.LIB
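After running these, it can be worth verifying that every library now has all four non-empty index files (.h3f, .h3i, .h3m, .h3p). A sketch with an invented helper, check_indices, which is not part of InterProScan; the "demo" tree below stands in for the real data/ directory:

```shell
# check_indices is a hypothetical helper: it reports any missing or
# zero-byte .h3f/.h3i/.h3m/.h3p file for one HMM library.
check_indices() {
  lib=$1; status=0
  for ext in h3f h3i h3m h3p; do
    [ -s "$lib.$ext" ] || { echo "MISSING/EMPTY: $lib.$ext"; status=1; }
  done
  return $status
}
# Demo against a fake library with one broken (zero-byte) index,
# mimicking the sf_hmm_all.h3i failure seen earlier in this thread:
mkdir -p demo/pirsf/3.10
for ext in h3f h3m h3p; do printf 'x' > "demo/pirsf/3.10/sf_hmm_all.$ext"; done
: > demo/pirsf/3.10/sf_hmm_all.h3i
check_indices demo/pirsf/3.10/sf_hmm_all || result="rebuild needed"
echo "$result"
rm -rf demo
```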
I also notice in the error message that you have changed the cpu option for hmmer, as in
bin/hmmer/hmmer3/3.3/hmmscan -E 0.01 --acc --cpu 10
Assigning 10 CPUs to one hmmer job may not improve performance, especially since you are running all the analyses in InterProScan. How many cores does your machine have?
Yes, in the interproscan.properties file I changed the CPU value to 10 for each job. I work on a server with 104 cores. After running the index commands you provided, I ran the test file and it executed successfully. After that, I ran InterProScan on my own samples, and the process is in progress without any errors (I am running samples in batches). As you said, increasing the number of CPUs didn't improve performance. Is there a better way to improve performance without running into errors?
This page describes some tips for improving performance: https://interproscan-docs.readthedocs.io/en/latest/ImprovingPerformance.html
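For reference, the knobs that page points at live in interproscan.properties; on a standalone run, the number of embedded workers usually matters more than hmmer's --cpu switch. A sketch of the two worker settings (the values below are illustrative, not a recommendation for a 104-core machine; check the docs for your release):

```
# Standalone-mode worker pool; raising these parallelizes whole jobs
# rather than individual hmmer binaries.
number.of.embedded.workers=6
maxnumber.of.embedded.workers=8
```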
Hi, thanks, I will try it for the next run. The previous run executed successfully (though it took almost 2 days).
I have a similar error message.
Then, when I ran the command bin/hmmer/hmmer3/3.3/hmmpress -f data/hamap/2020_01/hamap.hmm.lib
it said: Error: File existence/permissions problem in trying to open HMM file data/hamap/2020_01/hamap.hmm.lib
@clwang4802 what do you get when you run the command:
ls -l data/hamap/2020_01/hamap.hmm.lib*
@gsn7 It seems the hmm_all index rebuild jobs were not able to complete in a "dirty" directory.
I just avoid the error by
@gsn7 Before I removed the failed iprscan folder, I noticed that there was no "hamap/2020_01" folder, only "2020_05".
I got rid of the error without deleting and reinstalling everything:
a) rename the directory hamap/2020_05 to hamap/2020_01
b) run these commands, as suggested by @gsn7 earlier in the thread:
the following commands should do:
bin/hmmer/hmmer3/3.3/hmmpress -f data/gene3d/4.2.0/gene3d_main.hmm
bin/hmmer/hmmer3/3.3/hmmpress -f data/hamap/2020_01/hamap.hmm.lib
bin/hmmer/hmmer3/3.3/hmmpress -f data/panther/15.0/panther.hmm
bin/hmmer/hmmer3/3.3/hmmpress -f data/pfam/33.1/pfam_a.hmm
bin/hmmer/hmmer3/3.1b1/hmmpress -f data/sfld/4/sfld.hmm
bin/hmmer/hmmer3/3.1b1/hmmpress -f data/superfamily/1.75/hmmlib_1.75
bin/hmmer/hmmer3/3.3/hmmpress -f data/tigrfam/15.0/TIGRFAMs_HMM.LIB
c) rename the directory back to its old name.
I suspect that adding a soft link named "2020_01" pointing to the offending directory would have saved the trouble of renaming the directory twice.
After doing this both test runs completed without complaint. Good luck!
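The soft-link idea above can be sketched as follows; the "demo" paths are a stand-in, and in a real install the link would be made inside the data/hamap directory:

```shell
# Demo of the soft-link workaround: expose the 2020_05 data under the
# 2020_01 name InterProScan expects, instead of renaming twice.
mkdir -p demo/hamap/2020_05
printf 'x' > demo/hamap/2020_05/hamap.hmm.lib
( cd demo/hamap && ln -s 2020_05 2020_01 )
# The file is now reachable through either directory name:
resolved=$(cat demo/hamap/2020_01/hamap.hmm.lib)
echo "$resolved"
rm -rf demo
```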
I know the issue has been solved, but I wanted to chime in since I had the same issue with InterProScan 5.52-86 recently. The problem was that "initial_setup.py" ran without any issues, but the h3i index file for SUPERFAMILY was a 0-byte file, even though the other files were generated just fine. After some troubleshooting, it turned out that running the script as "./initial_setup.py" rather than "python3 ./initial_setup.py" was loading Python 2 instead of Python 3, which resulted in the failed index. Re-running it under explicit Python 3 solved the issue. I know that the Python 3 call is clearly stated in the docs, but since this is such a sneaky issue (the script will run perfectly fine under Python 2), it might be helpful to explicitly warn NOT to use Python 2 here, or to include a check in the script itself that verifies the running Python version and exits if it is Python 2. Just a thought.
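A quick way to confirm which major version a bare "python" resolves to before trusting "./initial_setup.py" (on newer systems "python" may be absent entirely, hence the fallback message):

```shell
# Print the major version of whatever "python" and "python3" resolve to.
# If the first line is 2, running the script as "./initial_setup.py"
# may pick up Python 2 and silently produce broken indices.
python  -c 'import sys; print(sys.version_info[0])' 2>/dev/null || echo "no bare python on PATH"
python3 -c 'import sys; print(sys.version_info[0])'
```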
Hi, I have downloaded InterProScan with "wget https://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.60-92.0/interproscan-5.60-92.0-64-bit.tar.gz". I did not get any errors during installation, but when I ran the initial_setup.py command, I got this result:
python3 initial_setup.py
python3: can't open file 'initial_setup.py': [Errno 2] No such file or directory
In addition, when I ran the test file with the command ./interproscan.sh -i test_all_appl.fasta -f tsv -dp, I got the output .tsv file, but during the analysis I saw many warning messages like these:
2023-02-01 18:00:49,435 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: Query sequence: 1 matches PIRSF001789: Nerve growth factor, subunit beta
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: 1 ! 339.3 1.4 1.1e-105 3.5e-102 1 252 [. 1 256 [. 1 257 [] 0.97
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,438 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: Query sequence: 3 matches PIRSF001220: L-asparaginase/Glutamyl-tRNA(Gln) amidotransferase subunit D
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: 1 ! 296.3 3.4 4.8e-92 5.3e-89 3 323 .. 48 365 .. 46 370 .] 0.96
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: and matches Sub-Family PIRSF500176: L-asparaginase/L-glutaminase
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:129] WARN - Couldn't parse the given raw match line, because it is of an unexpected format.
2023-02-01 18:00:49,439 [amqEmbeddedWorkerJmsContainer-6] [uk.ac.ebi.interpro.scan.io.pirsf.hmmer3.PirsfHmmer3RawMatchParser:130] WARN - Unexpected Raw match line: 1 ! 252.9 3.1 8.3e-79 9.1e-76 3 324 .. 50 367 .. 48 370 .] 0.91
01/02/2023 18:00:53:550 50% completed
01/02/2023 18:01:06:867 77% completed
01/02/2023 18:01:26:519 90% completed
01/02/2023 18:01:59:389 100% done: InterProScan analyses completed
I just wanted to know: is my installation successful, and can I use InterProScan on my own data? And why did I not find the initial_setup.py script in my installed version?
Hello,
initial_setup.py has been deprecated; you can now run python3 setup.py interproscan.properties instead.
About the error you're experiencing, please add the following line to your interproscan.properties file:
pirsf.pl.binary.switches=--outfmt i5
That should fix it and your installation should be good to run.
Dear Tgrego, thank you very much for the guidance. I followed your advice and did not see any warning messages this time. Once again, thank you for your kind help.
02/02/2023 18:37:23:788 Welcome to InterProScan-5.60-92.0
02/02/2023 18:37:23:789 Running InterProScan v5 in STANDALONE mode... on Linux
02/02/2023 18:37:27:368 RunID: navi_20230202_183727283_isqv
02/02/2023 18:37:33:646 Loading file /home/navi/software/my_interproscan/interproscan-5.60-92.0/test_all_appl.fasta
02/02/2023 18:37:33:647 Running the following analyses: [AntiFam-7.0,CDD-3.20,Coils-2.2.1,FunFam-4.3.0,Gene3D-4.3.0,Hamap-2021_04,MobiDBLite-2.0,PANTHER-17.0,Pfam-35.0,PIRSF-3.10,PIRSR-2021_05,PRINTS-42.0,ProSitePatterns-2022_01,ProSiteProfiles-2022_01,SFLD-4,SMART-7.1,SUPERFAMILY-1.75,TIGRFAM-15.0]
Pre-calculated match lookup service DISABLED. Please wait for match calculations to complete...
02/02/2023 18:37:46:593 25% completed
02/02/2023 18:37:59:058 51% completed
02/02/2023 18:38:15:895 75% completed
02/02/2023 18:38:29:311 90% completed
02/02/2023 18:38:45:072 100% done: InterProScan analyses completed
Hi, unfortunately I received an error while running the test file.
Linux info:
Linux mcs1 3.10.0-1127.18.2.el7.x86_64 #1 SMP Sun Jul 26 15:27:06 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

python3 --version
Python 3.7.8

java -version
openjdk version "11.0.8-internal" 2020-07-14
I downloaded and unzipped InterProScan 5.47-82.0, ran the initial setup, and when I run the test file I get an error saying:
ERROR - Command line failed with exit code: 1
Command: bin/hmmer/hmmer3/3.3/hmmscan -E 0.01 --acc --cpu 10 -o /home/mcs/soft/interproscan-5.47-82.0/temp/mcs1_20201125_091103548_klwo//jobPIRSF/000000000001_000000000006.raw.out --domtblout /home/mcs/soft/interproscan-5.47-82.0/temp/mcs1_20201125_091103548_klwo//jobPIRSF/000000000001_000000000006.raw.domtblout.out data/pirsf/3.10/sf_hmm_all /home/mcs/soft/interproscan-5.47-82.0/temp/mcs1_20201125_091103548_klwo//jobPIRSF/000000000001_000000000006.fasta
Error output from binary:
Error: File format problem, trying to open HMM file data/pirsf/3.10/sf_hmm_all. Opened data/pirsf/3.10/sf_hmm_all.h3m, a pressed HMM file; but format of its .h3i file unrecognized
Attached -terminal output-
terminal.txt