bioconda / bioconda-recipes

Conda recipes for the bioconda channel.
https://bioconda.github.io
MIT License

interproscan recipe needs help #21601

Closed Juke34 closed 3 years ago

Juke34 commented 4 years ago

I started working on the dependencies needed for interproscan. Here is a summary of what is needed: https://github.com/ebi-pf-team/interproscan/issues/95
7 dependencies were missing... I'm close to having resolved all of them. There are 2 remaining problems:

[x] For rpsbproc we need to update the blast recipe(s): see https://github.com/bioconda/bioconda-recipes/pull/21497 for blast 2.9 and https://github.com/bioconda/bioconda-recipes/pull/20874 for blast 2.10
[x] For Cath-tools, the recipe takes too long to build and reaches the time limit (5 hours): https://github.com/bioconda/bioconda-recipes/pull/21443

Any help resolving the 2 remaining problems is very welcome.

Juke34 commented 4 years ago

Cath-tools recipe is fixed now.

Juke34 commented 4 years ago

rpsbproc is in blast version 2.9 now. We can go ahead with the InterProScan recipe now.

jolespin commented 4 years ago

Is there an interproscan recipe? I can't find it.

Juke34 commented 4 years ago

Not yet, it is ongoing. I'm waiting for a fix in a new release of interproscan to finally finish a working recipe. This recipe has been a long journey...

Juke34 commented 4 years ago

see here https://github.com/bioconda/bioconda-recipes/pull/22802 and here https://github.com/ebi-pf-team/interproscan/issues/155

Sofie8 commented 3 years ago

Hi @Juke34 is there an update on the Interproscan Conda install? Thanks!

Juke34 commented 3 years ago

The interproscan team released a version fixing the issue ebi-pf-team/interproscan#155, but as you can see in #22802 there are still problems remaining that I didn't have time to fix. I have changed jobs since and cannot work on it anymore. After having spent so much time on this recipe, I'm sad not to have succeeded in finishing it. The main problem now is that the two different versions of blast interfere with each other; we must modify the blast/2.2.19 recipe.

Sofie8 commented 3 years ago

Ok, thanks for your quick reply! Maybe @LeeBergstrand or @jmtsuji can help or would find it interesting to finish this. I would find it so much easier if it could be installed via conda! Thanks for your work on this and good luck with your new job.

abretaud commented 3 years ago

Maybe I can try to help finish the job, it would be so sad to stop so close to success :) Do you have a rough idea of what would be needed on the blast/2.2.19 recipe?

Juke34 commented 3 years ago

We need two versions of blast, but the executables in bin overwrite each other, so we should change the blast/2.2.19 recipe to rename the executables, e.g. by adding the version as a prefix or suffix. But to stay compatible with previous uses of blast/2.2.19, we should install each executable twice, e.g. blast and blast_2.2.19, so that all tools using this version will still work when calling blast. When both versions are installed, the latest version will overwrite blast with the newest version and blast_2.2.19 will remain. But that means the order in which we define them in the recipe will matter. We need to ask the conda team what they think about this solution, @bgruening.
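
For illustration, here is a minimal sketch of the naming scheme described above, simulated in a scratch directory rather than in a real conda environment. The executable name `blast`, the file contents, and the helper names are placeholders invented for the example; it only shows why install order matters, not how conda itself resolves file clashes.

```python
import os
import tempfile


def install_binary(bin_dir, name, content, version=None):
    """Write `name` into bin_dir; if `version` is given, also write name_version.

    Returns the list of file names written.
    """
    os.makedirs(bin_dir, exist_ok=True)
    targets = [name] if version is None else [f"{name}_{version}", name]
    for target in targets:
        path = os.path.join(bin_dir, target)
        with open(path, "w") as handle:
            handle.write(content)
        os.chmod(path, 0o755)
    return targets


def simulate_install_order():
    """Install the legacy package first, then the newer one, in a scratch dir."""
    with tempfile.TemporaryDirectory() as bin_dir:
        # Legacy recipe: ships both the plain and the version-suffixed name.
        install_binary(bin_dir, "blast", "legacy 2.2.19\n", version="2.2.19")
        # Newer recipe: ships only the plain name and overwrites it.
        install_binary(bin_dir, "blast", "newest\n")
        with open(os.path.join(bin_dir, "blast")) as handle:
            plain_points_to = handle.read().strip()
        return sorted(os.listdir(bin_dir)), plain_points_to
```

With this order the plain name ends up pointing at the newest version while the suffixed copy survives. Reversing the two `install_binary` calls would leave the legacy content under the plain name, which is the ordering concern raised above.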

gsn7 commented 3 years ago

All legacy blast versions, including blast_2.2.19, are no longer required, as the applications requiring them (e.g. prodom) have been removed from or updated in InterProScan. The only blast dependency is the newest version.

Juke34 commented 3 years ago

Interesting, so we could replace binary.blast.2.2.19.path=${bin.directory}/blast/2.2.19 by binary.blast.2.2.19.path=blast then in the config?

gsn7 commented 3 years ago

> Interesting, so we could replace binary.blast.2.2.19.path=${bin.directory}/blast/2.2.19 by binary.blast.2.2.19.path=blast then in the config?

Yep, we will remove the property `binary.blast.2.2.19.path` in the next version anyway.
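
A recipe build script could make this kind of substitution programmatically. The sketch below assumes the properties file uses plain `key=value` lines; `set_property` is a hypothetical helper written for this example, not something shipped with InterProScan or bioconda.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`, appending it if absent."""
    lines = text.splitlines()
    found = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without an assignment untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[index] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

For example, `set_property(text, "binary.blast.2.2.19.path", "blast")` rewrites the existing line in place and leaves every other line as it was.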

Juke34 commented 3 years ago

Hi @gsn7, the latest version of InterProScan fails prematurely compared to the previous version I was using:

08:22:16 BIOCONDA INFO (OUT) [INFO] -------------------------------------------------------------
08:22:16 BIOCONDA INFO (OUT) [ERROR] COMPILATION ERROR :
08:22:16 BIOCONDA INFO (OUT) [INFO] -------------------------------------------------------------
08:22:16 BIOCONDA INFO (OUT) [ERROR] $SRC_DIR/core/io/src/main/java/uk/ac/ebi/interpro/scan/io/match/prosite/PrositePfsearchMatchParser.java:[61,17] cannot find symbol
08:22:16 BIOCONDA INFO (OUT)   symbol:   method strip()
08:22:16 BIOCONDA INFO (OUT)   location: variable line of type java.lang.String
08:22:16 BIOCONDA INFO (OUT) [INFO] 1 error
08:22:16 BIOCONDA INFO (OUT) [INFO] -------------------------------------------------------------
08:22:16 BIOCONDA INFO (OUT) [INFO] ------------------------------------------------------------------------
08:22:16 BIOCONDA INFO (OUT) [INFO] Reactor Summary for InterProScan 5.48-83.0:
08:22:16 BIOCONDA INFO (OUT) [INFO]
08:22:16 BIOCONDA INFO (OUT) [INFO] InterProScan ....................................... SUCCESS [  2.098 s]
08:22:16 BIOCONDA INFO (OUT) [INFO] util ............................................... SUCCESS [03:13 min]
08:22:16 BIOCONDA INFO (OUT) [INFO] generic-jpa-dao .................................... SUCCESS [01:15 min]
08:22:16 BIOCONDA INFO (OUT) [INFO] InterProScan Domain Model .......................... SUCCESS [01:42 min]
08:22:16 BIOCONDA INFO (OUT) [INFO] InterProScan IO .................................... FAILURE [  2.405 s]

gsn7 commented 3 years ago

@Juke34 I have updated the release to tag the correct release version. The compilation should work now.

Juke34 commented 3 years ago

Is there any reason why there are 3 versions, 5.48, 5.48-83.0_01 and 5.48-83.0, all from the same commit 7c1afd9 from December? I'm running a test now, but I think it was pointing to the same version when I had the method strip() error. EDIT: The links to the first two do not work, and the MD5 for the last one is wrong.

gsn7 commented 3 years ago

The release version is 5.48-83.0. The others are a result of some tag mismanagement. The MD5, size, etc. are for ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/5.48-83.0/interproscan-5.48-83.0-64-bit.tar.gz; we will have to add a comment to clarify that.
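
To confirm which archive a published checksum belongs to, the digest can be recomputed locally. This is a generic sketch using Python's standard `hashlib`. The function names are made up for the example, and the expected hash has to be supplied by the reader, since none is quoted in this thread.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published):
    """Compare a file against a published checksum line.

    Checksum files often look like "<hash>  <filename>"; only the first
    whitespace-separated field is used, case-insensitively.
    """
    expected = published.split()[0].lower()
    return md5_of_file(path) == expected
```

Running this against both the source archive and the 64-bit binary tarball would show which of the two the published MD5 actually describes.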

Juke34 commented 3 years ago

Thank you for the clarification. I got the following error:

14:26:23 BIOCONDA INFO (OUT)   File "/opt/conda/conda-bld/interproscan_1610978612258/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/share/InterProScan/initial_setup.py", line 57, in <module>
14:26:23 BIOCONDA INFO (OUT)     hmmer33_dir = ipr_properties['binary.hmmer33.path']

while binary.hmmer33.path is no longer in the default interproscan.properties file. Is this a problem to correct in your release? I will add it to my interproscan.properties for now.

gsn7 commented 3 years ago

I don't see why you should get that error. binary.hmmer33.path is in the interproscan.properties file:

$ grep 'binary.hmmer33.path' interproscan.properties
binary.hmmer33.path=${bin.directory}/hmmer/hmmer3/3.3

From previous communication, you only have one version of hmmer3, while the release has two for now. Having only one version is okay, and you can point binary.hmmer33.path to the version you have.

Juke34 commented 3 years ago

Yesterday I switched from 5.46 to 5.48 and realised that many things have changed in the properties file, e.g. the path to esl-translate is new, the hmmer33 parameters disappeared, etc.

I got the properties file from the 5.48-83 I downloaded yesterday. But maybe I was not looking at the proper location. Where is the one I should use as a template located?

gsn7 commented 3 years ago

The properties file to use is core/jms-implementation/support-mini-x86-32/interproscan.properties, and if there are properties missing in that file we will add them.
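
When comparing a template against the keys a script expects, a small checker can help. The sketch below is an assumption-laden stand-in: it handles only simple `key=value` lines and `${other.key}` placeholders, not the full Java properties syntax, and all three function names are invented for the example (they are not the `load_properties` used by initial_setup.py).

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_keys(props, required):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not props.get(key)]


def expand(props, value):
    """Replace ${other.key} with that key's value; unknown keys are left as-is.

    No cycle detection: a self-referencing property would recurse forever.
    """
    def substitute(match):
        name = match.group(1)
        return expand(props, props[name]) if name in props else match.group(0)
    return PLACEHOLDER.sub(substitute, value)
```

Calling `missing_keys(props, ["binary.hmmer33.path", "binary.esltranslate.path"])` on a parsed template lists the keys that still need a value before the file goes into the recipe.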

Juke34 commented 3 years ago

I think this one is missing: binary.esltranslate.path= (it was in the previous interproscan.properties file I was using). After copying and filling in core/jms-implementation/support-mini-x86-32/interproscan.properties to put it in the conda recipe, I got this error:

Error output from binary:
22:06:11 BIOCONDA INFO (OUT) bin/nucleotide/esl-translate: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by bin/nucleotide/esl-translate

It should use the one compiled by conda and shipped with hmmer. So I added: binary.esltranslate.path=esl-translate. I hope this line will be taken into account by IPRscan.
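
Whether InterProScan resolves a bare command name such as `esl-translate` through the search path is not confirmed in this thread. The sketch below only shows how the two forms of value would be resolved by a shell-like lookup, using Python's standard `shutil.which`. `locate_binary` and its parameters are hypothetical.

```python
import os
import shutil


def locate_binary(prop_value, base_dir=".", search_path=None):
    """Return the executable a property value would point to, or None.

    A value containing a path separator is treated as a path (relative ones
    are joined to base_dir); a bare name is looked up on the search path.
    """
    if os.sep in prop_value:
        candidate = os.path.join(base_dir, prop_value)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
        return None
    # search_path=None makes shutil.which fall back to the PATH variable.
    return shutil.which(prop_value, path=search_path)
```

Inside an activated conda environment, `locate_binary("esl-translate")` would be expected to return the copy installed with hmmer, while `locate_binary("bin/nucleotide/esl-translate", base_dir=install_dir)` checks the bundled binary that triggered the GLIBC error.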

lecorguille commented 3 years ago

Hi, still interested! Any progress since then? :/

Juke34 commented 3 years ago

Hi! All related information:

The remaining problems now are:

The last one I was trying to solve is supposed to be simple: linking the bin in the properties file. I tried:

Looking for a way to set the path in interproscan.properties properly.

lecorguille commented 3 years ago

Hi ^^'

Naively, I would suggest patching initial_setup.py using source -> patches

initial_setup.py.patch

--- initial_setup.py    2021-04-12 15:46:14.000000000 +0200
+++ initial_setup.py    2021-04-19 09:16:02.859214270 +0200
@@ -53,10 +53,8 @@

 if __name__ == "__main__":
     ipr_properties = load_properties('interproscan.properties')
-    hmmer3_dir = ipr_properties['binary.hmmer3.path']
-    hmmer33_dir = ipr_properties['binary.hmmer33.path']
-    hmmpress_path = hmmer3_dir + '/hmmpress'
-    hmmpress33_path = hmmer33_dir + '/hmmpress'
+    hmmpress_path = 'hmmpress'
+    hmmpress33_path = 'hmmpress'
     hmm_models_paths = get_hmm_models_props(ipr_properties)
     if (len(hmm_models_paths) > 0):
         print("Checking any hmm models that need indexing ... this may take a few minutes")

source:
  patches:
    - initial_setup.py.patch

Is it an issue to not distinguish hmmer 3.1 and 3.3 as it is originally?

Juke34 commented 3 years ago

> Is it an issue to not distinguish hmmer 3.1 and 3.3 as it is originally?

Nope, there is no longer any difference; they will keep only one parameter in the future to simplify. You are right, the easiest for now is to patch initial_setup.py. But I would do it slightly differently in case they use hmmer33_dir elsewhere. I will give it a try. We can ask IPR later to integrate the modification.
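
One way to keep the two directory variables while still tolerating a missing property is a fallback lookup. This is only a sketch of that idea, not the patch that was eventually applied. It assumes `ipr_properties` behaves like a dict with a `.get` method, and `hmmpress_path` is a made-up helper name.

```python
def hmmpress_path(ipr_properties, key):
    """Build the hmmpress command for the hmmer directory stored under `key`.

    Falls back to the bare command name, to be resolved via PATH,
    when the property is missing or empty.
    """
    hmmer_dir = ipr_properties.get(key, "")
    if not hmmer_dir:
        return "hmmpress"
    return hmmer_dir.rstrip("/") + "/hmmpress"
```

With this, `hmmpress_path(ipr_properties, 'binary.hmmer33.path')` returns plain `hmmpress` in a conda build where the property is absent, and the original directory-based path otherwise.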

gsn7 commented 3 years ago

There are two databases (superfamily, sfld) that require hmmer-3.1 for indexing, as their models include duplicate model names; 3.3 is sensitive to that and will not index them, while hmmer-3.1 works fine.

fingerPRINTScan is a problem case, as it's old C++ code that requires refactoring to compile on newer compilers.
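
To see whether a given model file is affected by the duplicate-name issue described above, the `NAME` lines of a HMMER3 text-format file can be counted. This is a rough sketch that scans text only and does not validate the file; `duplicate_model_names` is a name chosen for this example.

```python
from collections import Counter


def duplicate_model_names(hmm_text):
    """Return the sorted model names that occur more than once in HMM text."""
    names = []
    for line in hmm_text.splitlines():
        fields = line.split(None, 1)
        # Each model in a HMMER3 text file carries one "NAME  <name>" header line.
        if len(fields) == 2 and fields[0] == "NAME":
            names.append(fields[1].strip())
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

An empty result suggests the file has no repeated names. Any names returned are candidates for the indexing failure reported with hmmer 3.3.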

Juke34 commented 3 years ago

The hmmer path problem (for hmmpress) is fixed now. Only fingerPRINTScan needs to be fixed to have a working version. For fingerPRINTScan, I made the recipe last year with a patch, thinking it was working fine. See here: https://github.com/bioconda/bioconda-recipes/tree/master/recipes/fingerprintscan @gsn7 could you have a look to see how we can fix it?

To be fully operational (even with fingerPRINTScan fixed), @gsn7 confirmed that we need hmmer-3. This raises a problem because currently it is not possible to get hmmer-3.1 and hmmer-3.3 in parallel via conda. We need to create a new recipe that installs the hmmer-3.1 executables in a dedicated folder.

LeeBergstrand commented 3 years ago

@SilasK Is this something that's within your capabilities to fix? I would be interested in getting IPR5 installable through Conda so I can add it to ATLAS as a genome annotation option and have this stream down to Micromeda. I don't know that much about creating recipes.

Juke34 commented 3 years ago

The first recipe is merged now, see #22802.