galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Conda resolver needs to know about all dependencies for a job #3299

Closed mvdbeek closed 7 years ago

mvdbeek commented 7 years ago

The Conda resolver needs to know about all of a job's dependencies in order to make decisions about which packages are compatible. We're currently passing the dependencies in one by one, but packages that have alternative builds (such as py27 or py35) may need to be selected based on the other packages in the job. Not doing so will (infrequently) result in errors like this:

galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:40,928 Executing command: /home/mvandenb/miniconda3/bin/conda list --name __lumpy-sv@0.2.12 --export > /tmp/tmpO2oyhZ/tmp/jobdepsJZJK7V34abab649380d2c7cf2eee26205b378411814559b85da38efdd32a84bb2384a7/__lumpy-sv@0.2.12
requests.packages.urllib3.connectionpool INFO 2016-12-10 17:53:41,243 Starting new HTTP connection (1): localhost
galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:41,303 Executing command: /home/mvandenb/miniconda3/bin/conda create -y --unknown --offline --prefix /tmp/tmpO2oyhZ/job_working_directory/000/2/conda-env --file /tmp/tmpO2oyhZ/tmp/jobdepsJZJK7V34abab649380d2c7cf2eee26205b378411814559b85da38efdd32a84bb2384a7/__lumpy-sv@0.2.12 > /dev/null
requests.packages.urllib3.connectionpool DEBUG 2016-12-10 17:53:41,328 "GET /api/jobs/5729865256bc2525?key=cf3d7767f81e873f72defa3df4e796aa HTTP/1.1" 200 None
...........[                    ]|                                              [      COMPLETE      ]|###################################################| 100%
Using Anaconda API: https://api.anaconda.org
No psutil available.
To proceed, please conda install psutil
galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:41,979 Executing command: /home/mvandenb/miniconda3/bin/conda list --name __samtools@1.3.1 --export > /tmp/tmpO2oyhZ/tmp/jobdepsypNZDdf795a38a38496af1ff491d0cf89e7e5d9fe737e5c36da3dd56c71ad608bb9b15/__samtools@1.3.1
galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:42,352 Executing command: /home/mvandenb/miniconda3/bin/conda install -y --unknown --offline --prefix /tmp/tmpO2oyhZ/job_working_directory/000/2/conda-env --file /tmp/tmpO2oyhZ/tmp/jobdepsypNZDdf795a38a38496af1ff491d0cf89e7e5d9fe737e5c36da3dd56c71ad608bb9b15/__samtools@1.3.1 > /dev/null
requests.packages.urllib3.connectionpool INFO 2016-12-10 17:53:42,780 Starting new HTTP connection (1): localhost
...........[                    ]|                                              [      COMPLETE      ]|###################################################| 100%
requests.packages.urllib3.connectionpool DEBUG 2016-12-10 17:53:42,858 "GET /api/jobs/5729865256bc2525?key=cf3d7767f81e873f72defa3df4e796aa HTTP/1.1" 200 None
[      COMPLETE      ]|###################################################| 100%
Using Anaconda API: https://api.anaconda.org
galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:43,047 Executing command: /home/mvandenb/miniconda3/bin/conda list --name __numpy@1.11.2 --export > /tmp/tmpO2oyhZ/tmp/jobdepsdJ_g6t5f45538d2973e67c51bac84de0567db7e2f937401a29fe3980d2c57d4dc369ff/__numpy@1.11.2
galaxy.tools.deps.conda_util DEBUG 2016-12-10 17:53:43,224 Executing command: /home/mvandenb/miniconda3/bin/conda install -y --unknown --offline --prefix /tmp/tmpO2oyhZ/job_working_directory/000/2/conda-env --file /tmp/tmpO2oyhZ/tmp/jobdepsdJ_g6t5f45538d2973e67c51bac84de0567db7e2f937401a29fe3980d2c57d4dc369ff/__numpy@1.11.2 > /dev/null
..Using Anaconda API: https://api.anaconda.org
...The following specifications were found to be in conflict:
  - lumpy-sv -> python 2.7*
  - mkl 11.3.3 0
Use "conda info <package>" to see the dependencies for each package.

This is for a Python 2.7 dependency when using miniconda3. Installing all dependencies at once (conda create -n test_lumpy lumpy-sv numpy) works fine.
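For illustration, the failing flow roughly amounts to installing the packages into the same prefix one at a time, while the working flow hands the full specification to a single solve. A minimal sketch of the contrast (not Galaxy code; conda is assumed to be on PATH, package names and versions are taken from the log above):

import subprocess

env_prefix = "/tmp/conda-env-demo"

# One-by-one: each solve runs without knowledge of the other packages, so conda
# may pick e.g. a py35 build of a package that conflicts with lumpy-sv (py27).
subprocess.check_call(["conda", "create", "-y", "--prefix", env_prefix, "lumpy-sv=0.2.12"])
subprocess.check_call(["conda", "install", "-y", "--prefix", env_prefix, "samtools=1.3.1"])
subprocess.check_call(["conda", "install", "-y", "--prefix", env_prefix, "numpy=1.11.2"])

# All at once: the solver sees every requirement and can select compatible builds.
subprocess.check_call(["conda", "create", "-y", "--name", "test_lumpy", "lumpy-sv", "samtools", "numpy"])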

mvdbeek commented 7 years ago

From Gitter:

Nicola Soranzo @nsoranzo Dec. 08 22:28
@mvdbeek I just noticed that when a tool has multiple requirements, the same Conda env is activated multiple times in tool_script.sh (both with and without cached envs). Should that be fixed?
Marius van den Beek @mvdbeek 17:41
@nsoranzo absolutely, I think this is also (kind-of) linked to galaxyproject/galaxy#3299
I think we should aim to build the environment in one go, then we'd only need to activate it once

Nicola Soranzo @nsoranzo 17:48
Indeed linked to your issue. What about the case where some requirements of a tool are Conda and others are resolved by a TS package or other resolvers?

Marius van den Beek @mvdbeek 17:58
Yeah, this is a bit tricky, I'm not sure how to handle this. Perhaps we first resolve the dependencies (so that we know what is conda and what is not), and then group the conda dependencies and activate that group?

Nicola Soranzo @nsoranzo 18:55
Could work. Also, the cache is now implemented at the DependencyManager level and it works because Conda is the only resolver that does the caching, but will it still work if other resolvers do (e.g. brew)?

Nicola Soranzo @nsoranzo 18:55
Or should it be moved to the DependencyResolver level?
...
Nicola Soranzo @nsoranzo 19:10
DependencyManager already sees multiple deps at the same time (in the requirements_to_dependencies() and dependency_shell_commands() methods)

Marius van den Beek @mvdbeek 19:10
@nsoranzo
Could work. Also, the cache is now implemented at the DependencyManager level and it works because Conda is the only resolver that does the caching, but will it still work if other resolvers do (e.g. brew)?
If the brew resolver implements a build_cache method and sets cacheable to True I don't see why it wouldn't work

Nicola Soranzo @nsoranzo 19:11
@mvdbeek I mean if a tool has mixed Conda and brew deps
Crazy, I know 

Marius van den Beek @mvdbeek 19:12
That depends on whether the dependencies can be cached atomically, or whether you would need to group them together
but yes, I think working with the DependencyManager is probably the way to go. I think avoiding activating the conda env multiple times would be a one- or two-line change in the dependency_shell_commands function

Nicola Soranzo @nsoranzo 19:20
To clarify, I was thinking that sharing the same hashed_dependencies_dir between Conda and another resolver may be a problem

Marius van den Beek @mvdbeek 19:22
I didn't think about that ... yes, if it is possible at all, it's probably not a good idea to share it.

Nicola Soranzo @nsoranzo 19:26
So we need to modify the CachedDependencyManager to deal with sets of requirements grouped by resolver, each having a separate hashed_dependencies_dir, correct?

Marius van den Beek @mvdbeek 19:27
that's what I am thinking too, yes

That said, I think we have to make sure that the correct combination of packages is selected at install time as well, not sure if we can use DependencyManager for this ...
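To make the grouping idea above a bit more concrete, a rough sketch (hypothetical helper names, not the actual DependencyManager/CachedDependencyManager API): group the requirements by the resolver that can handle them, and derive a separate hashed_dependencies_dir per group.

import hashlib

def group_requirements_by_resolver(requirements, resolvers):
    # `can_resolve` is an assumed predicate for this sketch; the real resolvers
    # expose a resolve() method instead.
    groups = {}
    for requirement in requirements:
        for resolver in resolvers:
            if resolver.can_resolve(requirement):
                groups.setdefault(resolver.name, []).append(requirement)
                break
    return groups

def hashed_dependencies_dir_name(resolver_name, group):
    # One cache directory per (resolver, requirement set) combination.
    key = resolver_name + ":" + ",".join(sorted(str(r) for r in group))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]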

mvdbeek commented 7 years ago

After some thinking I'm afraid this is not an easy job, given that for the most part the resolver system is designed around series of single packages with no dependencies between them.

Maybe the solution is to aggressively use metapackages for this. For conda we can certainly build an additional layer to deal with dependencies between packages, but if we think about containerized dependencies -- they have to be metapackages anyway, right? In that case we can save ourselves the trouble of implementing Galaxy-specific code, and instead ask tool authors to create metapackages if their tool requires a combination of dependencies. On top of that we wouldn't need to build (cached or not) job-specific environments anymore; it would suffice to activate the (meta)package environment.

Do you have any thoughts on this @bgruening @jmchilton @nsoranzo ?

bgruening commented 7 years ago

@mvdbeek a few months ago I had a long discussion with @jmchilton about this. @jmchilton do you remember where we had it?

What I can recall is that I also proposed single dependencies for Conda and solving this by using metadata packages, simply because this is easier to map to Docker/Singularity containers.

After some disagreement about how many metapackages are really needed, we agreed to implement some on-the-fly container generation. So what is possible now is that Galaxy takes a list of multiple Conda dependencies, creates a container on the fly (whose name is a normalized hash of the requirements), and uses this. Such a container could also be created by the TS and pushed as a BioContainer at some point.
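As a side note on the "normalized hash of the requirements": the mulled-v1-<hash> environment names that show up later in this thread are built that way. A rough sketch of the idea (illustrative only, not the exact mulled hashing scheme):

import hashlib

def mulled_style_name(targets):
    # targets like ["r-snow=0.4_1", "bioconductor-xcms=1.46.0", "r-batch=1.1_4"]
    normalized = ",".join(sorted(t.lower() for t in targets))
    return "mulled-v1-" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()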

What I also wanted to point out is that so far we do not have any concept for how to really pin versions, à la a requirements.txt file. Only such a file would be really reproducible, so we should at least capture this information, or these metadata packages should more or less consist of such a strictly pinned requirements.txt file.

mvdbeek commented 7 years ago

I kind of like the requirements.txt idea, given that we currently can have conda packages that are broken for the latest build but work for older builds (bioperl comes to mind right now). So depending on when you installed a tool (or resolved its dependencies), it might or might not work. For fully reproducible environments this would be better. I hope that for the most part this can be avoided by more testing on the conda/bioconda side, but supporting an optional requirements.txt would be neat. You'd still specify the dependencies like you do today, but if the resolver is conda, the requirements.txt file would take precedence. Is that something we could agree on?
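A rough sketch of how that precedence could work (hypothetical file location and helper, not an existing Galaxy feature): if a strictly pinned requirements.txt ships alongside the tool, feed it to the Conda resolver; otherwise fall back to the usual <requirement> tags.

import os

def conda_specs_for_tool(tool_dir, requirements):
    pinned = os.path.join(tool_dir, "requirements.txt")  # assumed location
    if os.path.exists(pinned):
        # Strictly pinned specs, e.g. "lumpy-sv=0.2.12=py27_0", one per line.
        with open(pinned) as fh:
            return [line.strip() for line in fh if line.strip() and not line.startswith("#")]
    # Fall back to name/version pairs from the tool's requirements, e.g. "lumpy-sv=0.2.12".
    return ["%s=%s" % (r.name, r.version) if r.version else r.name for r in requirements]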

jmchilton commented 7 years ago

So right now the dependency resolvers work this way:

matches = []
for each dependency:
    for each resolver:
        if it matches:
            matches.append(match)

I guess we want to switch it up to be:

remaining_dependencies = set(dependencies)
matches = []
for each resolver:
    for each remaining_dependency:
        if it matches:
            matches.append(match)
            remaining_dependencies.remove(remaining_dependency)

And then we can add a find_all_deps optional method to the dependency resolvers to give them the opportunity to match everything at once if they can:

remaining_dependencies = set(dependencies)
matches = []
for each resolver:
    if hasattr(resolver, 'find_all_deps'):
        if resolver.find_all_deps(remaining_dependencies):
            # conda does magic to create an environment for all of them?
            matches.extend(remaining_dependencies)
            remaining_dependencies = set()
    for each remaining_dependency:
        if it matches:
            matches.append(match)
            remaining_dependencies.remove(remaining_dependency)

Will this work and will this work with cached dependencies? I guess this is the first step - after we have the abstraction in place we can augment it with the requirements pinning - I'm less excited about that but it wouldn't hurt anything I guess.

mvdbeek commented 7 years ago

Will this work and will this work with cached dependencies?

I think it could work, yes. What do we do if resolver 1 has 1 out of 3 dependencies and resolver 2 (cacheable) has 2 out of 3? I suppose the cached environment should be hashed on those 2 dependencies that could be resolved ... then we'd have to figure out whether a partial combination of hashes exists. Or we hash on all of the requirements, with the downside that we may produce more environments than necessary.

And we'd also have to think about the installation. I think that doing conda create -n <hash> valid_conda_package1 valid_conda_package2 toolshed_package would not work, so we first need to ask what part of the requirements can be installed by conda, and then use that subset.
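To sketch that "ask conda first" step (hypothetical helper; the real Conda resolver works with CondaTarget objects, as seen in the log excerpts later in this thread): split the requirements into the subset conda can install and the leftover for other resolvers, then pass only the conda subset to a single conda create call.

def split_conda_installable(requirements, package_exists):
    # `package_exists` is an assumed callable that checks the configured channels,
    # e.g. by wrapping `conda search`.
    conda_specs, leftover = [], []
    for r in requirements:
        spec = "%s=%s" % (r.name, r.version) if r.version else r.name
        if package_exists(spec):
            conda_specs.append(spec)
        else:
            leftover.append(r)
    return conda_specs, leftover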

jmchilton commented 7 years ago

I was imagining the conda resolver would just fall back if it couldn't do everything all at once. That is why the other for loop isn't in an else. I guess what I'm saying is that I think it is fine to just solve this problem for the subset of cases where all the requirements are in conda and not matched anywhere else (for instance the tool shed). This subset will be "most" tools, and for this subset I feel like we are respecting the dependency resolver conf while still allowing resolution across multiple dependencies at once. I think for this subset we also don't have to worry about combinations of subpackages and such. Combining tool shed and conda resolution is a recipe for disaster. Maybe to make this more explicit we could adjust the above pseudocode as:

remaining_dependencies = set(dependencies)
matches = []
for each resolver:
    if len(matches) == 0 and hasattr(resolver, 'find_all_deps'):
        if resolver.find_all_deps(remaining_dependencies):
            log.info("Ideal resolution - dependency resolver [%s] can resolve all dependencies." % resolver)
            # conda does magic to create an environment for all of them!
            matches.extend(remaining_dependencies)
            remaining_dependencies = set()
    # Otherwise, just resolve what you can from the remaining dependencies.
    for each remaining_dependency:
        if it matches:
            matches.append(match)
            remaining_dependencies.remove(remaining_dependency)

mvdbeek commented 7 years ago

it is fine to just solve this problem for the subset of cases where all the requirements are in conda

Fair enough, I think that would be good progress indeed. We still have a lot of tools around with dependencies specifying both the conda and the toolshed name, but if we get into this special sort of conflict we can just adjust or remove the toolshed dependency.

nsoranzo commented 7 years ago

Going from pseudocode to real code, we are speaking about DependencyManager.dependency_shell_commands(), right? matches is a bit obscure to me.

mvdbeek commented 7 years ago

I think that would be too late in DependencyManager.dependency_shell_commands(); we'll already need this at install time. I think this should be somewhere in the resolve() function (or we should better separate the resolve and install functionality).

mvdbeek commented 7 years ago

This has been addressed with #3391

lecorguille commented 7 years ago

Hi, sorry to re-open this thread, but I wanted to test the new implementation:

planemo test --conda_auto_init --conda_auto_install --conda_dependency_resolution --conda_prefix /tmp/mc --galaxy_branch dev ipo4xcmsSet.xml

Here are my requirements

And what I get: planemo.out

galaxy.tools.deps.conda_util DEBUG 2017-01-12 16:14:32,966 Executing command: /tmp/mc/bin/conda create -y --name mulled-v1-c88aa693b9ce3103fc0de021e39be27b2692565f765dbe5945a8d6aa8479c42d r-snow=0.4_1 r-ipo=1.7.5 r-batch=1.1_4
requests.packages.urllib3.connectionpool DEBUG 2017-01-12 16:14:33,049 "GET /api/jobs/5729865256bc2525?key=64f8023f372a8326d43b694c36e5ec1e HTTP/1.1" 200 None

PaddingError: Placeholder of length '80' too short in package r::r-base-3.3.1-3.
The package must be rebuilt with conda-build > 2.0.
  - bzip2 1.0.6 3
Use "conda info <package>" to see the dependencies for each package.

galaxy.tools.deps.conda_util DEBUG 2017-01-12 16:17:48,723 Executing command: /tmp/mc/bin/conda clean --tarballs -y
galaxy.jobs.runners ERROR 2017-01-12 16:17:49,187 (2) Failure preparing job
Traceback (most recent call last):
  File "/tmp/tmpVCld0W/galaxy-dev/lib/galaxy/jobs/runners/__init__.py", line 170, in prepare_job
    job_wrapper.prepare()
BLABLABLA
  File "/tmp/tmpVCld0W/galaxy-dev/lib/galaxy/tools/deps/resolvers/conda.py", line 354, in build_environment
raise DependencyException("Conda dependency seemingly installed but failed to build job environment.")

I think that the primary issue is this: The package must be rebuilt with conda-build > 2.0.

Help 🙏

nsoranzo commented 7 years ago

@lecorguille You probably need a more recent version of r-base; 3.3.1-3 was built 5 months ago. Bioconda is considering moving to 3.3.2 in https://github.com/bioconda/bioconda-recipes/issues/3270 but this will take some time. @bgruening may give you better advice.

lecorguille commented 7 years ago

Hi! Me again

Today, it's update day: galaxy==release_17.01 and planemo==1.37.0

But it's also a failed day 😢

planemo test --conda_auto_init --conda_auto_install --conda_dependency_resolution --conda_prefix /tmp/mc2 --galaxy_branch release_17.0

galaxy.tools.deps.conda_util: DEBUG: Executing command: /tmp/mc2/bin/conda create -y --name mulled-v1-e2c583a076381b4eb996277f6bdae5d56046739797eecaa9b313d7b88923ab29 r-snow=0.4_1 bioconductor-xcms=1.46.0 r-batch=1.1_4
[...]
PaddingError: Placeholder of length '80' too short in package bioconda::bioconductor-mzr-2.6.3-r3.3.1_0.
The package must be rebuilt with conda-build > 2.0.
[...]
galaxy.tools.deps.resolvers.conda DEBUG 2017-01-25 20:41:21,312 Removing failed conda install of [CondaTarget[r-snow,version=0.4_1], CondaTarget[bioconductor-xcms,version=1.46.0], CondaTarget[r-batch,version=1.1_4]]
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:21,313 Executing command: /tmp/mc2/bin/conda env remove -y --name mulled-v1-e2c583a076381b4eb996277f6bdae5d56046739797eecaa9b313d7b88923ab29
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:22,712 Executing command: /tmp/mc2/bin/conda create -y --name __r-snow@0.4_1 r-snow=0.4_1
galaxy.tools.deps DEBUG 2017-01-25 20:41:30,250 Using dependency r-snow version 0.4_1 of type conda
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:30,250 Executing command: /tmp/mc2/bin/conda create -y --name __bioconductor-xcms@1.46.0 bioconductor-xcms=1.46.0
galaxy.tools.deps DEBUG 2017-01-25 20:41:37,887 Using dependency bioconductor-xcms version 1.46.0 of type conda
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:37,887 Executing command: /tmp/mc2/bin/conda create -y --name __r-batch@1.1_4 r-batch=1.1_4
galaxy.tools.deps DEBUG 2017-01-25 20:41:45,878 Using dependency r-batch version 1.1_4 of type conda
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:45,879 Executing command: /tmp/mc2/bin/conda create -y --name mulled-v1-e2c583a076381b4eb996277f6bdae5d56046739797eecaa9b313d7b88923ab29 r-snow=0.4_1 bioconductor-xcms=1.46.0 r-batch=1.1_4
[...]
PaddingError: Placeholder of length '80' too short in package bioconda::bioconductor-mzr-2.6.3-r3.3.1_0.
The package must be rebuilt with conda-build > 2.0.

galaxy.tools.deps.resolvers.conda DEBUG 2017-01-25 20:41:54,464 Removing failed conda install of [CondaTarget[r-snow,version=0.4_1], CondaTarget[bioconductor-xcms,version=1.46.0], CondaTarget[r-batch,version=1.1_4]]
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:54,464 Executing command: /tmp/mc2/bin/conda env remove -y --name mulled-v1-e2c583a076381b4eb996277f6bdae5d56046739797eecaa9b313d7b88923ab29
galaxy.tools.deps DEBUG 2017-01-25 20:41:55,850 Using dependency r-snow version 0.4_1 of type conda
galaxy.tools.deps DEBUG 2017-01-25 20:41:55,850 Using dependency bioconductor-xcms version 1.46.0 of type conda
galaxy.tools.deps DEBUG 2017-01-25 20:41:55,850 Using dependency r-batch version 1.1_4 of type conda
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:55,851 Executing command: /tmp/mc2/bin/conda list --name __r-snow@0.4_1 --export > /tmp/jobdepstV0cQJa817a1d312c412c739d4c357c9343718c2068f458808f083afdf1f251d82564c/__r-snow@0.4_1
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:56,167 Executing command: /tmp/mc2/bin/conda create -y --unknown --offline --prefix /tmp/tmpLyEIGg/deps/_cache/2c9cb70f --file /tmp/jobdepstV0cQJa817a1d312c412c739d4c357c9343718c2068f458808f083afdf1f251d82564c/__r-snow@0.4_1 > /dev/null
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:57,384 Executing command: /tmp/mc2/bin/conda clean --tarballs -y
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:57,679 Executing command: /tmp/mc2/bin/conda list --name __bioconductor-xcms@1.46.0 --export > /tmp/jobdepsvU7EUs89b20b2c5915d075d4a0c07dfb65cfb04a50060d7fadd95d12895e582b914f79/__bioconductor-xcms@1.46.0
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:58,001 Executing command: /tmp/mc2/bin/conda install -y --unknown --offline --prefix /tmp/tmpLyEIGg/deps/_cache/2c9cb70f --file /tmp/jobdepsvU7EUs89b20b2c5915d075d4a0c07dfb65cfb04a50060d7fadd95d12895e582b914f79/__bioconductor-xcms@1.46.0 > /dev/null
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:58,690 Executing command: /tmp/mc2/bin/conda clean --tarballs -y
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:58,956 Executing command: /tmp/mc2/bin/conda list --name __r-batch@1.1_4 --export > /tmp/jobdeps0msmgPe7325fdd48bc9d36b864284fa0ff5f09769060a0bf098ef4ab24c33996193d03/__r-batch@1.1_4
galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:59,280 Executing command: /tmp/mc2/bin/conda install -y --unknown --offline --prefix /tmp/tmpLyEIGg/deps/_cache/2c9cb70f --file /tmp/jobdeps0msmgPe7325fdd48bc9d36b864284fa0ff5f09769060a0bf098ef4ab24c33996193d03/__r-batch@1.1_4 > /dev/null
...

UnsatisfiableError: The following specifications were found to be in conflict:
  - bzip2 1.0.6 3
Use "conda info <package>" to see the dependencies for each package.

galaxy.tools.deps.conda_util DEBUG 2017-01-25 20:41:59,749 Executing command: /tmp/mc2/bin/conda clean --tarballs -y
galaxy.jobs.runners ERROR 2017-01-25 20:42:00,017 (3) Failure preparing job
Traceback (most recent call last):
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/jobs/runners/__init__.py", line 170, in prepare_job
    job_wrapper.prepare()
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/jobs/__init__.py", line 971, in prepare
    self.dependency_shell_commands = self.tool.build_dependency_shell_commands(job_directory=self.working_directory)
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/tools/__init__.py", line 1388, in build_dependency_shell_commands
    tool_instance=self
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/tools/deps/__init__.py", line 244, in dependency_shell_commands
    self.build_cache(requirements, **kwds)
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/tools/deps/__init__.py", line 229, in build_cache
    [dep.build_cache(hashed_dependencies_dir) for dep in cacheable_dependencies]
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/tools/deps/resolvers/conda.py", line 335, in build_cache
    self.build_environment()
  File "/tmp/tmpLyEIGg/galaxy-dev/lib/galaxy/tools/deps/resolvers/conda.py", line 354, in build_environment
    raise DependencyException("Conda dependency seemingly installed but failed to build job environment.")
DependencyException: Conda dependency seemingly installed but failed to build job environment.

@bgruening Do you confirm that my only choice is to build a Conda meta-package?

jmchilton commented 7 years ago

@lecorguille was /tmp/mc2 set up with miniconda2? You might want to try with miniconda3? We've discovered other seemingly unrelated problems solved by switching to miniconda3, I guess. This probably won't work - but I had really hoped we had solved this and that we had tried some older packages ☹️.

lecorguille commented 7 years ago

@jmchilton I let Planemo/Galaxy handle the Conda installation; I'm already using miniconda3. @bgruening proposed to rebuild mzr, and I will try to find the conflict. If you don't mind, I will keep you informed about my progress.

bgruening commented 7 years ago

It is rebuilt already. Can you check whether you get build 1 and whether one of these two errors goes away?

lecorguille commented 7 years ago

OK, I will check that, but maybe only tomorrow.

bgruening commented 7 years ago

@lecorguille just said that it is working now. We were able to fix it on the Conda side. Everything fine here :)