Open EdIzaguirre opened 3 days ago
@EdIzaguirre - @conda and @pypi provide a clean virtual environment. The default tensorflow conda package is not GPU-compatible; have you tried the tensorflow-gpu package instead?
When I try using the tensorflow-gpu package instead of tensorflow in the @conda decorator, I get:
Micromamba ran into an error while setting up environment:
command '/Users/ed/.metaflowconfig/micromamba/bin/micromamba create --yes --quiet --dry-run --no-extra-safety-checks --repodata-ttl=86400 --retry-clean-cache --prefix=/var/folders/y6/vkmmb9lj41q4_0dq4q_h0xyh0000gn/T/tmprobsr8sm/prefix --channel=conda-forge --channel=Microsoft --channel=defaults requests==>=2.21.0 boto3==>=1.14.0 tensorflow-gpu==2.6.0 python==3.12' returned error (1)
nothing provides __glibc >=2.17 needed by tensorflow-base-2.6.0-cuda110py37hb8f09f9_2
Trying to include the rmg::glibc==2.19 package from conda doesn't fix this. Notably, when I use the @pypi decorator with the tensorflow library, I do see a GPU. However, one of my packages doesn't work with @pypi, so I would like to use the @conda decorator.
You may be able to try setting CONDA_OVERRIDE_GLIBC=2.17 as an environment variable. You can also try the bleeding-edge decorators, which allow you to combine conda and pypi (see here: https://docs.metaflow.org/scaling/dependencies/libraries#bleeding-edge-versions-of-the-decorators) and also handle the glibc requirement a little differently. +1 to @savingoyal's point about the package name and the "clean slate". Package names usually match between pypi and conda, but that is not always the case; conda still distinguishes the GPU version (it basically adds additional dependencies). Feel free to come on Slack too for a more interactive conversation. A similar question was asked there in the last two weeks, iirc.
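As a rough sketch of the environment variable suggestion, the snippet below launches a flow with CONDA_OVERRIDE_GLIBC set in the process that resolves the conda environment. The helper names env_with_glibc_override and run_flow, and the file name gpu_flow.py in the usage note, are hypothetical; whether the override is enough for tensorflow-gpu to resolve here is untested.

```python
import os
import subprocess
import sys


def env_with_glibc_override(version="2.17", base_env=None):
    """Return a copy of the environment with CONDA_OVERRIDE_GLIBC set.

    The override tells the solver to assume a __glibc virtual package of the
    given version, which matters when resolving Linux packages on macOS.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CONDA_OVERRIDE_GLIBC"] = version
    return env


def run_flow(flow_file, extra_args=("--environment=conda", "run")):
    """Run a Metaflow flow file with the glibc override applied.

    flow_file is a placeholder path to your own flow; the return value is
    the exit code of the child process.
    """
    cmd = [sys.executable, flow_file, *extra_args]
    completed = subprocess.run(cmd, env=env_with_glibc_override(), check=False)
    return completed.returncode
```

Calling run_flow("gpu_flow.py") is then roughly equivalent to running CONDA_OVERRIDE_GLIBC=2.17 python gpu_flow.py --environment=conda run from a shell.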
Hello,
As the title mentions, I am trying to run a job on AWS Batch that uses a GPU. When I omit the @conda/@pypi decorator, I can see that a GPU is allocated. However, when I add an @conda/@pypi decorator, I am unable to get a GPU to appear. Why is this? For reference, I am essentially using this CloudFormation template.
Here is some simple code to demonstrate the issue. Again, this occurs whether I use @pypi or @conda. I am using Metaflow version 2.12.5. Note that if I omit the tensorflow library in the @conda decorator, I get a ModuleNotFoundError: No module named 'tensorflow', so the @conda/@pypi decorators seem to wipe the TensorFlow library from the compute instance. For reference, this is the output: