algorithmiaio / langpacks

Standardized builder and runners for Algorithmia algorithms
https://algorithmia.com/
MIT License

[ALGO-935] Create Tensorflow 2.4 Python 3.8 environment with updated base image [WIP] #188

Closed aslisabanci closed 3 years ago

aslisabanci commented 3 years ago

I couldn't test this on deep purple yet, and the errors I've been getting so far have been a bit frustrating. I'm opening this PR for your review in case you notice something missing. Any help testing it would also be appreciated and would speed things up.

aslisabanci commented 3 years ago

Validating the environment with this command:

./tools/environment_validator.py -b nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04 -g python3 -s python38 -d tensorflow-gpu-2.4 -t dependency -n tensorflow-gpu-2.4 --nvidia-support 1

fails with the following error on deep purple (where we have CUDA 10.2 installed):

docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.40/containers/8914501b9ed4ebd13c13dd2d79c053ae88a6abcf2f17975382d3ee720cae1fea/start: Internal Server Error ("OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.2, please update your driver to a newer version, or use an earlier cuda container\\\\n\\\"\"": unknown")

I won't proceed with publishing this on test until we address this.
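
For anyone reproducing this, here is a rough sketch of a pre-check that could catch the mismatch before the container is started. It is not part of langpacks or environment_validator.py. The function names are hypothetical, and it assumes that the nvidia/cuda image tag starts with the CUDA version the image requires, which is consistent with the `cuda>=11.2` condition in the error above. The host's supported CUDA version is passed in by hand (10.2 for deep purple, as noted above) rather than detected.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]  # (major, minor)


def cuda_version_from_image_tag(image: str) -> Optional[Version]:
    """Read (major, minor) from a tag such as
    nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04.

    Assumption: the tag begins with the CUDA version of the image.
    """
    match = re.search(r":(\d+)\.(\d+)(?:\.\d+)?(?:-|$)", image)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def required_cuda_from_error(message: str) -> Optional[Version]:
    """Pull the 'cuda>=X.Y' requirement out of an nvidia-container-cli
    'unsatisfied condition' error message, if one is present."""
    match = re.search(r"unsatisfied condition: cuda>=(\d+)\.(\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def host_can_run(image: str, host_cuda: Version) -> bool:
    """Return True if the host's supported CUDA version is at least the
    version implied by the image tag (hypothetical helper)."""
    required = cuda_version_from_image_tag(image)
    if required is None:
        raise ValueError(f"no CUDA version found in image tag: {image!r}")
    # Tuples compare element by element, so (10, 2) < (11, 2).
    return host_cuda >= required


if __name__ == "__main__":
    image = "nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04"
    print(host_can_run(image, (10, 2)))
```

With the base image from the command above and a host at CUDA 10.2, `host_can_run` returns False, which matches the failure reported by nvidia-container-cli.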

aslisabanci commented 3 years ago

With the updated CUDA drivers, we can now validate this package with:

./tools/environment_validator.py -b nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04 -g python3 -s python38 -d tensorflow-gpu-2.4 -t dependency -n tensorflow-gpu-2.4 --nvidia-support 1

Things to note above:

aslisabanci commented 3 years ago

Deleting the branch and closing the PR, as these changes were already merged into develop from another branch by Daniel.