OCR-D / ocrd_all

Master repository which includes most other OCR-D repositories as submodules
MIT License

Docker: multi-version CUDA #270

Closed by bertsky 2 years ago

bertsky commented 3 years ago

Implements #263

bertsky commented 3 years ago

Too bad: this currently yields CUDA_ERROR_SYSTEM_DRIVER_MISMATCH in TensorFlow. I should have checked earlier (in core-cuda)...

bertsky commented 3 years ago

> Too bad: this currently yields CUDA_ERROR_SYSTEM_DRIVER_MISMATCH in TensorFlow. I should have checked earlier (in core-cuda)...

It seems that choosing nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu18.04 as the base image now requires at least nvidia-driver-470 on the host system. The systems available to me have 440 and 465, and neither of them can run the image. That means we are making a sacrifice here: to be able to support the newest TensorFlow/CUDA as well, we are forcing all host systems to get a newer driver. (Upgrading the driver may well be easier than upgrading CUDA, but it's still quite inconvenient.)
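For anyone wanting to check their host before pulling the image, here is a minimal sketch of the driver requirement described above. The helper name driver_at_least is hypothetical, and the threshold of 470 is only what I observed with this base image, not an official figure. It assumes nvidia-smi is on the PATH and skips the query otherwise.

```shell
#!/usr/bin/env bash
# Sketch: compare the host's Nvidia driver major version against a minimum.
# driver_at_least is a hypothetical helper; 470 is the threshold observed in this thread.

driver_at_least() {
    local version="$1" minimum="$2"
    local major="${version%%.*}"            # "470.57.02" -> "470"
    [[ "$major" =~ ^[0-9]+$ ]] || return 2  # not a version number (e.g. empty output)
    [ "$major" -ge "$minimum" ]
}

# Only query the driver when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    host_version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if driver_at_least "$host_version" 470; then
        echo "driver ${host_version}: should be new enough for the CUDA 11.3 base image"
    else
        echo "driver ${host_version:-unknown}: probably too old, TensorFlow may report CUDA_ERROR_SYSTEM_DRIVER_MISMATCH"
    fi
fi
```

Comparing only the major version keeps the check simple; it would need refining if a minimum minor version ever turned out to matter.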

bertsky commented 3 years ago

If you have the Nvidia repo as a package source, you can just update to cuda-drivers-470, which will take care of all dependencies. (But a fresh installation might work, too.)

Anyway, this does work (based on a locally built ocrd/core-cuda from https://github.com/OCR-D/core/pull/704).

for venv in /usr/local/sub-venv/headless-tf*; do . $venv/bin/activate && python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"; done

This yields True 3x, once per venv.

bertsky commented 2 years ago

Conflicting files

core

How are you supposed to keep PRs that involve subrepos alive, then? I guess I'll have to update https://github.com/OCR-D/core/pull/704 each time core master changes, and then in turn update here.

bertsky commented 2 years ago

So to sum up, we have two drawbacks here:

But

I'd say let's merge!