Rewrite Dockerfile - Githubissues

gb-beng commented 2 months ago

Align with visxp worker
~Let torch handle cuda runtime library (versions)~ This was a misunderstanding, faster-whisper doesn't use torch.
Get poetry out of the deployed image

jblom commented 2 months ago

@gb-beng we'll definitely wait with this one, since @greenw0lf mentioned having trouble with other setups. When we've tackled the deployment in ocp and using Airflow we can have a go with this one

gb-beng commented 2 months ago

Ok, I did test it locally and everything worked smoothly; as for visxp worker.

jblom commented 2 months ago

@gb-beng ah good news. Now it's a matter of seeing how it runs in ocp (I assume you did not run it there)

gb-beng commented 1 month ago

Some learnings:

You can manage CUDA libraries via pip, which I implemented in https://github.com/beeldengeluid/whisper-asr-worker/pull/67/commits/9630d88521e9135fb10d25fdbef29c0ad60cf103. I can run this version on the GPU but you'll need to manually set LD_LIBRARY_PATH, per https://github.com/SYSTRAN/faster-whisper/blob/d57c5b40b06e59ec44240d93485a95799548af50/README.md?plain=1#L101 to get it to work.

Because of this I think the nvidia cuda base images are a more solid approach, but I still would like to take out poetry to align with our other containers.

Notes:

Running on a node without (configured) GPU

python
Python 3.10.14 (main, Jul 23 2024, 07:20:09) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from faster_whisper import WhisperModel
>>> whisper_model = WhisperModel("medium", device="cuda", compute_type="float16")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 145, in __init__
self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version

Note: @greenw0lf had this on Windows/WSL too

Running on a node with GPU still requires updating LD_LIBRARY_PATH, if not you'll see

python
Python 3.10.14 (main, Jul 23 2024, 07:20:09) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from faster_whisper import WhisperModel
>>> whisper_model = WhisperModel("medium", device="cuda", compute_type="float16")
>>> segments, info = whisper_model.transcribe("file.mp4", beam_size=5)
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Aborted (core dumped)

After setting LD_LIBRARY_PATH, we can use the pip-installed libraries

export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
python
Python 3.10.14 (main, Jul 23 2024, 07:20:09) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> whisper_model = WhisperModel("medium", device="cuda", compute_type="float16")
>>> segments, info = whisper_model.transcribe("file.mp4", beam_size=5)
>>> print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
Detected language 'en' with probability 0.972656
>>> segments = list(segments)
>>>

beeldengeluid / whisper-asr-worker

Rewrite Dockerfile #67