Open DeoLeung opened 6 months ago
+1 for this. Why does TEI require CUDA 12.2 while torch is still on CUDA 12.1? There is no NVIDIA driver compatible with both 12.1 and 12.2. https://docs.nvidia.com/deploy/cuda-compatibility/index.html
and there is no nv-driver compatible for both 12.1/12.2
If you are upgrading the driver to 525.60.13 which is the minimum required driver version for the 12.x toolkits, then 11.x and 12.x applications will be supported due to backward compatibility and future 12.x applications will be supported due to minor-version compatibility.
You can run both torch and TEI with >=525.60.13.
@DeoLeung do you have issues running TEI on a server with >=525.60.13?
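For anyone who wants to check a host against that minimum, here is a minimal Python sketch. It assumes the driver version string has already been read (for example from the `nvidia-smi` header), and the helper names are made up for illustration. It only checks the 525.60.13 floor quoted above; the container runtime may still enforce its own requirement separately.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '525.60.13' into (525, 60, 13)."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum driver for the CUDA 12.x toolkits, as quoted from the NVIDIA docs above.
MIN_DRIVER_CUDA_12 = parse_version("525.60.13")


def driver_meets_minimum(driver_version: str) -> bool:
    """Return True if the driver is at least the CUDA 12.x minimum."""
    return parse_version(driver_version) >= MIN_DRIVER_CUDA_12


if __name__ == "__main__":
    # Example driver versions; the first two appear in this thread.
    for version in ("530.30.02", "550.54.14", "470.57.02"):
        print(version, driver_meets_minimum(version))
```

Comparing integer tuples avoids the pitfalls of comparing version strings lexicographically.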
We managed to upgrade the server, and it's now running fine. nvidia-smi reports: NVIDIA-SMI 550.54.14, Driver Version: 550.54.14, CUDA Version: 12.4
@OlivierDehaene thanks. Maybe I should try downgrading to 525.60.13. I'm on 530.30.02 now, which does not seem to be compatible according to the table in the link. Starting the TEI Docker container reports an error:
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.2, please update your driver to a newer version, or use an earlier cuda container: unknown.
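The message says the container declares a `cuda>=12.2` requirement that the host driver does not satisfy. As a rough illustration of that kind of comparison, here is a Python sketch. It is not the actual nvidia-container-cli logic, the function names are hypothetical, and it only handles the `name>=version` form seen in the error. The 12.4 value comes from the nvidia-smi output in this thread; 12.1 as the CUDA version reported by driver 530.30.02 is an assumption.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '12.2' into (12, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def requirement_satisfied(requirement: str, available: dict[str, str]) -> bool:
    """Check a 'name>=version' requirement against the versions the host reports."""
    match = re.fullmatch(r"\s*(\w+)\s*>=\s*(\d+(?:\.\d+)*)\s*", requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    name, minimum = match.groups()
    if name not in available:
        return False
    return parse_version(available[name]) >= parse_version(minimum)


if __name__ == "__main__":
    # Assumed: driver 530.30.02 reports CUDA 12.1; 550.54.14 reports 12.4 (per this thread).
    print(requirement_satisfied("cuda>=12.2", {"cuda": "12.1"}))
    print(requirement_satisfied("cuda>=12.2", {"cuda": "12.4"}))
```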
Feature request
I'm not sure whether the current CUDA 12.2 requirement is a hard limit or could be optional.
If it is optional, the Docker base image could be parametrized so that users can rebuild it easily.
Motivation
On some servers it is just hard to upgrade the CUDA version.
Your contribution
Parametrize the Dockerfile so it can be rebuilt on the user's end if necessary.
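To make the proposal concrete, here is a small Python sketch of what a rebuild could look like from the user's side. It assumes the Dockerfile declares a build argument for the base image version; the name `CUDA_VERSION`, the version value, and the image tag are all hypothetical and do not come from the TEI repository. Whether TEI actually builds against an older CUDA base image is a separate question this sketch does not answer.

```python
import shlex


def docker_build_command(cuda_version: str, tag: str, context: str = ".") -> list[str]:
    """Assemble a `docker build` invocation that overrides the CUDA base version.

    Assumes the Dockerfile contains `ARG CUDA_VERSION` and uses it in its
    `FROM` line (hypothetical; not confirmed for the TEI Dockerfile).
    """
    return [
        "docker", "build",
        "--build-arg", f"CUDA_VERSION={cuda_version}",
        "-t", tag,
        context,
    ]


if __name__ == "__main__":
    # Example values only; pick a base version that matches the host driver.
    command = docker_build_command("12.1.0", "tei:cuda-12.1")
    print(shlex.join(command))
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues.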