Open josegonzalez opened 4 years ago
Hey @josegonzalez, we document the driver version on this page
https://docs.aws.amazon.com/eks/latest/userguide/eks-linux-ami-versions.html
Is this what you are looking for?
Ah, that's great! Would it make sense to update the docs for the "Example GPU manifest" to reference the supported CUDA version? I believe that would be 10.1 based on the NVIDIA driver version, but the docs currently show 9.2 usage.
CUDA compatibility is documented here: https://docs.nvidia.com/deploy/cuda-compatibility/. Using the NVIDIA data center driver version that we supply in the EKS AMI release notes, you can cross-reference to find the compatible CUDA versions (supplied in your container).
libcuda.so (see figure 1 from the link above) is installed on the EKS optimized GPU AMI as part of the NVIDIA driver installation. The version of CUDA that users are typically interested in is the one within their container image that is used by their application. Some application frameworks like PyTorch will provide the CUDA libraries they require either when installing via pip or when using the PyTorch supplied container images (ref 1, ref 2).
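As a rough illustration of the cross-referencing described above, here is a minimal Python sketch. The minimum-driver values are copied by hand from NVIDIA's Linux compatibility table and should be checked against the linked page. The helper names and the driver version in the usage note are hypothetical and are not taken from any EKS AMI release.

```python
# Minimum Linux driver version required by each CUDA toolkit release.
# ASSUMPTION: values transcribed by hand from NVIDIA's compatibility table;
# verify them against the linked page before relying on them.
MIN_DRIVER_FOR_CUDA = {
    "9.2": (396, 26),
    "10.0": (410, 48),
    "10.1": (418, 39),
    "10.2": (440, 33),
}


def parse_version(text):
    """Turn a dotted version string such as '418.87.00' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def compatible_cuda_versions(driver_version):
    """List the CUDA versions in the table whose minimum driver is satisfied."""
    driver = parse_version(driver_version)
    matches = [
        cuda
        for cuda, minimum in MIN_DRIVER_FOR_CUDA.items()
        if driver >= minimum
    ]
    # Sort numerically so "10.0" comes after "9.2".
    return sorted(matches, key=parse_version)
```

For a hypothetical driver version of 418.87.00, `compatible_cuda_versions("418.87.00")` returns `["9.2", "10.0", "10.1"]`, which matches the expectation that containers built against CUDA 10.1 or older would run on that driver.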
Tell us about your request
It would be great if you could document the version of the Nvidia Drivers supported by the GPU-optimized images. Browsing here gives me no real clue as to what they might be, which makes it more difficult to support folks writing cuda apps.
For those who aren't aware, the CUDA version is tied to the NVIDIA driver version.
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
To figure this out now, I need to provision a given image to EC2 and run nvidia-smi. This is automatable, but annoying and expensive. Additionally, the underlying image AWS ships can change over time, meaning this must be done on a regular basis.
Are you currently working around this issue?
Currently I'm going to eat some ice cream - drumsticks! - but will likely do my suggestion above to suss out what we can/cannot support out of the box (and then work backwards to get the version that supports 10.1).
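The nvidia-smi check described above could be automated roughly as follows. This is a sketch only: it assumes it runs on an instance already launched from the GPU AMI with nvidia-smi on the PATH, and the function names are made up for illustration.

```python
import subprocess


def parse_driver_versions(output):
    """Return one driver version string per GPU line, skipping blank lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def query_driver_versions():
    """Ask nvidia-smi for the driver version reported by each GPU.

    ASSUMPTION: nvidia-smi is installed and on the PATH of the instance.
    Raises subprocess.CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_driver_versions(result.stdout)
```

Calling `query_driver_versions()` on the instance returns a list with one entry per GPU. Because the AMI can change over time, the call would need to be repeated against each new image.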
Additional context
Y'all are pretty great!
Attachments
Not related to this issue, but in case you needed something to brighten your day, here is a pic of my cat sunbathing.