Closed mrocklin closed 2 years ago
Matthew Rocklin @.***> writes:
So I'm playing with RAPIDS 2021.06, which requires a fairly recent CUDA driver. I create a software environment as follows
```python
import coiled

# Create a software environment with GPU accelerated libraries
# and CUDA drivers installed
coiled.create_software_environment(
    name="rapids",
    container="gpuci/miniconda-cuda:11.2-runtime-ubuntu20.04",
    conda={
        "channels": [
            "rapidsai",
            "nvidia",
            "conda-forge",
            "defaults",
        ],
        "dependencies": [
            "rapids=21.06",
            "cudatoolkit=11.2",
            "cupy",
            "python=3.8",
        ],
    },
    pip=["afar"],
)
```
I'm finding that things work, but oddly...
I'm curious if there is maybe a driver mismatch. Do we have to match anything on the VM to the image?
I believe the VM needs the underlying CUDA drivers to match, yes. I seem to recall the base Ubuntu VMs we're using only have CUDA 10? @selshowk may know more.
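One way to check for the suspected mismatch is to compare the CUDA version the VM's driver supports with the CUDA runtime version inside the image. The sketch below is only an illustration, not something from Coiled: the helper names are hypothetical, and it assumes CUDA's usual integer version encoding (1000 * major + 10 * minor, so 11020 means 11.2).

```python
from typing import Tuple


def decode_cuda_version(version: int) -> Tuple[int, int]:
    """Decode CUDA's integer encoding (1000 * major + 10 * minor)."""
    return version // 1000, (version % 1000) // 10


def driver_supports_runtime(driver_version: int, runtime_version: int) -> bool:
    """Conservative check: the driver's CUDA version must be at least
    the runtime's. (Hypothetical helper, not part of any library.)"""
    return decode_cuda_version(driver_version) >= decode_cuda_version(runtime_version)


def describe(driver_version: int, runtime_version: int) -> str:
    """Summarize the two versions in a human-readable line."""
    d_major, d_minor = decode_cuda_version(driver_version)
    r_major, r_minor = decode_cuda_version(runtime_version)
    status = "ok" if driver_supports_runtime(driver_version, runtime_version) else "MISMATCH"
    return f"driver supports CUDA {d_major}.{d_minor}, runtime is CUDA {r_major}.{r_minor}: {status}"


# Example with the versions mentioned in this thread:
# a VM driver at CUDA 10.2 and an image built with cudatoolkit 11.2.
print(describe(10020, 11020))
```

On a GPU worker, the two integers could probably be read with `cupy.cuda.runtime.driverGetVersion()` and `cupy.cuda.runtime.runtimeGetVersion()`, for example through `client.run`. The comparison is deliberately strict; CUDA 11's minor-version compatibility may let some 11.x combinations work even when it reports a mismatch.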
If so, would it be easy to change this to CUDA 11? Would it be easy to specify dynamically?
Yes, right now we're using cudatoolkit=10.2 (I think!). Pretty sure we can switch it to 11 by installing different packages on the VM. Making it dynamic is trickier because we don't currently have a way to specify that in the API and because the CUDA version is baked into the AMIs we build. If we add some semantics in the API for multiple CUDA versions (e.g. tied to the gpu flag) then we could, in principle, build multiple AMIs to support different CUDA versions.
Short-term I would welcome VMs with version 11.
The GPU support I'm adding now will use an AMI with newer drivers and CUDA (maybe CUDA 11.7, which just came out; this should work for any code built against any 11.x).