GPUs are quite different from other devices: they often involve proprietary drivers, and their integration into the container ecosystem is still somewhat bespoke. I have successfully run a TensorFlow application that uses CUDA and cuDNN inside a Docker container on an AWS virtual machine with an NVIDIA GPU attached, and it was quite confusing.
However, NVIDIA provides some automation for this: https://devblogs.nvidia.com/gpu-containers-runtime/
It appears they created a hook into the runc system to provide a level of indirection when running containers that require access to GPUs. I have not yet tried this, but I think it adds a useful amount of real-world complexity for stress-testing such a use case on the Matrix system.

This is low priority for now and should go in the backlog.
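For reference, a minimal sketch of how that indirection is wired up, based on my reading of NVIDIA's post (the runtime name and path follow their documentation; I have not verified this setup myself): the NVIDIA runtime is registered as an alternative OCI runtime in Docker's `/etc/docker/daemon.json`, and it wraps runc with a prestart hook that injects the GPU devices and driver libraries into the container.

```json
{
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

A container would then be launched with something like `docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi` (or, on newer Docker versions, `docker run --gpus all`), which is exactly the kind of nonstandard runtime path that would make this a good stress test.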