Closed · Skier23 closed this issue 3 months ago
I get the same error if I use the exact code and config from here: https://developer.nvidia.com/blog/accelerating-inference-with-triton-inference-server-and-dali/
Hi @Skier23,
Can you tell us how you run the Triton server? Which OS and GPU do you use? It seems that NVML, which is part of the CUDA driver, cannot be accessed.
Sure, this is the Docker command I use to launch the Triton server. I have been using this command previously and running benchmarks on the Triton server model, and it's definitely using the GPU/CUDA:
docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.03-py3 tritonserver --model-repository=/models
GPU: 3090. OS: Launched the Docker container from Windows, but that shouldn't matter since it's a Linux Docker container.
Hi @Skier23,
In this case, the problem is with NVML support on the Windows platform. To run DALI inside WSL, please also pass DALI_DISABLE_NVML=1 to the Docker environment to make sure DALI doesn't try to use it (add -e DALI_DISABLE_NVML=1 to the docker invocation command).
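For reference, applying that workaround to the docker command quoted earlier in this thread would look roughly like this (image tag and paths are copied from that command; adjust them for your own setup):

```shell
# Same Triton launch command as above, with NVML disabled for DALI.
# DALI_DISABLE_NVML=1 is needed when the host is Windows/WSL, where
# NVML is not fully supported.
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -e DALI_DISABLE_NVML=1 \
  -v ${PWD}/model_repository:/models \
  nvcr.io/nvidia/tritonserver:24.03-py3 \
  tritonserver --model-repository=/models
```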
Why would it be related to Windows when it's running a Linux Docker container? The host platform shouldn't matter, since that's the whole point of a Docker container, right?
NVML is part of the driver, which is exposed from the host by the Docker runtime. NVML is just not fully supported inside WSL.
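As a quick way to see what "NVML cannot be accessed" means in practice, here is a small diagnostic sketch (my own, not part of DALI) that checks whether the NVML shared library exposed into the container by the driver can actually be loaded and initialized. The library name and the nvmlInit_v2/nvmlShutdown symbols are the standard NVML ones:

```python
import ctypes


def nvml_available() -> bool:
    """Try to load and initialize NVML the way GPU tooling does.

    Inside WSL (and hence Docker Desktop on Windows) this typically
    fails, which is why DALI needs DALI_DISABLE_NVML=1 there.
    """
    try:
        # The driver exposes NVML into Linux containers as libnvidia-ml.so.1.
        lib = ctypes.CDLL("libnvidia-ml.so.1")
    except OSError:
        return False
    # nvmlInit_v2 returns 0 (NVML_SUCCESS) when NVML is usable.
    ok = lib.nvmlInit_v2() == 0
    if ok:
        lib.nvmlShutdown()
    return ok


print("NVML available:", nvml_available())
```

On a host where NVML is unavailable (e.g. under WSL), this prints False while CUDA itself may still work fine, which matches the behavior discussed in this thread.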
It's not supported on Windows either, then? This documentation suggests it can at least be installed on Windows and is used by nvidia-smi: https://docs.nvidia.com/deploy/nvml-api/nvml-api-reference.html
It does look like using that environment variable to disable it allows it to load the model on Windows, though.
WSL is a special case; please check this part of the release notes.
Right, but I was just trying to run it natively on Windows (via the Linux Docker container). Is that case also not supported?
To the best of my understanding, Docker on Windows uses WSL under the hood (see https://docs.docker.com/desktop/install/windows-install/: "Turn on the WSL 2 feature on Windows").
Gotcha, that makes more sense then. Thanks!
Can confirm the same problem on my own system using Docker. Unfortunate.
@jet082 ,
Do any of the above suggestions by @JanuszL help in your case? If not, please provide some details about your system configuration so we can help you.
I added DALI_DISABLE_NVML=1 and it works, but I think that means that this step isn't using the GPU? Am I incorrect about that? Still learning this stuff.
Hi @jet082,
but I think that means that this step isn't using the GPU?
This extra variable disables certain functionality, which may lead to a small drop in performance. In most cases it should be negligible; the GPU is still available and should be used.
I'm trying to follow along with the simple tutorials to get a DALI preprocessing pipeline in place with Triton, but I'm getting some errors when trying to do so:
dali.py:
I get the same/similar error when using a serialized DALI model.