microsoft / vscode-dev-containers

NOTE: Most of the contents of this repository have been migrated to the new devcontainers GitHub org (https://github.com/devcontainers). See https://github.com/devcontainers/template-starter and https://github.com/devcontainers/feature-starter for information on creating your own!
https://aka.ms/vscode-remote
MIT License

Specifying specific GPUs for the devcontainer. #1741

Open bjajoh opened 1 year ago

bjajoh commented 1 year ago

Hi,

I ran into an issue while trying to run two devcontainers on the same system (4 GPUs), allocating 2 GPUs to each.

This works and does exactly what I want:

    docker run --gpus '"device=2,3"' nvidia/cuda:9.0-base nvidia-smi

This also works as documented for the devcontainer:

"runArgs": [
    "--gpus",
    "device=2"
],

However, when trying to use two specific GPUs, it fails.

"runArgs": [
    "--gpus",
    "device=2,3"
],

I think this might be due to the way Docker expects the device string to be formatted when multiple GPUs are listed.

jkeech commented 1 year ago

@bjajoh I haven't tried this myself, but I wonder if adding some escaped quotes to your runArgs would work? If docker run is expecting quotes around the device=2,3 value, then you should be able to put those quotes in the runArgs array item. Something like the following might work:


"runArgs": [
    "--gpus",
    "'\"device=2,3\"'"
],

bjajoh commented 1 year ago

@jkeech Sadly, it does not work:

    invalid argument "'\"device=2,3\"'" for "--gpus" flag: parse error on line 1, column 2: bare " in non-quoted-field

Do you have any other idea how to solve this?

jkeech commented 1 year ago

There's probably some amount of nested escaping of the quotes in the devcontainer.json that needs to happen to end up with the docker process spawn args having the desired quotes. I would play around with different levels of quoting and quote escaping in the runArgs value.
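One way to reason about the quoting levels is to look at what the shell actually hands to docker in the working command. The sketch below is a simulation only and has not been run against Docker: `shlex` stands in for the shell, Python's `csv` module stands in for Docker's flag parser (the "bare \" in non-quoted-field" wording in the error above suggests CSV-style parsing, but that is an assumption), and `csv_fields` is a hypothetical helper name. It also assumes that runArgs items are passed to docker without a shell in between.

```python
import csv
import json
import shlex


def csv_fields(value: str) -> list[str]:
    """Parse one flag value as a single CSV record.

    Python's csv module is only a stand-in here; it is more lenient than
    whatever parser Docker actually uses.
    """
    return next(csv.reader([value]))


# 1. The working shell command: the shell strips the outer single quotes,
#    so docker receives the double quotes as part of the argument.
shell_argv = shlex.split(
    """docker run --gpus '"device=2,3"' nvidia/cuda:9.0-base nvidia-smi"""
)
gpus_value = shell_argv[3]
print(gpus_value)                # "device=2,3"

# 2. Read as CSV, the double-quoted form is one field; the bare form is two.
print(csv_fields(gpus_value))    # ['device=2,3']
print(csv_fields("device=2,3"))  # ['device=2', '3']

# 3. Assumption: runArgs items reach docker with no shell in between, so the
#    JSON string itself must decode to the double-quoted value.
config = json.loads(r'{"runArgs": ["--gpus", "\"device=2,3\""]}')
print(config["runArgs"][1] == gpus_value)  # True
```

If that reading holds, the single quotes belong to the shell and should not appear in devcontainer.json at all; the entry to try would be "\"device=2,3\"" (escaped double quotes only).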

ryxli commented 1 year ago

I'm also encountering this issue. An alternative might be to set the environment variable NVIDIA_VISIBLE_DEVICES instead of passing the devices through --gpus, although it would be nice to know how the quote escaping is supposed to work.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#gpu-enumeration

    "containerEnv": {
        "NVIDIA_VISIBLE_DEVICES": "0,1"
    },
    "runArgs": [
        "--runtime=nvidia"
    ]

bjajoh commented 1 year ago

@ryxli It actually works! Thank you so much!