twagner9 closed 10 months ago
Hi @twagner9
Just out of curiosity, did you try with the --gpus flag specified? As I understand it, "--gpus all" instructs Docker to make all GPUs that are visible to the host also visible to the container. But if you don't have any visible GPUs, wouldn't it be no harm, no foul?
Anyway, the "all" is the argument taken by the "--gpus" flag. It is admittedly a little goofy the way that the .json is parsed. So if you were running this on the command line, it would look like:
cmd --rm --gpus all -it --net host -e DISPLAY=${env:DISPLAY} ....
So if you remove only "all", you are giving docker the --gpus arg without any value (though I'm surprised it wouldn't default to "all").
Alternatively, if you remove just the "--gpus", then it looks like docker interprets "all" as a specific container to remove, rather than simply removing the container that is being created at runtime with this given cmd.
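To make the flattening concrete, here is a minimal Python sketch of how the runArgs array turns into a flat argument string, and what is left when one or both entries are removed. The argument list is abbreviated and hypothetical (it is not the full cisTEM devcontainer.json), and the helper names are made up for illustration. It only shows the resulting strings; it does not model how docker then parses them.

```python
import shlex

# Abbreviated, hypothetical runArgs; the real devcontainer.json has more entries.
run_args = ["--rm", "--gpus", "all", "-it", "--net", "host"]


def to_command_line(args):
    """Join array entries into the flat string they form on the command line."""
    return shlex.join(args)


def without(args, *dropped):
    """Return a copy of args with the given entries removed."""
    return [a for a in args if a not in dropped]


print(to_command_line(run_args))                            # as shipped
print(to_command_line(without(run_args, "all")))            # --gpus left without its value
print(to_command_line(without(run_args, "--gpus")))         # "all" left as a stray argument
print(to_command_line(without(run_args, "--gpus", "all")))  # both removed
```

Only the last variant removes the flag together with its value, which matches the case reported to open successfully.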
Hi @bHimes
Anyway, the "all" is the argument taken by the "--gpus" flag.
This I did not realize; it makes sense why only removing the flag would cause the initially mentioned issue.
Just out of curiosity, did you try with the --gpus flag specified
Yes, I initially ran it as-is, and that gives the following error:
[2023-12-19T20:32:29.719Z] docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
On my desktop, which has a dedicated GPU, this was resolved by installing the required nvidia drivers, if I recall correctly. But will this simply never work with integrated graphics, or should I see if I need drivers specific to my device so this flag can find the appropriate value?
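One possible workaround, sketched here in Python under loud assumptions, is to add "--gpus all" only when the host appears to have NVIDIA tooling. The check for `nvidia-smi` on PATH is just a rough proxy: it says nothing about whether the extra docker toolkit is installed. The function names are hypothetical, and since devcontainer.json is static, logic like this would have to live in a wrapper script rather than in the config itself.

```python
import shutil


def host_has_nvidia_tooling():
    """Assumption: nvidia-smi on PATH is used as a rough proxy for a usable NVIDIA setup."""
    return shutil.which("nvidia-smi") is not None


def gpu_run_args(has_nvidia_tooling):
    """Return the GPU flag and its value together, or nothing at all."""
    return ["--gpus", "all"] if has_nvidia_tooling else []


# Abbreviated, hypothetical base arguments.
base_args = ["--rm", "-it", "--net", "host"]
print(base_args + gpu_run_args(host_has_nvidia_tooling()))
```

Keeping the flag and its value in one helper avoids the half-removed states where only "--gpus" or only "all" remains.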
Thanks for the details @twagner9
My follow-up is below, with a test for you to run at the bottom (I don't have any machines without dedicated graphics).
Additional background:
So, design-wise, I think the container should be able to build either the CPU-only code or the CPU/GPU code whether or not the host machine has a GPU or NVIDIA driver (the CUDA toolkit is required, though).
The extra toolkit for docker is really only needed if we want to test and run code with gpu enabled inside the container, which I do.
Tested change
1) Remove "--gpus all" from runArgs in devcontainer.json
2) Add a different specification to devcontainer.json
"hostRequirements": {
"gpu": true
},
"runArgs": [
"--rm",
"-it",
...
3) Change the bool to a string "optional"
"hostRequirements": {
"gpu": "optional"
},
"runArgs": [
"--rm",
"-it",
...
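The three steps above can be sketched as a small Python transformation on the parsed config. This is an illustration only: it assumes the file has already been loaded as plain JSON (a real devcontainer.json may contain comments that `json` cannot parse), the example config is abbreviated and hypothetical, and the helper names are made up.

```python
import json


def strip_gpus_flag(args):
    """Drop "--gpus" together with the value that follows it."""
    kept = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False  # this entry is the value of --gpus
            continue
        if arg == "--gpus":
            skip_next = True
            continue
        kept.append(arg)
    return kept


def make_gpu_optional(config):
    """Return a copy of config with --gpus removed and gpu set to "optional"."""
    updated = dict(config)
    updated["runArgs"] = strip_gpus_flag(config.get("runArgs", []))
    host_requirements = dict(config.get("hostRequirements", {}))
    host_requirements["gpu"] = "optional"
    updated["hostRequirements"] = host_requirements
    return updated


# Abbreviated, hypothetical config for illustration.
example = {"runArgs": ["--rm", "--gpus", "all", "-it"]}
print(json.dumps(make_gpu_optional(example), indent=2))
```

Removing the flag and its value as a pair, rather than filtering on the string "all", avoids touching an unrelated argument that happens to have the same text.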
TODO
Test the "optional" approach on your laptop please.
Test Result
@bHimes I have tested both variations of "hostRequirements" out of curiosity, and interestingly, both allow me to open the container with no apparent issues.
Okay, I'll set it to the optional variant, and note that something unexpected may be going on under the hood. It would have been nicer if gpu: true crashed on your machine.
I'll set up a new container with the additional test data and this modification that we can test "clean."
ty
When attempting to open the cisTEM code in the dev container, it will not successfully open without removal of "all" from runArgs in devcontainer.json, giving the following errors in the output:
Just as a mention, I also have to remove "--gpus" from runArgs on my device that does not have a dedicated graphics card, though this is expected.
Removal of "all" (and "--gpus" in my case) does allow successful opening of the project in the container.