Closed: amadeuszsz closed this 1 year ago
Can you see if #212 resolves your issue? Probably separate from the GPU option? It seems to provide a superset of what you need.
Noted same issue with Jetson Orin AGX but this PR fixes the problem.
Great, I'll take that one as it's more generic. Could you then rebase this and turn it into a PR to specifically address the ability to force the version of the nvidia flags?
Done - removed the old unnecessary commit to keep the history clean and rebased. Is that what you meant, or did you want two separate PRs for the group-add functionality and forcing the nvidia flag?
I wanted to replace the --group-add functionality with --user-preserve-groups from #212 and then this would just be forcing the nvidia flag.
Sorry for the misunderstanding, but by "Noted same issue with Jetson Orin AGX but this PR fixes the problem." I meant the changes in #211. --user-preserve-groups doesn't fix my problem: it adds a few groups from the host, but not the desired one (video). Another example: a connected IMU with vendor udev rules required the dialout group. --group-add dialout works perfectly, while --user-preserve-groups didn't pass dialout, so I had to add the group inside the container by hand, then enter the container in a new terminal window (otherwise the groups won't refresh), and only then run the sensor nodes. Would you approve of these 2 PRs?
Oh, sorry, I thought you'd said that #212 fixed your issue too. So in addition to preserving the groups: if you mount a device into the container, it may get mounted with certain access that's required, or provide permissions that you don't have on the host (a la sudo).
We should figure out how to make sure that this and #212 can work together, so they don't conflict depending on ordering. And secondly, there's actually already a force of sudo in the users dir, which should be evaluated for consistency.
--user-preserve-groups only adds dynamically allocated groups (GID > 99), so the documentation doesn't seem accurate. I've merged #211 and #212 locally and exercised them in many possible ways - different ordering, overriding groups from both solutions - and I didn't see any conflict.
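For illustration, a minimal Python sketch of the GID-threshold behaviour described above might look like the following. This is not rocker's actual code, just an assumed way to enumerate "dynamically allocated" groups of the invoking user:

```python
import grp
import os

# Illustrative only: list the current user's groups whose GID is above 99,
# mirroring the "dynamically allocated groups" behaviour described above.
dynamic_groups = [
    grp.getgrgid(gid).gr_name
    for gid in os.getgroups()
    if gid > 99
]
print(dynamic_groups)  # e.g. ['video', 'dialout', ...] depending on the host
```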
I've gone ahead and merged the group elements in #222 could you rebase and validate the nvidia changes here?
Changed and tested, everything works.
@tfoote , FYI idk why, but the current --user-preserve-groups doesn't work as expected. I took the main branch without any changes and there is a bug:
- --user-preserve-groups without --user -> OK.
- --user-preserve-groups with --user -> ERROR:
Traceback (most recent call last):
File "/home/amadeusz/.local/bin/rocker", line 8, in <module>
sys.exit(main())
File "/home/amadeusz/.local/lib/python3.8/site-packages/rocker/cli.py", line 64, in main
dig = DockerImageGenerator(active_extensions, args_dict, base_image)
File "/home/amadeusz/.local/lib/python3.8/site-packages/rocker/core.py", line 209, in __init__
self.dockerfile = generate_dockerfile(active_extensions, self.cliargs, base_image)
File "/home/amadeusz/.local/lib/python3.8/site-packages/rocker/core.py", line 348, in generate_dockerfile
dockerfile_str += el.get_snippet(args_dict) + '\n'
File "/home/amadeusz/.local/lib/python3.8/site-packages/rocker/extensions.py", line 279, in get_snippet
matched_groups = [g for g in all_groups if g.gr_name in cliargs['user_preserve_groups']]
File "/home/amadeusz/.local/lib/python3.8/site-packages/rocker/extensions.py", line 279, in <listcomp>
matched_groups = [g for g in all_groups if g.gr_name in cliargs['user_preserve_groups']]
TypeError: argument of type 'bool' is not iterable
- --user-preserve-groups without --user -> OK, but groups are not shared; is that expected?
- --user-preserve-groups with --user -> OK.
It is not connected with my changes, just saying.
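For reference, the traceback above points at the membership test in get_snippet. A purely illustrative guard (an assumption, not necessarily the actual fix merged later) would tolerate the flag arriving as a bare boolean instead of a list of group names:

```python
# Illustrative sketch only, not rocker's actual fix: tolerate
# cliargs['user_preserve_groups'] being a bare boolean (flag given
# without group names) instead of assuming it is iterable.
def match_preserved_groups(all_groups, cliargs):
    preserve = cliargs.get('user_preserve_groups')
    if isinstance(preserve, bool) or preserve is None:
        # Hypothetical behaviour: a bare flag preserves every host group,
        # an absent flag preserves none.
        return list(all_groups) if preserve else []
    return [g for g in all_groups if g.gr_name in preserve]
```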
Thanks for flagging that issue with the merged one. This looks good to go on top, but it got caught with the dependency issues in #225. Could you rebase once more and I think we should be good to merge?
Done.
It should be pretty straightforward to add a few more cases with extra mocked cliargs like here.
Done.
It is not possible to run a container on the Nvidia Jetson Xavier NX (JetPack 5.0.2, Docker 20.10.21):
Executing command:
docker run --rm -it --gpus all -v /home/apm/autoware:/home/apm/autoware -e DISPLAY -e TERM -e QT_X11_NO_MITSHM=1 -e XAUTHORITY=/tmp/.dockerzb1vil5u.xauth -v /tmp/.dockerzb1vil5u.xauth:/tmp/.dockerzb1vil5u.xauth -v /tmp/.X11-unix:/tmp/.X11-unix -v /etc/localtime:/etc/localtime:ro 7812f082195f
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv' invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime instead.: unknown.
A tiny change in nvidia_extension.py fixes the problem (see the sketch after the list below). However, executing any windowed app (e.g. rviz2) causes a Segmentation fault error; adding the user to the video group fixes that. With these changes I ran a CUDA test inside the docker container with different rocker settings:
- --group-add video only
- --nvidia only
- --nvidia and --group-add video
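As a hedged illustration of the kind of change meant above (function name, option values, and structure are assumptions, not the actual nvidia_extension.py code), the extension could let the user force --runtime nvidia instead of --gpus all, which is what Jetson's csv-mode hook requires:

```python
# Illustrative sketch only (not the actual nvidia_extension.py code):
# allow the user to force which nvidia flag the generated docker run uses.
def get_nvidia_docker_args(cliargs):
    # Hypothetical option values: 'auto', 'runtime', or 'gpus'.
    mode = cliargs.get('nvidia', 'auto')
    if mode == 'runtime':
        # Needed on Jetson boards, where the csv-mode hook rejects --gpus.
        return ' --runtime nvidia'
    if mode == 'gpus':
        return ' --gpus all'
    # 'auto': fall back to the previous default (--gpus all in this sketch).
    return ' --gpus all'
```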
I'm not sure if the GPU fix is the best idea, but what about a group-add plugin in general? Is there any other way to achieve that functionality? As I remember, several times a sensor of mine needed a group permission (e.g. dialout), so I had to add it manually in a terminal and attach to the container from a new bash terminal again to refresh the groups.