OE4T / meta-tegra

BSP layer for NVIDIA Jetson platforms, based on L4T
MIT License

Update container support to support 'auto' mode #1541

Closed madisongh closed 4 months ago

madisongh commented 5 months ago
dchvs commented 5 months ago

@madisongh, I encountered this error while testing this feature:

nvidia-container-cli
nvidia-container-cli: Error: libnvidia-container.so.0: cannot open shared object file: No such file or directory

The error seems to come from the libnvidia-container.so.0 reference here: https://github.com/NVIDIA/libnvidia-container/blob/v1.15.0/src/cli/libnvc.c#L33

madisongh commented 5 months ago

Hmm, I didn't see that in my testing, but it sure looks like I should have. I'll go back and re-check.

madisongh commented 5 months ago

I've tested again, and I'm not seeing that error unless I manually change the config.toml file back to using mode = "legacy". With mode = "auto" (which is now the default with this patch series), I've successfully fired up the l4t-jetpack:r35.4.1 container from NGC on my Orin Nano using a build off master plus these patches.

madisongh commented 5 months ago

I've refreshed the PR to put back the libnvidia-container-jetson recipe, just in case there are folks that really need to use 'legacy' mode for running their containers. The config.toml file still defaults to mode = "auto", though, which avoids the use of the legacy-mode prestart hook (which will disappear completely in JetPack 6).

The libtirpc126 recipe is still removed, as that workaround is no longer needed.

Just be warned that if you do need 'legacy' mode, you must be using systemd earlier than v252, or you must add systemd.unified_cgroup_hierarchy=off to your kernel command line, so that the cgroups support is compatible with the older v1 ABI expected by the legacy-mode library.
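A rough way to check that cgroups precondition before trying 'legacy' mode is sketched below in Python. It rests on two assumptions: a unified (v2) hierarchy exposes a cgroup.controllers file at the cgroup root, and systemd treats off, 0, false and similar values as disabling the unified hierarchy. Both function names are hypothetical.

```python
from pathlib import Path

# Values assumed to read as "false" for a systemd boolean option.
FALSY = {"off", "0", "false", "f", "no", "n"}


def cmdline_disables_unified_cgroups(cmdline: str) -> bool:
    """True if the kernel command line sets systemd.unified_cgroup_hierarchy to a falsy value."""
    for arg in cmdline.split():
        key, _, value = arg.partition("=")
        if key == "systemd.unified_cgroup_hierarchy":
            return value.lower() in FALSY
    return False


def host_uses_cgroup_v2(cgroup_root: Path = Path("/sys/fs/cgroup")) -> bool:
    """True if the running system mounted the unified (v2) cgroup hierarchy."""
    # Assumption: cgroup.controllers only exists at the root of a v2 hierarchy.
    return (cgroup_root / "cgroup.controllers").is_file()
```

On a booted target, cmdline_disables_unified_cgroups(Path("/proc/cmdline").read_text()) tells you whether the kernel argument is present, and host_uses_cgroup_v2() tells you what the system actually ended up with. If the latter returns True, 'legacy' mode is unlikely to work.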

dchvs commented 5 months ago

I can confirm that it's working in both auto and legacy modes.

One more thing to consider in "legacy" mode: if you run the container with --privileged=true and/or mount /dev as a volume with -v /dev/:/dev/, you need to edit /etc/nvidia-container-runtime/host-files-for-container.d/l4t.csv and remove the dev, /dev/... mappings. This adjustment isn't needed in auto mode.
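As an illustration of that l4t.csv change, here is a small Python sketch that drops the dev entries from the file's text. It assumes each line has the form "type, path" with the type in the first comma-separated field, as in the dev, /dev/... entries mentioned above. The function name drop_dev_entries is hypothetical, and the sketch returns the filtered text rather than rewriting the file in place.

```python
def drop_dev_entries(csv_text: str) -> str:
    """Return csv_text without the lines whose first field is `dev`."""
    kept = []
    for line in csv_text.splitlines():
        # Assumption: the entry type is the first comma-separated field.
        kind = line.split(",", 1)[0].strip()
        if kind != "dev":
            kept.append(line)
    # Keep a trailing newline when anything is left.
    return "\n".join(kept) + ("\n" if kept else "")
```

Keep a backup of the original l4t.csv before writing the filtered text back, since the dev mappings are still needed for non-privileged 'legacy' containers that don't mount /dev.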