Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator support including CUDA GPU, ROCm GPU, TPU, IPU and other NPUs.
Main idea
Some accelerator drivers (e.g. ROCm) require the Linux user to belong to specific groups. To support such cases, we could extend the current accelerator plugin architecture so that each plugin can expose an extra set of GIDs that the container user (the work user) will join.
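As a rough illustration of the plugin-side extension, the sketch below shows one way plugins might report extra GIDs and how the agent could merge them. The names `AcceleratorPlugin`, `get_extra_gids`, and `collect_extra_gids` are hypothetical and are not part of the existing plugin API; the GID values are placeholders.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class AcceleratorPlugin(ABC):
    """Hypothetical stand-in for an accelerator plugin interface."""

    @abstractmethod
    def get_extra_gids(self) -> List[int]:
        """Return the GIDs the container user must join for this accelerator."""


class GroupRequiringPlugin(AcceleratorPlugin):
    """Example plugin whose driver needs group membership (as ROCm does)."""

    def __init__(self, gids: Iterable[int]) -> None:
        self._gids = list(gids)

    def get_extra_gids(self) -> List[int]:
        return list(self._gids)


class NoExtraGroupsPlugin(AcceleratorPlugin):
    """Example plugin that needs no additional groups."""

    def get_extra_gids(self) -> List[int]:
        return []


def collect_extra_gids(plugins: Iterable[AcceleratorPlugin]) -> List[int]:
    """Merge the GIDs reported by all plugins into a sorted, duplicate-free list."""
    merged = set()
    for plugin in plugins:
        for gid in plugin.get_extra_gids():
            if gid < 0:
                raise ValueError(f"invalid GID: {gid}")
            merged.add(gid)
    return sorted(merged)


# Placeholder GIDs; real values depend on the host's group database.
plugins = [
    GroupRequiringPlugin([44, 109]),
    NoExtraGroupsPlugin(),
    GroupRequiringPlugin([109, 200]),
]
extra_gids = collect_extra_gids(plugins)
```

Deduplicating on the agent side keeps each plugin simple: a plugin only declares what its own driver needs, and overlapping requirements from several plugins collapse into a single list.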
Alternative ideas
No response
Anything else?
This issue can serve as a keystone for #2592, since implementing this feature will provide a way to make the container user join extra groups beyond the default GID and 44 (shadow).
We could add a new environment variable containing the list of additional GIDs and have entrypoint.sh pass it to su-exec, in the same way LOCAL_USER_ID and LOCAL_GROUP_ID are handled today.
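A minimal sketch of how such a variable could be encoded and decoded, assuming a comma-separated format. The variable name `LOCAL_EXTRA_GIDS` and both helper functions are hypothetical; how the parsed list is actually handed to su-exec is left open here.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical variable name, chosen to mirror LOCAL_USER_ID / LOCAL_GROUP_ID.
EXTRA_GIDS_ENV = "LOCAL_EXTRA_GIDS"


def format_extra_gids(gids: Iterable[int]) -> str:
    """Encode GIDs as a comma-separated string for the environment variable."""
    return ",".join(str(gid) for gid in gids)


def parse_extra_gids(environ: Mapping[str, str] = os.environ) -> List[int]:
    """Decode the variable into a list of GIDs, preserving order.

    Empty items and duplicates are skipped; a non-numeric or negative
    item raises ValueError.
    """
    raw = environ.get(EXTRA_GIDS_ENV, "")
    gids: List[int] = []
    for token in raw.split(","):
        token = token.strip()
        if not token:
            continue
        gid = int(token)  # raises ValueError for non-numeric input
        if gid < 0:
            raise ValueError(f"invalid GID: {gid}")
        if gid not in gids:
            gids.append(gid)
    return gids


# Placeholder GIDs for illustration only.
encoded = format_extra_gids([44, 109])
decoded = parse_extra_gids({EXTRA_GIDS_ENV: encoded})
```

An unset or empty variable yields an empty list, so containers that need no extra groups would behave as they do now.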