Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator support including CUDA GPU, ROCm GPU, TPU, IPU and other NPUs.
There are several customer requests for supporting "freezing" their current compute sessions.
Historically, we intentionally have not added this feature due to inherent volatility and security restrictions of Docker containers. The filesystem that users see in the sessions is a composition of multiple local/remote filesystems, such as the container image, vfolders, and scratch directories. For security, we don't expose the root user inside the container by default, so there is usually no way to modify the container filesystem provided by the image because all system packages and directories are owned by root.
To allow installation of additional packages using user-site paths in Python (pip install --user) and make it persistent across different compute sessions, we have added the following features:
98
99
auto-mounting dot-prefixed vfolders (such as ~/.config, ~/.local)
Nevertheless, many HPC/AI customers want to use the containers like VMs, and it is not easy to fill the conceptual gap between the volatile & hermetic nature of containers and the full ownership of volume data of VMs.
So, despite whatever additional cautions required when committing a Backend.AI compute session, let's make it technically available.
There are several customer requests for supporting "freezing" their current compute sessions.
Historically, we intentionally have not added this feature due to inherent volatility and security restrictions of Docker containers. The filesystem that users see in the sessions is a composition of multiple local/remote filesystems, such as the container image, vfolders, and scratch directories. For security, we don't expose the root user inside the container by default, so there is usually no way to modify the container filesystem provided by the image because all system packages and directories are owned by root.
To allow installation of additional packages using user-site paths in Python (
pip install --user
) and make it persistent across different compute sessions, we have added the following features:98
99
~/.config
,~/.local
)Nevertheless, many HPC/AI customers want to use the containers like VMs, and it is not easy to fill the conceptual gap between the volatile & hermetic nature of containers and the full ownership of volume data of VMs.
So, despite whatever additional cautions required when committing a Backend.AI compute session, let's make it technically available.