Closed AshinWu closed 1 month ago
To address this issue: the environment variables are lost because SSH login shells start with a fresh environment rather than the container's. To resolve this, append the following line to `/etc/profile`:

`export $(cat /proc/1/environ | tr '\0' '\n' | xargs)`

This reads the environment of process 1 (the container's entrypoint) and exports it into the current shell. Since SSH login shells normally source `/etc/profile`, the variables are restored on every SSH connection; if your shell does not do this automatically, add `source /etc/profile` to its startup file.
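The pipeline above can be sketched on a sample file instead of the real `/proc/1/environ` (the file path and variable values below are made up for illustration; `CUDA_DEVICE_MEMORY_LIMIT_0` is the limit variable discussed in this issue):

```shell
#!/bin/sh
# Simulate /proc/1/environ: NUL-separated KEY=VALUE pairs.
printf 'CUDA_DEVICE_MEMORY_LIMIT_0=1024m\0DEMO_VAR=hello\0' > /tmp/environ_demo

# Same pipeline as the /etc/profile fix, applied to the demo file:
# tr turns NUL separators into newlines, xargs joins the pairs into
# one line, and export sets each KEY=VALUE in the current shell.
export $(cat /tmp/environ_demo | tr '\0' '\n' | xargs)

echo "$CUDA_DEVICE_MEMORY_LIMIT_0"   # prints: 1024m
```

Note that this simple form breaks if any value contains spaces or shell metacharacters; it is fine for plain `KEY=value` pairs like the vGPU limit variables.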
For example, when I set the limit for my pod with `volcano.sh/vgpu-memory: '1024'` and then enter the container using `kubectl exec`, running `nvidia-smi` shows that the GPU memory is indeed 1024 MiB, and the environment variable `CUDA_DEVICE_MEMORY_LIMIT_0=1024m` is set correctly.

However, when I connect to the container via SSH and run `nvidia-smi`, the GPU memory shows the full capacity and is not limited by `vgpu-memory: '1024'`. There is also no `CUDA_DEVICE_MEMORY_LIMIT_0` variable in the environment. What is going on?