Closed relyt0925 closed 1 month ago
Currently, the ilab wrapper script at https://github.com/containers/ai-lab-recipes/blob/main/training/ilab-wrapper/ilab does not set the container's shm size to 10 GB. The requirements for running DeepSpeed and vLLM note that a shm size of 10 GB is necessary for full-scale model inferencing and training.
A related issue: https://github.com/vllm-project/vllm/issues/1710. Note also that the earlier scripts, which launched vLLM directly, used a shm size of 10 GB.
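A minimal sketch of the proposed fix, assuming the wrapper invokes `podman run` (the exact flags besides `--shm-size` are illustrative, not copied from the actual script):

```shell
#!/bin/bash
# Hypothetical excerpt of the ilab wrapper's container launch.
# The key change is adding --shm-size: DeepSpeed and vLLM need
# at least 10 GB of /dev/shm for full-scale inferencing/training,
# while podman's default shm size is only 64 MB.
podman run --rm -it \
    --shm-size 10G \
    "${IMAGE_NAME}" \
    ilab "$@"
```

The same `--shm-size 10G` flag works for `docker run` as well, should the wrapper be used with Docker instead of Podman.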