Howdy,

I am testing the `anyscale/ray-llm` Docker container on a host with four A100 GPUs.

When I try to deploy the CodeLlama model (`models/continuous_batching/codellama--CodeLlama-34b-Instruct-hf.yaml`), it keeps complaining:

```
Error: No available node types can fulfill resource request defaultdict(<class 'float'>, {'accelerator_type_a100_80g': 0.02, 'CPU': 9.0, 'GPU': 1.0}). Add suitable node types to this cluster to resolve this issue.
```

When checking `ray status` I do see that the four GPUs are detected, but I don't see any accelerator resource. Is this the problem? CUDA and `nvidia-smi` correctly show the cards within the container.

The container is started as described in your README:

```
docker run -it --gpus all --shm-size 1g -p 8000:8000 -e HF_HOME=~/data -v $cache_dir:~/data anyscale/ray-llm:latest bash
```
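For reference, one workaround I considered (assuming the scheduler matches the `accelerator_type_a100_80g` key from the error message against Ray custom resources, and that `4` is the right count for this host) is to register the custom resource manually when starting Ray inside the container:

```shell
# Inside the container: start a local Ray head node and advertise the
# custom accelerator resource named in the error message.
# "accelerator_type_a100_80g": 4 is an assumption matching the four A100s.
ray start --head \
  --num-gpus=4 \
  --resources='{"accelerator_type_a100_80g": 4}'
```

If that is the right knob, `ray status` should then list `accelerator_type_a100_80g` alongside `CPU` and `GPU` in the cluster resources, but I'm not sure whether this is the intended way to satisfy the request.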