ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

No available node types can fulfill resource request defaultdict - error on local deployment #101

Open NikolayTV opened 8 months ago

NikolayTV commented 8 months ago

Hi, I did not find any instructions on how to add nodes. I am running it on a local machine.

What I do:

device5=GPU-1a51db55-37db-141f-5b83-950cc3f31d68
cache_dir=/home/workspace/ray-llm-master

docker run -it \
  --gpus device=$device5 \
  --cpus=20 \
  --memory=8g \
  --shm-size 1g \
  -p 8778:8000 \
  -e HF_HOME=/home/ray/data \
  -v $cache_dir:/home/ray/ray-llm-master \
  anyscale/ray-llm:latest bash

Inside the Docker container:

ray start --head
serve run ray-llm-master/serve_configs/amazon--LightGPT.yaml

matthiasmfr commented 8 months ago

Just start your Ray cluster with something like:

ray start --head --dashboard-host=0.0.0.0 --num-cpus 12 --num-gpus 1 --resources '{"accelerator_type_a40": 1, "accelerator_type_a100_40g": 1}'

This simulates the custom resources. They are just placeholders; you can still use other GPU types.
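If the quoting of the `--resources` JSON gets mangled when copying the command, one way around it is to generate the command line instead of typing it by hand. This is only a rough sketch: `build_ray_start_command` is a hypothetical helper made up for illustration, not part of Ray or RayLLM, and the resource names are just the placeholder ones from the command above.

```python
import json
import shlex


def build_ray_start_command(num_cpus, num_gpus, custom_resources):
    """Return a shell-safe `ray start --head` command line (hypothetical helper)."""
    # Serialize the custom resources as JSON, the format --resources expects.
    resources_json = json.dumps(custom_resources)
    args = [
        "ray", "start", "--head",
        "--dashboard-host=0.0.0.0",
        "--num-cpus", str(num_cpus),
        "--num-gpus", str(num_gpus),
        "--resources", resources_json,
    ]
    # shlex.quote wraps any argument containing shell-special characters
    # (here, the JSON string) in single quotes.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_ray_start_command(
    num_cpus=12,
    num_gpus=1,
    custom_resources={"accelerator_type_a40": 1, "accelerator_type_a100_40g": 1},
)
print(command)
```

The printed line can be pasted into the shell inside the container. Adjust the CPU and GPU counts and the resource names to whatever your model config actually asks for.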