ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

How to configure for a low-spec machine? #24

Open okyx opened 1 year ago

okyx commented 1 year ago

The log keeps saying "has 1 replicas that have taken more than 30s to initialize".

okyx commented 1 year ago

has 1 replicas that have taken more than 30s to be scheduled. This may be caused by waiting for the cluster to auto-scale, or waiting for a runtime environment to install. Resources required for each replica: {"accelerator_type_cpu": 0.01, "CPU": 1}, resources available: {"CPU": 15.0}

Yard1 commented 1 year ago

Aviary uses Ray custom resources (e.g., accelerator_type_cpu) for scheduling. It appears that the cluster you are running Aviary on doesn't have them. You can either declare them on your cluster so they are visible to the scheduler, or remove them from the model configuration YAMLs.
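To illustrate why the replica in the log above never gets scheduled, here is a minimal sketch in plain Python. It does not call Ray; `unsatisfied_resources` is a hypothetical helper that only compares the "required" and "available" dictionaries copied from the log message.

```python
def unsatisfied_resources(required, available):
    """Return the resources a replica asks for that the cluster cannot supply.

    Hypothetical helper for illustration only; it is not part of Ray or Aviary.
    """
    shortfall = {}
    for name, amount in required.items():
        have = available.get(name, 0.0)  # a resource the cluster never declared counts as 0
        if have < amount:
            shortfall[name] = amount - have
    return shortfall


# Values copied from the log message earlier in this thread.
required = {"accelerator_type_cpu": 0.01, "CPU": 1}
available = {"CPU": 15.0}

print(unsatisfied_resources(required, available))
# → {'accelerator_type_cpu': 0.01}
```

The CPU request is satisfiable; only the custom resource is missing. If you would rather make it visible than remove it, Ray lets you declare custom resources when starting a node, for example `ray start --head --resources='{"accelerator_type_cpu": 1}'` (the amount 1 here is an assumption; pick whatever fits your setup).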

okyx commented 1 year ago

Thanks for the answer, @Yard1. I tried reconfiguring, and now the log has changed to this: amazon--LightGPT_amazon--LightGPT has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow init or reconfigure method

okyx commented 1 year ago

Is it okay if my cluster doesn't have a GPU?

Yard1 commented 1 year ago

Most of the models require GPUs. You can try the llama.cpp backend on CPU instead. See https://github.com/ray-project/aviary/blob/master/models/static_batching/eachadea--ggml-vicuna-13b-1.1.yaml for an example. Make sure to remove the custom resources from it (there are two instances of accelerator_type_cpu).
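As a rough sketch of the "remove custom resources" step, the snippet below strips every accelerator_type_cpu key from a nested config once it has been loaded into a Python dict. The layout of `example_config` is an assumption made for illustration and may not match the real YAML; `strip_custom_resources` is a hypothetical helper, not an Aviary function.

```python
def strip_custom_resources(config, names=("accelerator_type_cpu",)):
    """Return a copy of a nested config with the named resource keys removed.

    Hypothetical helper for illustration only; not part of Ray or Aviary.
    """
    if isinstance(config, dict):
        return {
            key: strip_custom_resources(value, names)
            for key, value in config.items()
            if key not in names
        }
    if isinstance(config, list):
        return [strip_custom_resources(item, names) for item in config]
    return config  # scalars (str, int, float, None) are returned unchanged


# Assumed layout: two places where the custom resource might be requested.
example_config = {
    "deployment_config": {
        "ray_actor_options": {"resources": {"accelerator_type_cpu": 0.01}},
    },
    "scaling_config": {
        "num_workers": 1,
        "resources_per_worker": {"accelerator_type_cpu": 0.01},
    },
}

cleaned = strip_custom_resources(example_config)
print(cleaned["scaling_config"])
```

The original dict is left untouched, so you can compare the two before writing the cleaned version back to YAML with whatever YAML library you already use. Editing the two entries by hand in the file works just as well.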