wired-mind opened 1 year ago
Aviary requires a Ray Cluster to run. You can set up an on-premise Ray Cluster (https://docs.ray.io/en/latest/cluster/vms/user-guides/launching-clusters/on-premises.html). Because Aviary uses Ray Custom Resources to ensure that each model is scheduled on an intended GPU type, you will need to set those in both the Ray cluster configuration and Aviary model yamls.
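To illustrate, the custom resource name has to match between the two files. A hypothetical sketch (the node type name, field values, and the exact `scaling_config` field names here are illustrative, not taken from this thread — check your actual Aviary model YAML for the real schema):

```yaml
# Ray cluster config: advertise the custom resource on a node type
available_node_types:
  gpu_worker:
    resources: {"accelerator_type_a100": 1}

# Aviary model YAML: request the same resource name in scaling_config
scaling_config:
  num_workers: 1
  resources_per_worker:
    accelerator_type_a100: 0.01
```

If the names differ, Ray will never find a node satisfying the model's resource request and the deployment will stay pending.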
You can edit the EC2 config to use on-prem instead with your desired node type.
Alternatively, if you just want to experiment, you can do the following:

1. Install Aviary with both extras:

   ```shell
   pip install -e ".[backend, frontend]"
   ```

2. Edit the `scaling_config` section in the model configuration and change the `accelerator_type_[TYPE]` entry to `accelerator_type_a100`.

3. Start a head node with the matching custom resource (the actual number of GPUs will be detected automatically):

   ```shell
   ray start --head --resources '{"accelerator_type_a100": 1}'
   ```

4. Run Aviary against the edited model configuration:

   ```shell
   aviary run --model model_yaml_with_edited_scaling_config.yaml
   ```
This will start a Ray cluster composed of just this single node.
Perfect, thank you. Got it all working, along with the frontend in a Docker container. One problem I encountered was that both the frontend and backend default to port 8000, so the frontend needed to be started like this: `serve run --host 0.0.0.0 --port 7860 aviary.frontend.app:app`
@Yard1 what do you think about making the frontend run on port 7860 by default to be consistent with normal Gradio and not cause this problem?
I think that's a good idea!
I would like to run a single on-premise machine, but I am not able to get the models to load: it keeps looking for actor/worker resource nodes that don't exist. Do you have an example config for a single on-premise machine?