ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

[doc] Cannot deploy an LLM model on EKS with KubeRay #80

Open enori opened 10 months ago

enori commented 10 months ago

I deployed the aviary head/worker pods on an EKS cluster using KubeRay and then tried to deploy an LLM model with the following command.

serve run serve/meta-llama--Llama-2-7b-chat-hf.yaml

However, I couldn't deploy it.

I think the problem is a compatibility issue between Python packages. Is there a requirements.txt with the appropriate package versions pinned (e.g. for pydantic)?
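
In case it helps, the versions actually installed in the pod can be listed like this (pydantic and ray are just the packages that looked most relevant to the traceback below):

pip freeze | grep -iE '^(ray|pydantic)'   # installed versions of ray and pydantic
pip show pydantic                         # full metadata for a single package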

The following are the commands I ran and their output.

$ kubectl exec -it aviary-head-vjlb4 -- bash

(base) ray@aviary-head-vjlb4:~$ pwd
/home/ray

(base) ray@aviary-head-vjlb4:~$ export HUGGING_FACE_HUB_TOKEN=${MY_HUGGING_FACE_HUB_TOKEN}

(base) ray@aviary-head-vjlb4:~$ serve run serve/meta-llama--Llama-2-7b-chat-hf.yaml
2023-10-27 00:34:01,394 INFO scripts.py:471 -- Running import path: 'serve/meta-llama--Llama-2-7b-chat-hf.yaml'.
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/serve", line 8, in <module>
    sys.exit(cli())
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 473, in run
    import_attr(import_path), args_dict
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/utils.py", line 1378, in import_attr
    module = importlib.import_module(module_name)
  File "/home/ray/anaconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'serve/meta-llama--Llama-2-7b-chat-hf'
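
I suspect serve run fell back to treating the argument as a Python import path because there is no file at serve/meta-llama--Llama-2-7b-chat-hf.yaml relative to /home/ray. A quick way to check which YAML config files actually exist in the image:

ls serve/ 2>/dev/null || echo "no serve/ directory here"
find . -maxdepth 3 -name '*.yaml' | sort | head -n 20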

(base) ray@aviary-head-vjlb4:~$ serve run models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml
2023-10-27 00:47:36,307 INFO scripts.py:418 -- Running config file: 'models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml'.
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/serve", line 8, in <module>
    sys.exit(cli())
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 462, in run
    raise v1_err from None
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 449, in run
    config = ServeApplicationSchema.parse_obj(config_dict)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for ServeApplicationSchema
import_path
  field required (type=value_error.missing)
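
Looking at that file, it appears to be a RayLLM model config (engine and scaling settings) rather than a Ray Serve application config, which would explain why the required import_path field is missing:

head -n 20 models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml   # model settings only
grep -n 'import_path' models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml || echo "no import_path field in this file"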

Thank you for checking.

akshay-anyscale commented 10 months ago

Try using serve run serve_configs/meta-llama--Llama-2-7b-chat-hf.yaml. I'll fix the docs to reflect that.

akshay-anyscale commented 10 months ago

Docs fixed here https://github.com/ray-project/ray-llm/pull/85

enori commented 10 months ago

Thank you! I used anyscale/ray-llm as the Docker image, as mentioned in another issue.

The result has changed: the Ray Serve status is now DEPLOY_FAILED, and the message is The deployments ['VLLMDeployment:meta-llama--Llama-2-7b-chat-hf'] are UNHEALTHY.
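
For anyone debugging the same state, the failure details can be pulled from inside the head pod like this (the log path is the default Ray session directory and may differ in other setups):

serve status                             # per-deployment status and error messages
ls /tmp/ray/session_latest/logs/serve/   # Ray Serve controller and replica logs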

Moreover, the container I deployed does not have a serve_configs directory, so I had to create one myself. I think it would be better to either include the serve_configs directory in the Docker image or to mention in the documentation that the serve_configs files need to be copied into the container; a sketch of one way to copy them in is below.
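
One way to get the configs into a running head pod (a sketch; adjust paths if the repo layout has changed):

git clone https://github.com/ray-project/ray-llm.git /tmp/ray-llm   # inside the pod
cp -r /tmp/ray-llm/serve_configs ~/serve_configs

or, from outside the cluster:

kubectl cp ./serve_configs aviary-head-vjlb4:/home/ray/serve_configs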

I will also submit the corresponding documentation changes later.