ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

Error when trying to run tensorrt model on ray #124

Closed rifkybujana closed 8 months ago

rifkybujana commented 8 months ago

Hi, I'm trying to test the new TensorRT-LLM support using the default config format given by the repo. However, both the Llama 7B and Llama 70B examples fail with this error:

(ServeController pid=910) WARNING 2024-01-22 10:33:24,218 controller 910 application_state.py:742 - Deploying app 'ray-llm' failed with exception:
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910)   File "pydantic/main.py", line 522, in pydantic.main.BaseModel.parse_obj
(ServeController pid=910) ValueError: dictionary update sequence element #0 has length 1; 2 is required
(ServeController pid=910) 
(ServeController pid=910) The above exception was the direct cause of the following exception:
(ServeController pid=910) 
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/utils.py", line 65, in parse_args
(ServeController pid=910)     parsed_models = [llm_app_cls.parse_yaml(raw_model)]
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/models.py", line 106, in parse_yaml
(ServeController pid=910)     return cls.parse_obj(dict_args)
(ServeController pid=910)   File "pydantic/main.py", line 525, in pydantic.main.BaseModel.parse_obj
(ServeController pid=910) pydantic.error_wrappers.ValidationError: 1 validation error for TRTLLMApp
(ServeController pid=910) __root__
(ServeController pid=910)   TRTLLMApp expected dict not str (type=type_error)
(ServeController pid=910) 
(ServeController pid=910) The above exception was the direct cause of the following exception:
(ServeController pid=910) 
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/application_state.py", line 994, in build_serve_application
(ServeController pid=910)     app = call_app_builder_with_args_if_necessary(import_attr(import_path), args)
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/api.py", line 348, in call_app_builder_with_args_if_necessary
(ServeController pid=910)     app = builder(args)
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/run.py", line 194, in router_application
(ServeController pid=910)     trtllm_apps = parse_args(router_args.trtllm_models, llm_app_cls=TRTLLMApp)
(ServeController pid=910)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/utils.py", line 68, in parse_args
(ServeController pid=910)     raise ValueError(
(ServeController pid=910) ValueError: Could not parse string as yaml. If you are specifying a path, make sure it exists and can be reached.
(ServeController pid=910) 
(build_serve_application pid=588) [ip-10-0-179-57:00588] [[42342,1],0] ORTE_ERROR_LOG: Unreachable in file runtime/ompi_mpi_finalize.c at line 262
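
For context on the failure: per the final error message, parse_args tries to interpret its argument either as a path or as inline YAML. If the string is not a reachable file, yaml.safe_load returns it unchanged as a bare scalar, and pydantic's parse_obj then produces exactly the two errors in the log above. A minimal sketch of that failure mode, assuming pydantic v1 (TRTLLMAppSketch and the path are hypothetical stand-ins, not rayllm's actual definitions):

import yaml
from pydantic import BaseModel  # pydantic v1 semantics assumed

class TRTLLMAppSketch(BaseModel):
    # Hypothetical stand-in for rayllm's TRTLLMApp.
    model_id: str

# A bare path string parses as a YAML scalar (a str), not a mapping.
raw = "models/trtllm/llama-7b.yaml"  # hypothetical path, for illustration
parsed = yaml.safe_load(raw)
print(type(parsed))  # <class 'str'>

try:
    TRTLLMAppSketch.parse_obj(parsed)
except Exception as e:
    # pydantic v1 first attempts dict("models/..."), which raises
    # "dictionary update sequence element #0 has length 1; 2 is required",
    # then wraps it as a ValidationError: "expected dict not str".
    print(e)
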
rifkybujana commented 8 months ago

My bad, I should have used version 0.5.0 of the Docker image instead of the latest.
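
For anyone else hitting this: pinning the image version rather than pulling :latest resolves it. A hedged example, assuming the image is published under the repo's usual anyscale/ray-llm name (the exact tag string is an assumption):

docker pull anyscale/ray-llm:0.5.0  # pin the 0.5.0 release instead of :latest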