Hi, I'm trying to test the new TensorRT-LLM support using the default format given by the repo. However, both the Llama 7B and Llama 70B examples fail with the error below.
(ServeController pid=910) WARNING 2024-01-22 10:33:24,218 controller 910 application_state.py:742 - Deploying app 'ray-llm' failed with exception:
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910) File "pydantic/main.py", line 522, in pydantic.main.BaseModel.parse_obj
(ServeController pid=910) ValueError: dictionary update sequence element #0 has length 1; 2 is required
(ServeController pid=910)
(ServeController pid=910) The above exception was the direct cause of the following exception:
(ServeController pid=910)
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/utils.py", line 65, in parse_args
(ServeController pid=910) parsed_models = [llm_app_cls.parse_yaml(raw_model)]
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/models.py", line 106, in parse_yaml
(ServeController pid=910) return cls.parse_obj(dict_args)
(ServeController pid=910) File "pydantic/main.py", line 525, in pydantic.main.BaseModel.parse_obj
(ServeController pid=910) pydantic.error_wrappers.ValidationError: 1 validation error for TRTLLMApp
(ServeController pid=910) __root__
(ServeController pid=910) TRTLLMApp expected dict not str (type=type_error)
(ServeController pid=910)
(ServeController pid=910) The above exception was the direct cause of the following exception:
(ServeController pid=910)
(ServeController pid=910) Traceback (most recent call last):
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/application_state.py", line 994, in build_serve_application
(ServeController pid=910) app = call_app_builder_with_args_if_necessary(import_attr(import_path), args)
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/api.py", line 348, in call_app_builder_with_args_if_necessary
(ServeController pid=910) app = builder(args)
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/run.py", line 194, in router_application
(ServeController pid=910) trtllm_apps = parse_args(router_args.trtllm_models, llm_app_cls=TRTLLMApp)
(ServeController pid=910) File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/utils.py", line 68, in parse_args
(ServeController pid=910) raise ValueError(
(ServeController pid=910) ValueError: Could not parse string as yaml. If you are specifying a path, make sure it exists and can be reached.
(ServeController pid=910)
(build_serve_application pid=588) [ip-10-0-179-57:00588] [[42342,1],0] ORTE_ERROR_LOG: Unreachable in file runtime/ompi_mpi_finalize.c at line 262
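For reference, the first ValueError in the chain ("dictionary update sequence element #0 has length 1; 2 is required") is what Python raises when a plain string is coerced with dict(), which suggests pydantic's parse_obj received the raw model string instead of a parsed mapping. A minimal sketch reproducing it (the model string here is hypothetical):

```python
# dict() over a string iterates single characters, which are not
# (key, value) pairs -- reproducing the first error in the traceback.
raw_model = "llama-7b"  # hypothetical: what parse_obj got instead of a dict
try:
    dict(raw_model)
except ValueError as e:
    print(e)  # dictionary update sequence element #0 has length 1; 2 is required
```

This matches the later message "Could not parse string as yaml", i.e. the config value never became a dict before reaching TRTLLMApp.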