intel / llm-on-ray

Pretrain, finetune and serve LLMs on Intel platforms with Ray
Apache License 2.0

[Inference] Integrate vllm example #262

Closed: KepingYan closed this 3 months ago

carsonwang commented 4 months ago

Thanks for the work! Could you also update all the model YAMLs so they use vLLM by default, unless the model is not supported by vLLM? Please also remove the IPEX- and DeepSpeed-related configs from the YAMLs and disable them by default.
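For context, the requested change would look roughly like the sketch below in a model YAML. This is a minimal illustration only; the field names (`vllm.enabled`, `ipex`, `deepspeed`) and values are assumptions based on this comment, not the repo's confirmed schema.

```yaml
# Hypothetical model YAML after the requested change.
# Field names are illustrative assumptions, not the confirmed schema.
port: 8000
name: llama-2-7b-chat-hf
route_prefix: /llama-2-7b-chat-hf
cpus_per_worker: 24
gpus_per_worker: 0
vllm:
  enabled: true   # serve with vLLM by default where the model is supported
# ipex: and deepspeed: sections removed from the YAML;
# both would default to disabled in code
model_description:
  model_id_or_path: meta-llama/Llama-2-7b-chat-hf
  tokenizer_name_or_path: meta-llama/Llama-2-7b-chat-hf
```

Keeping vLLM as the single explicit toggle, with IPEX and DeepSpeed falling back to disabled defaults in code, would shrink each model YAML to only the settings that differ per model.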

KepingYan commented 3 months ago

Gentle ping @xwu99: all review comments have been resolved.