intel / llm-on-ray

Pretrain, finetune and serve LLMs on Intel platforms with Ray

Integrate vLLM and inference engine (Neural Speed) #231

Closed: jiafuzha closed this 4 months ago

jiafuzha commented 6 months ago

Please help review
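
For context, a minimal sketch of what serving a model through vLLM behind a Ray Serve deployment can look like. This is only an illustration of the kind of backend this PR integrates; the actual llm-on-ray wiring, class names, and config keys may differ, and the model id here is just a placeholder.

```python
# Hypothetical sketch: expose a vLLM-backed predictor as a Ray Serve deployment.
# Not the PR's actual implementation; names and defaults are assumptions.
from ray import serve
from vllm import LLM, SamplingParams


@serve.deployment
class VllmPredictor:
    def __init__(self, model_id: str = "facebook/opt-125m"):
        # vLLM handles batching and paged KV-cache management internally.
        self.llm = LLM(model=model_id)
        self.sampling = SamplingParams(temperature=0.7, max_tokens=128)

    async def __call__(self, request) -> str:
        # Ray Serve passes a Starlette request; expect a JSON body like {"prompt": "..."}.
        prompt = (await request.json())["prompt"]
        outputs = self.llm.generate([prompt], self.sampling)
        return outputs[0].outputs[0].text


app = VllmPredictor.bind()
# serve.run(app)  # then POST {"prompt": "..."} to http://127.0.0.1:8000/
```

A Neural Speed backend would plug in at the same point, swapping the vLLM engine for the Neural Speed runtime while keeping the Serve-facing interface unchanged.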

jiafuzha commented 4 months ago

Will create a new PR and close this one.