deepjavalibrary / djl-serving

A universal scalable machine learning model deployment solution
Apache License 2.0

Refactor lmi_dist and vllm to support best_of with RequestOutput #2011

Closed by sindhuvahinis 1 month ago

sindhuvahinis commented 1 month ago

Description

Brief description of what this PR is about

sindhuvahinis commented 1 month ago

I will run the LMI integration tests before merging, since this change affects LMI, VLLM, and NeuronVLLM.

sindhuvahinis commented 1 month ago

Ran the integration tests: https://github.com/deepjavalibrary/djl-serving/actions/runs/9407303778. Gemma failed there, but it works when I run it on my EC2 machine. All models other than Gemma passed. https://github.com/deepjavalibrary/djl-serving/actions/runs/9409733042/job/25920234322