scaleapi / llm-engine

Scale LLM Engine public repository
https://llm-engine.scale.com
Apache License 2.0
770 stars · 50 forks

Allow support for vllm batch with checkpoints #591

Closed dmchoiboi closed 1 month ago

dmchoiboi commented 1 month ago

Pull Request Summary

What is this PR changing? Why is this change being made? Any caveats you'd like to highlight? Link any relevant documents, links, or screenshots here if applicable.

Remove the requirement to add models to the model zoo in order to run batch inference via vLLM, as long as a checkpoint is provided.
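To illustrate what this change enables, here is a minimal sketch of a batch-completions request where the model is identified only by a checkpoint rather than a model-zoo entry. The field names (`model_config`, `checkpoint_path`, etc.) and the S3 paths are assumptions for illustration, not the exact llm-engine API:

```python
# Hypothetical batch-completions payload. With a checkpoint_path
# supplied, the model no longer needs to be registered in the model
# zoo before running vLLM batch inference. Field names and paths are
# illustrative assumptions, not the verbatim llm-engine schema.
payload = {
    "model_config": {
        "model": "my-finetuned-llama",  # not present in the model zoo
        "checkpoint_path": "s3://my-bucket/checkpoints/my-finetuned-llama/",
        "num_shards": 1,
    },
    "content": {
        "prompts": ["Hello, world"],
        "max_new_tokens": 64,
        "temperature": 0.0,
    },
    "output_data_path": "s3://my-bucket/batch-output.json",
}

# Previously, batch inference required "model" to resolve to a
# model-zoo entry; after this PR the checkpoint alone suffices.
assert payload["model_config"]["checkpoint_path"].startswith("s3://")
```

Under this sketch, the server would load weights directly from `checkpoint_path` instead of looking the model up in the zoo.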

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.