Motivation.
Changelog
I frequently find myself wondering what the difference is between the latest version(s) of vLLM and the version that I currently have deployed. It would be nice to keep a simple changelog that documents features, fixes, newly supported hardware, and updates between versions, so that we can easily see what has been added in recent versions: new CLI arguments, optimizations, quantization formats, updated hardware support for features (e.g. punica -> triton kernels, expanding hardware support for multi-LoRA serving), patched bugs, and so forth.
A great template for this is Keep a Changelog, used together with semver. It would be easy to implement with Markdown in the documentation site. I think this would make vLLM's newer features much more accessible, and it would also help identify gaps in the documentation whenever we add something to the changelog that is not yet on the docs site.
FAQs
There are a bunch of questions that get asked over and over in the Discord, including:
Does vLLM support XYZ hardware/accelerator?
Does vLLM support tool use / when is tool use coming?
Does vLLM support XYZ model / model architecture?
How can I get my model to fit in a vRAM-constrained environment?
How do I get started with distributed inference?
It would be nice to have a rolling list of FAQs to refer people to, and maybe to pin in the Discord, for quick reference. This could be tracked in version control in the docs site, so that we can easily make additions and deletions as necessary.
Proposed Change.
1. Create a simple changelog in the docs site following the Keep a Changelog structure.
   a. Ask PR contributors to update it along with the docs when they create a PR, OR update it as part of the release process.
2. Implement a simple FAQ in the docs site.
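To illustrate the changelog idea, here is a minimal sketch, assuming the changelog follows the Keep a Changelog layout (`## [X.Y.Z] - date` version headings with `### Added`, `### Fixed`, etc. beneath them). The sample text, its version numbers, and the `changes_since` helper are all hypothetical placeholders for illustration, not real vLLM release notes or tooling. It shows how someone could list everything that changed after the version they have deployed:

```python
import re

# Hypothetical changelog text in the Keep a Changelog layout. The version
# numbers, dates, and entries are placeholders, not real vLLM release notes.
SAMPLE_CHANGELOG = """\
# Changelog

## [Unreleased]

## [0.3.0] - 2024-03-01
### Added
- Example CLI argument.
### Fixed
- Example bug fix.

## [0.2.0] - 2024-02-01
### Added
- Example quantization format.

## [0.1.0] - 2024-01-01
### Added
- Initial release.
"""

# Matches release headings such as "## [0.3.0] - 2024-03-01".
VERSION_HEADING = re.compile(r"^## \[(\d+)\.(\d+)\.(\d+)\]")
# Matches category headings such as "### Added".
CATEGORY_HEADING = re.compile(r"^### (.+)$")


def parse_version(text):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def changes_since(changelog, deployed):
    """Return {version: {category: [entries]}} for releases newer than `deployed`."""
    deployed_version = parse_version(deployed)
    result = {}
    version = None   # release currently being read, as a tuple of ints
    category = None  # category currently being read, e.g. "Added"
    for line in changelog.splitlines():
        heading = VERSION_HEADING.match(line)
        if heading:
            version = tuple(int(group) for group in heading.groups())
            category = None
            continue
        if line.startswith("## "):
            # Any other second-level heading (e.g. "[Unreleased]") is skipped.
            version = None
            category = None
            continue
        category_match = CATEGORY_HEADING.match(line)
        if category_match:
            category = category_match.group(1).strip()
            continue
        is_entry = line.startswith("- ") and version is not None and category is not None
        if is_entry and version > deployed_version:
            key = ".".join(str(part) for part in version)
            result.setdefault(key, {}).setdefault(category, []).append(line[2:].strip())
    return result
```

For example, `changes_since(SAMPLE_CHANGELOG, "0.1.0")` returns the entries for 0.3.0 and 0.2.0 grouped by category. Pre-release tags are not handled in this sketch.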
Feedback Period.
1 week?
CC List.
@mgoin @simon-mo @WoosukKwon @petersalas @comaniac @SolitaryThinker @ywang96 @DarkLight1337
Unsure who else to ask; I just pulled a list of recent contributors to the docs.
Any Other Things.
No response