vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Feature]: Support for Mirostat, Dynamic Temperature, and Quadratic Sampling #5209

Status: Open (opened by Emmie411 4 months ago)

Emmie411 commented 4 months ago

🚀 The feature, motivation and pitch

Would it be possible to add support for:

  1. Mirostat: Adaptive sampling that maintains a target perplexity by adjusting the sampling strategy at each step, keeping generation quality consistent.
  2. Dynamic Temperature: Adjusting the temperature during generation based on certain criteria, such as the context or user input, so the model can balance creativity and coherence.
  3. Quadratic Sampling: An alternative sampling method intended to improve the diversity and quality of outputs, giving more nuanced text generation.
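
For readers unfamiliar with these three samplers, here is a minimal, standard-library-only sketch of how they are commonly described in community implementations. It is not vLLM code and does not use any vLLM API. All function names, parameter names, and default values below are hypothetical and chosen for illustration. The entropy-based temperature rule and the parabola-shaped logit transform are assumptions about how "dynamic temperature" and "quadratic sampling" are usually meant, and the Mirostat step follows the v2 variant, using the pre-truncation probability to measure surprise.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, with optional temperature scaling."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_temperature(logits, min_temp=0.5, max_temp=1.5, exponent=1.0):
    """Pick a temperature from the normalized entropy of the distribution.

    Assumed rule: peaked (low-entropy) distributions get a temperature near
    min_temp, flat (high-entropy) ones get a temperature near max_temp.
    """
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    max_entropy = math.log(len(logits))
    normalized = entropy / max_entropy if max_entropy > 0.0 else 0.0
    normalized = min(1.0, max(0.0, normalized))  # guard against rounding
    return min_temp + (max_temp - min_temp) * normalized ** exponent


def quadratic_transform(logits, smoothing_factor=0.3):
    """Reshape logits with a downward parabola centered on the top logit.

    Assumed rule: new_logit = top - smoothing_factor * (logit - top) ** 2.
    """
    top = max(logits)
    return [top - smoothing_factor * (x - top) ** 2 for x in logits]


def mirostat_v2_step(logits, mu, tau=5.0, eta=0.1, rng=random):
    """One Mirostat-v2-style step: truncate by surprise, sample, update mu.

    tau is the target surprise in bits, eta is the learning rate, and mu is
    the running surprise threshold carried from one step to the next.
    """
    probs = softmax(logits)
    surprise = [-math.log2(p) if p > 0.0 else math.inf for p in probs]
    # Keep tokens whose surprise is below mu; always keep the most likely one.
    best = max(range(len(probs)), key=probs.__getitem__)
    keep = [i for i, s in enumerate(surprise) if s < mu or i == best]
    weights = [probs[i] for i in keep]
    token = rng.choices(keep, weights=weights, k=1)[0]
    # Feedback: shift mu against the gap between observed and target surprise.
    new_mu = mu - eta * (surprise[token] - tau)
    return token, new_mu
```

In a decoding loop, `mu` would be carried across steps (Mirostat implementations typically start it at `2 * tau`), while the other two functions are stateless and would run on the logits before the usual softmax and sampling. In vLLM, per-request hooks of this kind would most likely need to be expressed as logits processors, but that integration is outside the scope of this sketch.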

These features would enhance the model’s flexibility and output quality, making it more versatile and effective for a wider range of applications.

If these aren’t supported yet, is it possible to include them in the upcoming roadmap?

Alternatives

No response

Additional context

No response

dipatidar commented 4 months ago

I'd love to take this one. Any notes or hints would be appreciated.

khan-yin commented 3 months ago

Hi, I am a student who is also interested in this feature request. Could you point to some related work to follow? 😂