turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

feat: frequency and presence penalty #241

Closed · AlpinDale closed this 9 months ago

AlpinDale commented 9 months ago

The implementation is based on the OpenAI specification.
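Per the OpenAI specification, a token's logit is reduced by its occurrence count times the frequency penalty, plus the presence penalty if the token has appeared at all. A minimal sketch of that rule (an illustration only, not the code in this PR; the function name and plain-list logits are assumptions):

```python
from collections import Counter

def apply_penalties(logits, generated_ids,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` (floats indexed by token id) with
    OpenAI-style frequency and presence penalties applied."""
    counts = Counter(generated_ids)
    out = list(logits)
    for token_id, count in counts.items():
        # frequency penalty scales with the count; presence penalty is
        # a flat subtraction for any token seen at least once
        out[token_id] -= count * frequency_penalty + presence_penalty
    return out

logits = [1.0, 2.0, 3.0, 4.0]
penalized = apply_penalties(logits, [1, 1, 3],
                            frequency_penalty=0.5, presence_penalty=0.2)
# token 1 appeared twice: 2.0 - (2*0.5 + 0.2)
# token 3 appeared once:  4.0 - (1*0.5 + 0.2)
```

In practice the same adjustment would be applied to the full logit tensor before sampling; unseen tokens are left untouched.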