turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.53k stars 273 forks

Fix tuple returns with the streaming generator #331

Closed: bdashore3 closed this 7 months ago

bdashore3 commented 7 months ago

Rather than appending conditionally returned values to the result tuple, this adds fallbacks for them. That makes the API much easier to manage when working with conditionally returned values.
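To illustrate the difference, here is a minimal sketch of the two return styles. The function names, parameters, and the `token_ids` value are hypothetical and chosen only for illustration; this is not the actual exllamav2 streaming generator code.

```python
from typing import List, Optional, Tuple


# Hypothetical stand-ins for one streaming step; not the exllamav2 API.
def stream_appending(chunk: str, eos: bool,
                     token_ids: Optional[List[int]] = None,
                     return_tokens: bool = False) -> tuple:
    """Appending style: the tuple grows only when an optional value is requested."""
    result = [chunk, eos]
    if return_tokens:
        result.append(token_ids)
    return tuple(result)


def stream_with_fallbacks(chunk: str, eos: bool,
                          token_ids: Optional[List[int]] = None,
                          return_tokens: bool = False
                          ) -> Tuple[str, bool, Optional[List[int]]]:
    """Fallback style: every slot is always present; unrequested values are None."""
    return chunk, eos, (token_ids if return_tokens else None)


# With appending, the caller must know the flags to unpack safely.
short = stream_appending("Hello", False)
long = stream_appending("Hello", False, [15496], return_tokens=True)

# With fallbacks, the same unpacking works regardless of the flags.
chunk, eos, tokens = stream_with_fallbacks("Hello", False)
```

With the fallback style, callers can always unpack the same number of values and check for `None`, instead of branching on the tuple length.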