turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Fix a couple of filter bugs #354

Closed. seanlynch closed this 7 months ago

seanlynch commented 7 months ago

Ran into two bugs trying to use the Select filter:

  1. The sampler passes None as the prefix argument to begin() when there is no healed token. This raised an exception because Select expects the prefix to be a string.
  2. The generator ignored the end_filter value returned by the sampler, so it kept generating even after the Select filter had matched an end token. On the following call to next(), the Select filter then returned an empty pass_tokens set, which raised an exception.

This PR fixes the first bug by replacing a None prefix with the empty string. It fixes the second by using the end_filter value returned by the sampler as the initial value of the eos variable in the generator, which tells the generator that it has reached the end of the sequence.
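For readers who want to see the shape of the two fixes, here is a minimal, self-contained sketch. It does not use the exllamav2 API: `ToySelectFilter`, `toy_sample`, and `toy_generate` are hypothetical stand-ins written for this illustration, and each character is treated as one token. Only the names `begin`, `next`, `pass_tokens`, `end_filter`, and `eos` are taken from the description above.

```python
# Toy stand-ins only: none of these names are the exllamav2 API.

class ToySelectFilter:
    """Constrains output to one of a fixed set of options (one char = one token)."""

    def __init__(self, options):
        self.options = list(options)
        self.prefix = ""

    def begin(self, prefix):
        # str.startswith(None) raises TypeError, which mirrors bug 1:
        # the filter expects a string prefix.
        self.options = [o for o in self.options if o.startswith(prefix)]
        self.prefix = prefix

    def feed(self, token):
        self.prefix += token

    def next(self):
        """Return (pass_tokens, end_tokens) for the current prefix."""
        pass_tokens, end_tokens = set(), set()
        n = len(self.prefix)
        for option in self.options:
            if option.startswith(self.prefix) and len(option) > n:
                token = option[n]
                pass_tokens.add(token)
                if len(option) == n + 1:
                    end_tokens.add(token)  # this token completes an option
        if not pass_tokens:
            # Mirrors bug 2: asking for more tokens after an option is complete.
            raise ValueError("empty pass_tokens set")
        return pass_tokens, end_tokens


def toy_sample(filt):
    """Pick an allowed token and report whether it ends the filter."""
    pass_tokens, end_tokens = filt.next()
    token = min(pass_tokens)  # deterministic choice, for the example only
    filt.feed(token)
    return token, token in end_tokens


def toy_generate(filt, healed_prefix=None, max_new_tokens=8):
    # Fix 1: fall back to the empty string when there is no healed token.
    filt.begin(healed_prefix if healed_prefix is not None else "")
    out = ""
    for _ in range(max_new_tokens):
        token, end_filter = toy_sample(filt)
        out += token
        # Fix 2: end_filter is the starting value of eos. A real generator
        # would also set eos when the model's own EOS token is sampled.
        eos = end_filter
        if eos:
            break
    return out
```

With the options "yes" and "no", `toy_generate(ToySelectFilter(["yes", "no"]))` returns "no" and stops there. Without the second fix, the loop would call `next()` once more and hit the empty `pass_tokens` case.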

Only tested with Select and generate_simple. The Prefix filter never sets end tokens, so the main thing that might break is lm-format-enforcer if it's using end_tokens incorrectly.

turboderp commented 7 months ago

Thanks. :+1: