noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model
MIT License

[Bug] exllamav2 integration outputs eos_token_id nonstop until max_new_token for json schema #134

Open Dan-wanna-M opened 2 months ago

Dan-wanna-M commented 2 months ago

example output: ' {\n"street": "Maple Avenue",\n"city": "Boston",\n"state": "Massachusetts",\n"country": "USA",\n"postal_code": "02139"\n} \n\n \r\n\n \r<|end_of_text|><|end_of_text|><|end_of_text|><|end_of_text|><|end_of_text|>'

To reproduce, pass decode_special_tokens=True to generator.generate(). The repeated EOS tokens show up with some JSON schemas, not all of them.
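To make the symptom easier to measure, here is a minimal sketch that counts the trailing EOS markers in a decoded string and checks that the text before them is valid JSON. It only post-processes the example output quoted above. It does not call exllamav2 or lm-format-enforcer, and `split_trailing_eos` is a hypothetical helper name, not part of either library.

```python
import json

# EOS marker exactly as it appears in the example output from this report.
EOS = "<|end_of_text|>"


def split_trailing_eos(text: str, eos: str = EOS):
    """Return (body, count): text with trailing eos markers removed, and how many were removed."""
    count = 0
    body = text
    while body.endswith(eos):
        body = body[: -len(eos)]
        count += 1
    return body, count


# The example output from the report, copied verbatim.
output = (
    ' {\n"street": "Maple Avenue",\n"city": "Boston",\n"state": "Massachusetts",'
    '\n"country": "USA",\n"postal_code": "02139"\n} \n\n \r\n\n \r'
    + EOS * 5
)

body, eos_count = split_trailing_eos(output)
parsed = json.loads(body)  # leading/trailing JSON whitespace is accepted

print(f"trailing EOS markers: {eos_count}")
print(f"parsed keys: {sorted(parsed)}")
```

Here the JSON object itself is complete and valid. What follows the closing brace is extra whitespace and repeated EOS markers, which fits the report that generation continues until the token limit.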

noamgat commented 2 months ago

Can you submit a full notebook that reproduces the issue?