noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model
MIT License
994 stars 45 forks

fix: check that last_non_whitespace_character is not an empty string #105

Closed (NJordan72 closed this 1 month ago)

NJordan72 commented 1 month ago

Fixes the problem where JSON Schema output could end up with leading commas. The problem was that last_non_whitespace_character is set to an empty string until a non-whitespace character has been seen, but the check that increments num_items didn't handle that case.

Dealing with the empty string is a bit unsatisfying, but I didn't feel comfortable with all of the implications of refactoring it, so I only added the check.

Resolves #99
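
To illustrate the kind of failure described above, here is a minimal sketch in Python. It is not the library's parser: count_items and its guard_empty flag are hypothetical names made up for this example, and the membership test against "[," is an assumption about how such a check might look. What it does show for certain is that `"" in "[,"` evaluates to True in Python, so an empty-string sentinel can slip through a substring-style check unless it is excluded explicitly.

```python
def count_items(text: str, guard_empty: bool = True) -> int:
    """Toy item counter for a flat JSON-like list (hypothetical, not the library's code)."""
    num_items = 0
    # Stays empty until the first non-whitespace character is seen.
    last_non_whitespace_character = ""
    for ch in text:
        if ch.isspace():
            continue
        # A value starts on any character that is not a separator or closer.
        starts_value = ch not in ",]"
        # Assumed check: a new item begins right after '[' or ','.
        # Note that `"" in "[,"` is True, so the empty sentinel passes this test.
        after_separator = last_non_whitespace_character in "[,"
        if guard_empty and last_non_whitespace_character == "":
            # Explicitly reject the "nothing seen yet" case.
            after_separator = False
        if starts_value and after_separator:
            num_items += 1
        last_non_whitespace_character = ch
    return num_items


print(count_items("[1, 2, 3]"))                     # guarded count
print(count_items("[1, 2, 3]", guard_empty=False))  # unguarded count is off by one
```

In this toy, the unguarded version counts the opening bracket as an item, which is the sort of off-by-one that could make a parser believe a comma is allowed before any item has been produced.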

noamgat commented 1 month ago

Thank you for finding this! last_non_whitespace_character should never be an empty string once parsing has encountered a non-whitespace character, so I solved it in the correct way: https://github.com/noamgat/lm-format-enforcer/commit/5461294f5cb6ee5eae677c75358f4b83353ee4db