noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model
MIT License
1.01k stars 46 forks source link

Empty list cannot be closed after a newline #63

Closed AriX closed 5 months ago

AriX commented 5 months ago

Thanks for your awesome work on this project!

I'm seeing an issue with JSON like the following:

{
  "num" : 1,
  "list_of_strings" : [
  ]
}

In particular, if a newline is generated after the array's opening [, lm-format-enforcer will not allow the list to be closed with a ]. It appears that:

This can be verified by adding this test to test_jsonschemaparser.py:

def test_empty_list_with_newline():
    class EmptyListOKModel(BaseModel):
        num: int
        list_of_strings: Optional[List[str]] = Field(None, min_length=0, max_length=1)

    no_strings = '{"num":1,"list_of_strings":[\n]}'
    _test_json_schema_parsing_with_string(no_strings, EmptyListOKModel.model_json_schema(), True)

I'm not sure what the best solution here is, but some ideas I have are:

Any input on what solution would be most idiomatic would be greatly appreciated.

noamgat commented 5 months ago

Thanks for the report! I hope to fix this in the near future.

noamgat commented 5 months ago

Solved in v0.8.3