Open MoritzLaurer opened 1 month ago
Did some more testing and can confirm that converting the JSON schema to a regex before passing it to TGI seems to solve the ordering issue:
# convert JSON schema to regex
import json
from outlines.fsm.json_schema import build_regex_from_schema
schema_string = json.dumps(OutputSchema.schema(), indent=2)
schema_regex = build_regex_from_schema(schema_string)
generation_params = dict(
    # add grammar / JSON schema
    grammar={
        "type": "regex",  # instead of "json"
        "value": schema_regex,  # instead of OutputSchema.schema()
    },
    # these parameters further help guide the token generation process
    top_p=0.80,
    top_k=None,
    temperature=0.6,
    repetition_penalty=2.0,  # helps keep the model from getting stuck generating the same token
    do_sample=True,
    max_new_tokens=512,
    return_full_text=False,
    seed=42,
    max_time=None,
    stream=False,
    details=False,
    use_cache=False,
    wait_for_model=False,
)
...
# endpoint API produces output in correct ordering
[{'generated_text': '{"reasoning":"In order to confirm whether there\'s any indication of diabetic conditions in these two documents from KENT BROWN & WILLIAMSON TOBACCO CORP., I would need more specific information about their content and context.", \n "contains_diagnosis_diabetes":"No"}'}]
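For anyone who wants to check the ordering programmatically rather than by eye, here is a minimal sketch using only the standard library. `key_order` is a hypothetical helper name, and the sample response is a shortened stand-in for the output above, not a real endpoint response.

```python
import json


def key_order(generated_text: str) -> list:
    """Return the top-level JSON keys in the order they appear in the text."""
    # json.loads builds a regular dict, which keeps insertion order (Python 3.7+),
    # so the key order reflects the order in the generated string.
    return list(json.loads(generated_text).keys())


# Shortened sample in the same shape as the endpoint output (assumption for illustration)
response = [
    {
        "generated_text": '{"reasoning":"No mention of diabetes.", \n "contains_diagnosis_diabetes":"No"}'
    }
]

expected = ["reasoning", "contains_diagnosis_diabetes"]
actual = key_order(response[0]["generated_text"])
print("ordering ok" if actual == expected else f"unexpected ordering: {actual}")
```

Running this over a batch of responses would be a quick way to see whether the ordering holds consistently.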
Nice stuff @MoritzLaurer! Hope you're doing great man!
System Info
Tests were run via dedicated endpoints with Idefics2. The TGI version was probably 2.0.2.
Reproduction
The following prompt with grammar returns JSON with keys in a different order than the Pydantic schema. The correct ordering is important for chain-of-thought prompts to work properly.
See this internal discussion for context. The issue seems to come from serialization/deserialization steps throughout the pipeline, which don't enforce ordering.
Reproduce with an idefics2-8b-chatty model on a dedicated endpoint and a grammar:
The output provides contains_diagnosis_diabetes first and then the reasoning, which makes CoT useless.
Expected behavior
The grammar should enforce the exact ordering of the JSON keys/Pydantic arguments.
@drbh explained: "Regarding TGI we do have a couple serialization/deserialization before the value is converted to a fsm, so it's likely in one of those steps the order is not preserved. JSON by design doesn't guarantee ordering but python dictionaries do preserve order, so if you avoid serializing to JSON the ordering is preserved. Unfortunately we rely on sending the grammar as JSON over HTTP and internally over gRPC, therefore ordering is not guaranteed. In order to ensure the ordering we'd need to capture the output regex that to_regex produces and send that as the grammar, or some other regex grammar."
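To make the explanation above concrete, here is a small simulation of the effect. It is not TGI's actual pipeline: `lossy_hop` is a hypothetical stand-in that sorts keys to imitate a serialization step that does not preserve order, the schema is a hand-written guess at the shape of the one in this issue, and the regex is a simplified illustration rather than what outlines generates.

```python
import json

# Hypothetical schema with "reasoning" declared first, as in the issue
schema = {
    "type": "object",
    "properties": {
        "reasoning": {"type": "string"},
        "contains_diagnosis_diabetes": {"type": "string"},
    },
    "required": ["reasoning", "contains_diagnosis_diabetes"],
}


def lossy_hop(payload: str) -> str:
    """Imitate a hop that re-serializes JSON without keeping key order.

    Sorting the keys is an assumption made purely for illustration.
    """
    return json.dumps(json.loads(payload), sort_keys=True)


def property_order(payload: str) -> list:
    """Return the property names of a serialized schema in document order."""
    return list(json.loads(payload)["properties"])


original = json.dumps(schema)
after_hop = lossy_hop(original)
print(property_order(original))   # order as declared
print(property_order(after_hop))  # order after the simulated hop

# A regex grammar travels as a single string value, so there are no keys to reorder.
regex_grammar = r'\{"reasoning":"[^"]*","contains_diagnosis_diabetes":"(Yes|No)"\}'
wrapped = json.dumps({"type": "regex", "value": regex_grammar})
round_tripped = json.loads(lossy_hop(wrapped))["value"]
print(round_tripped == regex_grammar)
```

The point of the sketch is only that the key order of a JSON object can change between hops, while a string value passes through unchanged, which is why sending the regex sidesteps the problem.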
This issue is not urgent, but it would be good to have a solution in the medium term.