elonen closed this issue 6 months ago
Yes, length-constrained strings cannot take advantage of the "json freetext token" shortcut, so they fall back to the full (slower) mode. It would be possible to add length-specific caching to make this faster. I'll leave this issue open so people can vote on it and we can see how much demand there is.
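For anyone wondering what "length-specific caching" could look like, here is a minimal sketch, not the library's actual implementation. It assumes the cache is keyed by the number of characters still available in the string. `VOCAB` and `allowed_tokens_for_remaining` are hypothetical names, and the handling of closing quotes, escape sequences and `min_length` is left out on purpose.

```python
from functools import lru_cache
from typing import FrozenSet, Tuple

# Hypothetical toy vocabulary of plain string-content tokens.
# A real tokenizer has tens of thousands of entries, which is why caching matters.
VOCAB: Tuple[str, ...] = ("a", "ab", "abc", "abcd")


@lru_cache(maxsize=None)
def allowed_tokens_for_remaining(remaining: int) -> FrozenSet[str]:
    """Return the tokens that still fit into `remaining` characters.

    The result depends only on `remaining`, so it is computed once per
    distinct budget and then served from the cache.
    """
    return frozenset(tok for tok in VOCAB if len(tok) <= remaining)


if __name__ == "__main__":
    max_length = 3
    parsed_string = "a"  # one character already generated
    budget = max_length - len(parsed_string)
    print(sorted(allowed_tokens_for_remaining(budget)))
```

The idea is that the expensive vocabulary scan runs once per distinct remaining budget instead of once per generated token.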
~Hmm, actually, this seems to work:~
# print("Filtering whitespace characters")
allowed_characters = "".join(c for c in allowed_characters if c not in WHITESPACE_CHARACTERS)
@@ -102,11 +102,12 @@ class JsonSchemaParser(CharacterLevelParser):
current_parser = self.object_stack[-1]
if isinstance(current_parser, StringParsingState):
if not current_parser.allowed_strings and current_parser.seen_opening_quote and not current_parser.seen_closing_quote \
- and current_parser.min_length is None and current_parser.max_length is None:
+ and current_parser.min_length is None:
# Performance optimization: When we are parsing a string that is not from a list of allowed strings, most tokens
# are legal. The exploration can be more costly than the LM itself for large tokenizers (because this is pure python),
# so we signal that we are in a "freetext" mode, and reuse the allowed token list throughout the run.
- return 'json_freetext'
+ if current_parser.max_length is None or len(current_parser.parsed_string) < current_parser.max_length:
+ return 'json_freetext'
return None
Answer, with plain str:
{
"preamble": "Knock knock!",
"question": "Who's there?",
"name": "Banana",
"who": "Banana who?",
"punchline": "Banana-na-na-na-na-na-na-na-na-na-na!"
}
Answer, with constr:
{
"preamble": "Knock knock!",
"question": "Who's there?",
"name": "Banana",
"who": "Banana who?",
"punchline": "Banana-na-na-na-" (<-- constr max 16 chars)
}
Plain: 2.50 s, Constr: 2.15 s
~Is there something that will break despite it looking ok in this test?~
EDIT: Yes, something breaks. With a constr of 8 chars, the output is cut short in the very first field:
Answer, with constr:
{
"preamble": "Knock knock
Apparently, in the first test with 16 chars, the tokens chosen via the freetext shortcut just happened to add up to exactly 16 characters.
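My guess at the failure mode, as a toy simulation rather than the library's real code: the patched check only looks at the length before a token is appended, so a multi-character token can overshoot `max_length`. The function names and the token splits below are made up for illustration; a real tokenizer may split the text differently.

```python
from typing import Iterable


def freetext_allowed(parsed_string: str, max_length: int) -> bool:
    """Mirror of the patched condition: stay in freetext mode while under the limit."""
    return len(parsed_string) < max_length


def simulate(tokens: Iterable[str], max_length: int) -> str:
    """Append tokens as long as the patched check passes.

    Nothing verifies len(parsed) + len(token) <= max_length, so the last
    accepted token may push the string past the limit.
    """
    parsed = ""
    for token in tokens:
        if not freetext_allowed(parsed, max_length):
            break
        parsed += token
    return parsed


if __name__ == "__main__":
    # Hypothetical token splits, chosen only to illustrate the two cases.
    short = simulate(["Kn", "ock", " knock", "!"], max_length=8)
    lucky = simulate(["Ban", "ana", "-na", "-na", "-na", "-", "na"], max_length=16)
    print(repr(short), len(short))
    print(repr(lucky), len(lucky))
```

In the 8-char case the string is already longer than the limit once the check finally fails, which would leave the parser with no legal continuation. In the 16-char case the tokens happen to land exactly on the limit, so nothing looks wrong.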
Here's a benchmark of the same text generation on Pydantic with `str` and `constr` fields:

Relevant code: