Open merrymercy opened 1 week ago
Reason for the additional space in "outlines w/ jumpforward": jump forward jumps over the whole key "ssid" (including the closing quote) in one step. However, the closing " should be tokenized together with the following : as a single token. This leads to the difference. The solution is to jump only "ssid (without the closing quote).
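To make the token boundary problem concrete, here is a minimal sketch. It assumes a toy vocabulary and a greedy longest-match tokenizer, both hypothetical; it does not use the real tokenizer or any sglang/outlines API. It only illustrates why jumping the full "ssid" yields a different token sequence than normal decoding, and why stopping before the closing quote avoids that.

```python
# Toy illustration only: VOCAB and tokenize() are hypothetical and are not
# the tokenizer or jump-forward code used by sglang or outlines.
VOCAB = ['"', '":', ':', 'ssid', ' ']


def tokenize(text):
    """Greedy longest-match tokenization over the toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        match = max((t for t in VOCAB if text.startswith(t, i)), key=len, default=None)
        if match is None:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
        tokens.append(match)
        i += len(match)
    return tokens


# Tokenization of the whole text: the closing quote merges with the colon.
reference = tokenize('"ssid":')

# Jumping the full key forces the closing quote to be its own token,
# so the colon has to come from a separate token afterwards.
full_jump = tokenize('"ssid"') + tokenize(':')

# Jumping only up to "ssid leaves the closing quote to normal decoding,
# so it can still be emitted together with the colon.
partial_jump = tokenize('"ssid') + tokenize('":')

print(reference)
print(full_jump)
print(partial_jump)
```

With this toy vocabulary, the partial jump reproduces the reference token sequence, while the full jump does not. How the model reacts to the unusual sequence (for example, by emitting a space before the colon) depends on the real model and tokenizer and is not modeled here.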
There are still some issues with jump forward decoding for both backends (outlines and xgrammar). The outputs w/ jump forward are different from the outputs w/o jump forward. I tested the first 10 examples in https://github.com/sgl-project/sglang/tree/main/benchmark/json_schema and found the following issues.
Issues with Outlines
There is an extra space " " before the colon ":" after each key in the JSON output. You can compare the outputs below.
The outputs of outlines w/ jumpforward
The outputs of outlines w/o jumpforward
Issues with xgrammar
Sometimes, the output includes an additional "/" at the start of a string value. For example, on the last line of the following outputs, the output w/ jumpforward for the key "inventorNames" is "/Dr. Alice Smith", but the correct output should be "Dr. Alice Smith". It also happens for the key "health_goals" in the following examples. This is a critical bug.
The outputs of xgrammar w/ jumpforward
The outputs of xgrammar w/o jumpforward
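For anyone who wants to check their own runs, here is a rough sketch of a comparison helper. It is not part of the benchmark; `diff_values` and `compare` are hypothetical names, and the sample strings are made up to mirror the two symptoms above (the real output shapes may differ). It separates value differences, such as the extra "/", from formatting-only differences, such as the extra space before the colon.

```python
import json


def diff_values(with_jf, without_jf, path=""):
    """Yield (path, value_with, value_without) for every leaf that differs.

    Note: a missing key and an explicit null are treated the same here.
    """
    if isinstance(with_jf, dict) and isinstance(without_jf, dict):
        for key in sorted(set(with_jf) | set(without_jf)):
            child = f"{path}.{key}" if path else key
            yield from diff_values(with_jf.get(key), without_jf.get(key), child)
    elif (isinstance(with_jf, list) and isinstance(without_jf, list)
          and len(with_jf) == len(without_jf)):
        for i, (a, b) in enumerate(zip(with_jf, without_jf)):
            yield from diff_values(a, b, f"{path}[{i}]")
    elif with_jf != without_jf:
        yield path, with_jf, without_jf


def compare(raw_with, raw_without):
    """Classify the difference between two raw JSON outputs."""
    diffs = list(diff_values(json.loads(raw_with), json.loads(raw_without)))
    if diffs:
        return "value", diffs       # parsed values differ (e.g. extra "/")
    if raw_with != raw_without:
        return "formatting", []     # same values, different text (e.g. extra space)
    return "identical", []


# Made-up samples mirroring the symptoms described in this issue.
print(compare('{"inventorNames": ["/Dr. Alice Smith"]}',
              '{"inventorNames": ["Dr. Alice Smith"]}'))
print(compare('{"ssid" : "x"}', '{"ssid": "x"}'))
```

The split matters because the two backends fail differently: the Outlines outputs still parse to the same values, while the xgrammar outputs contain wrong data.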
Generate outputs