Open NanoCode012 opened 1 month ago
Hi! Can you please share the model+schema+prompt that you are trying to use? If this reproduces on a 7B (or less) model it will be much easier to debug.
The formatter seems unable to proceed with generation whenever it emits a Roman numeral.
I am currently generating book names with the qwen1.5-110b-32k model, and every time a book name containing a Roman numeral is generated, the generation simply stops.
Here is an example:
{"实体1": "三体系列", "实体2": "三体Ⅱ
and the generation just stops, even though the schema hasn't been completed yet.
This happens every time, so I suspect it's related to the formatter, since it doesn't occur when the formatter isn't applied.
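For what it's worth, the truncated string itself is valid JSON; a minimal stdlib-only check (using the exact values from the example above) suggests the cutoff isn't a JSON-validity problem, but something in how the enforcer handles this character:

```python
import json

# "三体Ⅱ" ends with U+2161 (ROMAN NUMERAL TWO) -- a single non-ASCII
# codepoint that is perfectly legal inside a JSON string.
title = "三体Ⅱ"
assert ord(title[-1]) == 0x2161

# Round-tripping the exact object from the report works fine with the
# standard library, so the string content itself is not the problem.
payload = {"实体1": "三体系列", "实体2": title}
decoded = json.loads(json.dumps(payload, ensure_ascii=False))
assert decoded["实体2"] == "三体Ⅱ"
print("round-trip OK")
```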
I'm running into this exact problem. Any solution or workaround? I'm using Llama-3-8b-instruct with the HF transformers library for generation.
Hey! Thank you for the nice tool and integrations. I've been trying this out with English JSON parsing using vllm, and it works great!
However, when I tried a JP model (such as the recently released Aya from Cohere, and Llama 3 fine-tunes), I received cut-off outputs.
result = json.loads(result)
Do you happen to know why this occurs? My initial guess, after looking at the repo, is that the enforcer can't build a character tree covering these Unicode characters and therefore stops early.
I checked the other issues; those involve non-English keys, whereas in my case it's the content itself. I've tried the same models without lm-format-enforcer, and they output fine without cutting off early (though, as expected, they can't produce JSON consistently).
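If that guess is right, the mechanism might look like the toy simulation below. Everything here is hypothetical (made-up vocab, EOS marker, and filter, not lm-format-enforcer's actual code): once every token containing an out-of-tree codepoint is filtered from the allowed set, the model can be left with EOS as the best remaining option mid-string, so the output gets cut off:

```python
# Toy simulation of the suspected failure mode. The vocab, EOS marker,
# and filter below are hypothetical -- not lm-format-enforcer's real
# implementation.
EOS = "</s>"
VOCAB = ["三体", "Ⅱ", "系列", "II", '"', EOS]

def allowed_tokens(vocab, allowed_chars):
    """Keep only tokens made entirely of characters the parser knows."""
    keep = [EOS]  # the end-of-sequence token stays permitted
    for tok in vocab:
        if tok != EOS and all(ch in allowed_chars for ch in tok):
            keep.append(tok)
    return keep

# If the character tree never saw U+2161, "Ⅱ" is silently excluded,
# while the ASCII look-alike "II" survives -- matching the reports of
# generation stopping exactly at the Roman numeral.
allowed = allowed_tokens(VOCAB, set('"三体系列I'))
assert "Ⅱ" not in allowed
assert "II" in allowed
print(allowed)
```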
Env:
vllm==0.4.1 lm-format-enforcer==0.9.8