noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model
MIT License
1.42k stars, 65 forks

Training? #77

Open xdevfaheem opened 7 months ago

xdevfaheem commented 7 months ago

@noamgat, is it possible to train a model with constrained output tokens?

noamgat commented 7 months ago

While I have not done this personally, it should be possible. You could devise a "legal token discipline loss" in which you look at the output that the LLM generates, and penalize it according to the weights it assigns to the illegal tokens.

So you would do something like:

The -inf kind of scares me, so you might want to replace it with a large negative number (like -100, which is the most negative value the OpenAI API accepts for `logit_bias`). This is all theoretical, though; I haven't tried it in practice.
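
A minimal sketch of how such a penalty could be computed, written in plain Python so it runs without dependencies. The names `illegal_token_penalty` and `mask_illegal` are hypothetical and not part of LMFE. In an actual training loop the same arithmetic would presumably be done on framework tensors (for example PyTorch) so that gradients flow, and the allowed token ids would come from the format enforcer at each generation step. This is one possible reading of the idea, not a tested recipe.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def illegal_token_penalty(logits, allowed_token_ids):
    """Hypothetical 'legal token discipline' penalty for one decoding step.

    Returns -log(probability mass on allowed tokens): close to 0 when the
    model puts all of its mass on legal tokens, larger as more mass leaks
    onto illegal ones.
    """
    probs = softmax(logits)
    legal_mass = sum(probs[i] for i in set(allowed_token_ids))
    # Clamp to avoid log(0) if every legal token has underflowed to zero.
    return -math.log(max(legal_mass, 1e-12))


def mask_illegal(logits, allowed_token_ids, fill=-100.0):
    """Replace illegal logits with a finite negative value instead of -inf."""
    allowed = set(allowed_token_ids)
    return [x if i in allowed else fill for i, x in enumerate(logits)]


# Toy vocabulary of 4 tokens; assume only tokens 0 and 1 are legal here.
step_logits = [0.0, 0.0, 0.0, 0.0]
penalty = illegal_token_penalty(step_logits, [0, 1])
masked = mask_illegal([1.0, 2.0, 3.0], [0, 2])
```

With uniform logits and half of the vocabulary legal, the penalty is ln 2, since half of the probability mass sits on illegal tokens. Summing this term over the generated positions and adding it to the usual training loss is one way the penalty could be used.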

xdevfaheem commented 7 months ago

Cool, this makes sense. Are there any scripts available, to your knowledge, that would give a quick overview of the implementation?

xdevfaheem commented 7 months ago

I tried training the model on the preferred JSON output for each input, without any format enforcer, and it performs well. Even so, I want to use LMFE to increase its reliability. Am I wrong?

noamgat commented 7 months ago

There are no scripts that try to do this as far as I'm aware; this is just brainstorming. Indeed, normal SFT training can work as well, and then you don't need LMFE in the training loop.