noamgat / lm-format-enforcer

Enforce the output format (JSON Schema, Regex etc) of a language model
MIT License
1.42k stars 65 forks

Combine regex and json parser #103

Closed remichu-ai closed 4 months ago

remichu-ai commented 4 months ago

Let me start by saying this is awesome work, and I think this is the best format enforcer currently available.

I am currently using this library to improve function calling, and I am trying to recreate the OpenAI API behavior where, if the model decides that tool usage is not needed, it just returns a normal chat message.

I wonder if it is possible to do the following:

My plan for now is to do this with two prompts: the first determines whether tool usage is needed, and the second enforces the relevant format.

However, I wonder whether doing it in one prompt is achievable. Any pointers in the right direction would help.
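For reference, the two-prompt plan could be sketched roughly as below. This is only an illustration: `generate`, `ROUTER_SCHEMA`, and the prompt wording are hypothetical names that do not come from the library, and `generate` is assumed to constrain decoding to the given JSON schema when one is passed (for example through a format enforcer) and to return free text otherwise.

```python
import json
from typing import Callable, Optional

# Hypothetical signature: generate(prompt, schema) -> str.
# When schema is not None, the caller is assumed to constrain the output
# to that JSON schema; when it is None, the output is free text.
Generate = Callable[[str, Optional[dict]], str]

# Schema for the first prompt: a single yes/no routing decision.
ROUTER_SCHEMA = {
    "type": "object",
    "properties": {"use_tool": {"type": "boolean"}},
    "required": ["use_tool"],
}


def answer(user_message: str, tool_schema: dict, generate: Generate) -> dict:
    """Two-pass flow: decide whether a tool is needed, then produce the reply."""
    decision = json.loads(
        generate(
            f"Does answering this message require calling a tool?\n{user_message}",
            ROUTER_SCHEMA,
        )
    )
    if decision["use_tool"]:
        # Second pass, constrained to the tool's JSON schema.
        call = json.loads(
            generate(f"Produce the tool call for:\n{user_message}", tool_schema)
        )
        return {"type": "tool_call", "content": call}
    # Second pass, unconstrained: a normal chat message.
    return {"type": "message", "content": generate(user_message, None)}
```

The cost of this design is one extra generation per request; the benefit is that the model never has to produce a chat message inside a JSON wrapper.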

noamgat commented 4 months ago

Interesting idea. If the idea is to create a response that is either a normal response (any string) or a tool response (json schema), you could perhaps do a UnionParser between "anything" (that could be a .* RegexParser) and the json schema of the tools that you want.

I'm not sure that in the current state of the library it's possible with one prompt, because if you use the JSON schema parser, even the "normal chat message" would have to be a JSON string (so it would have the opening and closing quotation marks).
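A minimal sketch of the union idea, assuming the `UnionParser`, `RegexParser`, and `JsonSchemaParser` classes exported by `lmformatenforcer`. Whether this behaves well with a single prompt is not confirmed in this thread. The tool schema and the helper names `build_chat_or_tool_parser` and `classify_output` are hypothetical, and the `.*` pattern may not match newlines. Because `.*` also accepts text that happens to look like JSON, the caller still has to inspect the output to tell the two cases apart.

```python
import json

# Hypothetical tool schema, for illustration only.
WEATHER_TOOL_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}


def build_chat_or_tool_parser(tool_schema: dict):
    """Return a parser that accepts either free text or JSON matching tool_schema."""
    # Imported lazily so the rest of this sketch works without the package installed.
    from lmformatenforcer import JsonSchemaParser, RegexParser, UnionParser

    free_text = RegexParser(r".*")             # the "anything" branch
    tool_call = JsonSchemaParser(tool_schema)  # the tool-response branch
    return UnionParser([free_text, tool_call])


def classify_output(text: str) -> dict:
    """Treat output that parses as a JSON object as a tool call, anything else as chat."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return {"type": "message", "content": text}
    if isinstance(parsed, dict):
        return {"type": "tool_call", "content": parsed}
    return {"type": "message", "content": text}
```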

remichu-ai commented 4 months ago

Thank you for the advice, I will try it out.

Currently, I have nested a normal answer and a tool response using Union inside a regex (basically, I wrap things in another JSON layer so that I can use Union in the JSON enforcer). In this JSON I also included a field, internal_thought, to simulate chain-of-thought prompting.

It is working, but the extra JSON layer increases complexity for the model, and I felt the responses were not as good. I will also test the two-prompt approach and compare the results and speed.
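The wrapper layer described above might look something like the following. This is a guess at the shape, not the actual schema used: only the `internal_thought` field name comes from the comment, while `response`, `build_wrapper_schema`, and `unwrap` are hypothetical. It also assumes the enforcer in use handles `anyOf` in JSON schemas.

```python
import json


def build_wrapper_schema(tool_schema: dict) -> dict:
    """Wrap a plain-string answer and a tool call in one outer JSON object."""
    return {
        "type": "object",
        "properties": {
            # Free-form reasoning emitted before the answer (chain-of-thought style).
            "internal_thought": {"type": "string"},
            # Either a normal chat message or an object matching the tool schema.
            "response": {"anyOf": [{"type": "string"}, tool_schema]},
        },
        "required": ["internal_thought", "response"],
    }


def unwrap(raw: str) -> dict:
    """Strip the outer layer and report which branch the model chose."""
    response = json.loads(raw)["response"]
    kind = "message" if isinstance(response, str) else "tool_call"
    return {"type": kind, "content": response}
```

Placing `internal_thought` first means the model writes its reasoning before committing to either branch, at the price of the extra nesting mentioned above.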