eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0
3.64k stars 197 forks

Use external files and avoid quoting? #341

Open benwhalley opened 6 months ago

benwhalley commented 6 months ago

I wonder if the docs could do with a concise statement of all the quoting rules for run, F, and query? I'm still a bit confused after playing with it for a while, and I think there might be a case for simplifying for some uses.

As an example of my use case, I have domain experts writing quite long prompts to interpret and summarise various data sources. I want to encourage collaboration with them in writing the prompts, but this means exposing at least a subset of lmql to them and hoping they don't break it. This isn't untrusted input (it will be checked), but it's still a pain to ask them to quote every individual line of a programme.

What I'd like to do is have myprompt.lmql as a separate file, containing instructions and placeholders for variable substitution and outputs. Then, the application code would load the template and use it to query at runtime.

So myprompt.lmql might be

This is some data:

{data}

{ugly_statistical_outputs}

These are specific instructions for interpreting it:
- do x
- don't do y
- always remember z
- use technical language if you like

[INTERPRETATION]

Now we need to summarise this for lay people.
Describe the main findings:

[LAY_SUMMARY]

At present there doesn't seem to be an obvious way to read in a file like this and use it with lmql.query. This works, but seems ugly:

with open(template) as fh:
    q = "\n".join(line.rstrip() for line in fh)
f = lmql.F(q, is_async=False, model=azm, decoder='sample')

It also doesn't allow me to introspect the prompts at each step as query would.
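To make the idea concrete, here is a stdlib-only sketch of roughly what I have in mind. `load_prompt` is a hypothetical helper (not part of lmql): it fills the `{name}` slots with `str.format` while leaving the square-bracket `[HOLES]` untouched for lmql to fill at generation time.

```python
from pathlib import Path

def load_prompt(path: str, **subs: str) -> str:
    """Read a prompt template file and fill its {name} slots.

    Square-bracket [HOLES] are left as-is for lmql to fill at
    generation time. One limitation of str.format: any literal
    { or } in the template would need doubling ({{ }}).
    """
    return Path(path).read_text().format(**subs)

# Usage sketch (file name and variables are illustrative):
# q = load_prompt("myprompt.lmql",
#                 data=data,
#                 ugly_statistical_outputs=stats)
# f = lmql.F(q, is_async=False, model=azm, decoder='sample')
```

Something like this would at least centralise the loading, but it still goes through `lmql.F`, so the introspection problem remains.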

Perhaps this is orthogonal to the goals of your project, but I came to LMQL because I had an itch to write a very simple markdown-based DSL that end users could use without much training and figured someone else might have done so already.

I'm also interested in writing LLM prompts that generate LLM prompts incorporating best practices (e.g. chain-of-thought, personas) which domain experts might not apply on their own. Having a nice templating system would make this easier.

Anyway - thanks for all the effort on lmql — it seems like a really neat project.