eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

CPU and memory leak #256

Open WhoIsDarth opened 11 months ago

WhoIsDarth commented 11 months ago

After running the service for a long time, I get a large number of running processes.

import lmql

# prompt, model, output_writer and the context dicts are built per request;
# a freshly generated query string is compiled and executed on every call.
result: list[lmql.LMQLResult | str] = await lmql.run(
    prompt,
    model=model,
    output_writer=output_writer,
    **additional_context,
    **script_context,
)
raw_response = result[0]
raw_response = result[0]
(Screenshot, 2023-10-24 13:38:47: process list showing the accumulated running processes)

I dynamically generate prompt scripts and pass them into lmql.run.

lbeurerkellner commented 11 months ago

Thanks for reporting this. I will have to investigate a bit to reproduce it reliably on my end.

As a workaround, depending on your use case, you could pass the prompt value as a parameter into a single query that does not change over time (assuming you have no dynamic constraints), so the query itself is not recompiled per request.
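
A minimal sketch of that workaround, assuming a Python service around LMQL; the query body, function name and variable names here are illustrative, not taken from this thread:

import lmql

# One fixed query, defined (and compiled) once at module load time.
# Only the prompt value varies per request, so no new query string
# is generated and compiled dynamically.
@lmql.query
def run_user_prompt(prompt):
    '''lmql
    "{prompt}[RESPONSE]"
    '''

# Hypothetical per-request usage (can also be awaited from async code):
# result = run_user_prompt(user_prompt)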

WhoIsDarth commented 11 months ago

I do not control the format of the prompt; users send their prompts and I execute them.

iEgit commented 9 months ago

This seems to reproduce when a query has long output (similar to the problem in https://github.com/eth-sri/lmql/issues/273), leading to errors and multiple OpenAI retries due to the extremely long context.
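
One way to rule out that trigger is to bound the output length with an LMQL constraint. A hedged sketch, not from this thread; the variable name RESPONSE and the 256-token limit are illustrative assumptions:

import lmql

# Capping the decoded output with a token-length constraint so a
# runaway generation cannot keep growing the context and triggering
# retries; 256 is an arbitrary limit to adjust per use case.
@lmql.query
def bounded_prompt(prompt):
    '''lmql
    "{prompt}[RESPONSE]" where len(TOKENS(RESPONSE)) < 256
    '''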