Introduce lf.query_prompt and lf.query_output.

These two APIs allow users to decompose lf.query into two stages:
1) lf.query_prompt: get the final prompt that will be sent to the LLM.
2) lf.query_output: get the structured output from the LLM response.

With these two APIs, users can easily implement lf.query with batch LLM inference (e.g. jax-on-beam).
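The batch-inference pattern this decomposition enables can be sketched as follows. This is a toy stand-in, not langfun itself: query_prompt, query_output, and fake_batch_llm below are hypothetical simplifications that only illustrate the data flow (render all prompts, run one batched LLM call, then parse all responses), which is what the real lf.query_prompt / lf.query_output pair makes possible.

```python
# Toy sketch of the two-stage decomposition. The real langfun APIs render
# templated prompts and parse responses against a schema; these stand-ins
# only show how the split supports batch inference.

def query_prompt(question: str) -> str:
    # Stage 1: build the final prompt that would be sent to the LLM.
    return f"Answer with a single integer.\nQuestion: {question}"

def query_output(response: str) -> int:
    # Stage 2: parse the raw LLM response into a structured value.
    return int(response.strip())

def fake_batch_llm(prompts: list[str]) -> list[str]:
    # Hypothetical stand-in for a batched inference backend
    # (e.g. jax-on-beam), which would answer all prompts at once.
    return [" 5 ", "40"]

questions = ["What is 2 + 3?", "What is 10 * 4?"]
prompts = [query_prompt(q) for q in questions]       # stage 1, all examples
responses = fake_batch_llm(prompts)                  # one batched LLM call
outputs = [query_output(r) for r in responses]       # stage 2, all responses
print(outputs)  # [5, 40]
```

With lf.query, the render/call/parse steps happen inside a single function call per example; splitting them out lets the LLM call in the middle be replaced by any batched backend.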