shreyashankar closed this 8 months ago
We can have 4 functions:
Todo:
gen and agen
gen and agen
We may also need to handle the case where the generated result is already cached: if we read from the cache, we'll have to return a generator over the cached results.
As title -- so we can stream LLM responses.
We'll need to check in the executor whether the return value is a generator. If so, we should yield the results as they arrive and then cache a list of all the results.