Closed YRG999 closed 5 months ago
A generate call returns a response that includes a context object. To continue the conversation, pass it directly into the next call:
import ollama

r1 = ollama.generate(model='llama2', prompt='hi!')
r2 = ollama.generate(model='llama2', prompt='hi again!', context=r1['context'])
The context is the token representation of all inputs and outputs so far, up to and including the latest response.
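The pattern above generalizes to any number of turns: keep the context from each response and feed it into the next call. Here is a minimal self-contained sketch of that loop; `mock_generate` is a stand-in for `ollama.generate` (so the example runs without a server), but it returns a dict with the same `response`/`context` shape.

```python
# Sketch of threading context across turns, assuming the
# response dict has 'response' and 'context' keys as above.
# mock_generate is a hypothetical stand-in for ollama.generate.

def mock_generate(model, prompt, context=None):
    # The real API encodes all tokens seen so far in 'context';
    # here we fake that by accumulating the prompts.
    history = list(context or [])
    history.append(prompt)
    return {'response': f'echo: {prompt}', 'context': history}

def chat(model, prompts, generate=mock_generate):
    """Run a multi-turn conversation, carrying context between calls."""
    context = None
    replies = []
    for prompt in prompts:
        r = generate(model, prompt, context=context)
        context = r['context']  # carry the conversational memory forward
        replies.append(r['response'])
    return replies, context

replies, context = chat('llama2', ['hi!', 'hi again!'])
```

With the real library, swap `mock_generate` for `ollama.generate` (passing keyword arguments as in the snippet above) and the loop is unchanged.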
How can I use the context parameter from the API to keep a conversational memory?