Open · jmont-dev opened this issue 3 months ago
I figured out how to do this in a manual way. First, capture the context from the final response:

```cpp
// Holds the context tokens returned by the last completed generation.
inline nlohmann::json Keepcontext;

if (response.as_json()["done"] == true) {
    prompt_AI.busy = false;
    // The context array is only present on the final (done == true) response.
    if (response.as_json().contains("context")) {
        Keepcontext = response.as_json()["context"];
    }
}
```

Then put it into the request the next time:

```cpp
ollama::request request(ollama::message_type::generation);
if (!Keepcontext.empty()) {
    request["context"] = Keepcontext;
}
```
This was addressed in #23. You can now include previous responses to provide context when using the `generate` endpoint. See the sections added to the README on handling context and context length: https://github.com/jmont-dev/ollama-hpp?tab=readme-ov-file#handling-context.
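For reference, usage roughly follows the pattern from the README's handling-context section. This is a minimal sketch, not a verbatim copy; it assumes `ollama::generate` accepts a previous `ollama::response` as a context argument and that `ollama::response` is streamable, so check the current header for exact signatures. The model name `llama3` is just an example.

```cpp
#include "ollama.hpp"
#include <iostream>

int main() {
    // First generation establishes the context.
    ollama::response context = ollama::generate("llama3", "Why is the sky blue?");

    // Pass the previous response so the follow-up generation builds on it.
    // (Overload per the README's handling-context section; verify against
    // the version of ollama.hpp you are using.)
    ollama::response response =
        ollama::generate("llama3", "Tell me more about this.", context);

    std::cout << response << std::endl;
}
```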
Context tokens are supplied by the `generate` endpoint. Allow supplying these as an input to additional generations so that responses can span multiple generations.
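For anyone working at the raw request level, the Ollama REST API already supports this round trip: the final `/api/generate` response includes a `context` token array, which can be echoed back in the next request. Below is an untested sketch using cpp-httplib and nlohmann::json (the single-header libraries this project builds on); the `generate` helper and the model name `llama3` are my own illustrative choices, not part of the library.

```cpp
#include <httplib.h>            // cpp-httplib, single header
#include <nlohmann/json.hpp>
#include <iostream>

using json = nlohmann::json;

// Helper (hypothetical): POST a non-streaming generate request to the
// Ollama server and return the parsed JSON response.
json generate(httplib::Client& cli, json body) {
    body["stream"] = false;     // one complete JSON object instead of chunks
    auto res = cli.Post("/api/generate", body.dump(), "application/json");
    if (!res) return json::object();
    return json::parse(res->body);
}

int main() {
    httplib::Client cli("http://localhost:11434");

    // First generation: no context yet.
    json first = generate(cli, {{"model", "llama3"},
                                {"prompt", "Why is the sky blue?"}});

    // The final (done == true) response carries the accumulated context tokens.
    json context = first.value("context", json::array());

    // Follow-up generation: echo the context back so the model retains
    // the earlier exchange.
    json second = generate(cli, {{"model", "llama3"},
                                 {"prompt", "Tell me more about this."},
                                 {"context", context}});

    std::cout << second.value("response", "") << std::endl;
}
```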