`fast_reply` hook

pieroit commented 1 month ago

The `agent_fast_reply` hook is useful after recall, to plug in a custom agent.

I propose a `fast_reply` hook which short-circuits both recall and the agent. This is useful for:

- canned replies
- sending stuff to the Cat via websocket when a reply is not really needed
- launching custom LLM chains / agents / pipelines (e.g. calling the LLM directly with a simple prompt, which is very much requested)
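To make the proposal concrete, here is a minimal, self-contained sketch of the short-circuit. It does not use the real Cat API: `recall`, `agent`, `handle_message`, `canned_reply_hook` and the `"output"` key are all hypothetical stand-ins, assumed only for illustration.

```python
from typing import Callable, Optional

# Hypothetical stand-ins for the real pipeline steps (NOT the Cat's API).
# `calls` records which steps actually ran, so the short-circuit is visible.
calls: list[str] = []


def recall(text: str) -> list[str]:
    calls.append("recall")
    return [f"memory about {text}"]


def agent(text: str, memories: list[str]) -> str:
    calls.append("agent")
    return f"agent answer to {text!r} using {len(memories)} memories"


# Assumed hook shape: receives the reply dict so far and the user message.
FastReplyHook = Callable[[dict, dict], Optional[dict]]


def canned_reply_hook(fast_reply: dict, user_message: dict) -> dict:
    # Canned reply: answer "ping" without touching memory or the agent.
    if user_message["text"].strip().lower() == "ping":
        fast_reply["output"] = "pong"
    return fast_reply


def handle_message(user_message: dict, hooks: list[FastReplyHook]) -> dict:
    fast_reply: dict = {}
    for hook in hooks:
        fast_reply = hook(fast_reply, user_message) or fast_reply

    # Assumption: a reply containing "output" short-circuits everything else.
    if "output" in fast_reply:
        return fast_reply

    # Normal path: recall first, then the agent.
    memories = recall(user_message["text"])
    return {"output": agent(user_message["text"], memories)}
```

With this shape, `handle_message({"text": "ping"}, [canned_reply_hook])` returns the canned reply and neither `recall` nor `agent` runs; any other message goes through the normal path.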

lucagobbi commented 1 month ago

That's cool! I'm running some experiments with it, and two questions arose:

1. What about conversation history, both in working and in episodic memory? Should we keep it? I think we should, even though someone might want to keep certain messages from passing through, and hence from being stored in memory as well... Should that be hookable too?
2. What about the `why` in the response? Knowing that the `fast_reply` hook was triggered and that it did indeed short-circuit the response is important in this case. That's what we were discussing in #864.

pieroit commented 1 month ago

> That's cool! I'm running some experiments with it, and two questions arose:
>
> What about conversation history, both in working and in episodic memory? Should we keep it? I think we should, even though someone might want to keep certain messages from passing through, and hence from being stored in memory as well... Should that be hookable too?

Agreed, let's keep it. Later on, if it is requested, we may add a boolean to the websocket message to let the client decide what gets stored and what does not. @zAlweNy26 proposed something similar a few months ago.

> What about the `why` in the response? Knowing that the `fast_reply` hook was triggered and that it did indeed short-circuit the response is important in this case. That's what we were discussing in "Store intermediate prompts, replies, and token count for each message - and send back to the client" #864

I propose to leave the `why` as is: if, inside `fast_reply`, the dev calls recall or stores stuff in the `working_memory` collections, the `why` will be populated. Regarding #864, the report should also be populated automatically when using `cat.llm` or the embedder. Am I missing something?
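A rough sketch of what "leave the `why` as is" could mean in practice, assuming the `why` is simply built from whatever is in working memory when the reply goes out. `WorkingMemory`, its two fields, `build_why` and both example hooks are hypothetical names for illustration, not the Cat's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    # Hypothetical collections; the real working memory may differ.
    episodic_memories: list = field(default_factory=list)
    declarative_memories: list = field(default_factory=list)


def build_why(user_text: str, wm: WorkingMemory) -> dict:
    # The `why` only reflects what is in working memory at reply time.
    return {
        "input": user_text,
        "memory": {
            "episodic": list(wm.episodic_memories),
            "declarative": list(wm.declarative_memories),
        },
    }


def fast_reply_without_recall(wm: WorkingMemory) -> str:
    # Canned reply: nothing is recalled, so the `why` stays empty.
    return "canned"


def fast_reply_with_recall(wm: WorkingMemory) -> str:
    # The dev chooses to store something (stand-in for calling recall).
    wm.declarative_memories.append("doc chunk")
    return "answer"
```

Under these assumptions no special casing is needed: a hook that skips recall yields a `why` with empty memory lists, and a hook that fills working memory gets a populated `why` for free.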

Thanks!!!

cheshire-cat-ai / core

`fast_reply` hook #868