We currently pass mostly raw LLM outputs around, and would benefit from adding more structure to them.
[ ] Initial research and iteration (@alexmoore-iai once you've done a bit of thinking, can you write up a quick summary in a ticket and stick a link to it in here)
Goal: Formalise the output structures of the different Caddy LLM calls so that individual pieces of information can be handled programmatically, allowing for more specific prompt curation.
First step would be to update Rewording of advisor message (see link) so that it returns a JSON output with the following keys (a rough sketch of the parsed structure follows the list):
Legal Topic
Important keywords
Specific Questions asked in the query
Previous attempts to answer question (useful for negative reasoning)
Personal information (e.g. age, nationality)
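As a rough illustration only (the StructuredQuery model, field names and parse_reworded_message helper are assumptions, not existing Caddy code), the reworded advisor message could be parsed into something like:

```python
# Sketch only: field names mirror the keys above; StructuredQuery and
# parse_reworded_message are illustrative, not existing Caddy code.
import json
from typing import Dict, List

from pydantic import BaseModel


class StructuredQuery(BaseModel):
    legal_topic: str
    important_keywords: List[str]
    specific_questions: List[str]
    previous_attempts: List[str]          # useful for negative reasoning
    personal_information: Dict[str, str]  # e.g. {"age": "34", "nationality": "UK"}


def parse_reworded_message(llm_json: str) -> StructuredQuery:
    """Validate the raw JSON string returned by the rewording LLM call."""
    return StructuredQuery(**json.loads(llm_json))
```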
RAG retrieval should then be updated to use only the relevant keys in this JSON object (see the sketch after these items):
Modify the RAG retrieval algorithm to prioritise the 'Legal Topic' and 'Specific Questions' fields
Create a feedback loop to refine the retrieval process based on the relevance of returned information
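A minimal sketch of retrieval driven by the prioritised fields, reusing the hypothetical StructuredQuery above (build_retrieval_query, retrieve and the vector_store.similarity_search call are placeholders for whatever retriever Caddy actually uses; the feedback loop is not shown):

```python
# Sketch: build the retrieval query from the prioritised fields only.
# vector_store.similarity_search stands in for whatever retriever Caddy uses.
def build_retrieval_query(query: StructuredQuery) -> str:
    # 'Legal Topic' and 'Specific Questions' take priority; fall back to keywords.
    parts = [query.legal_topic, *query.specific_questions]
    if not query.specific_questions:
        parts.extend(query.important_keywords)
    return " ".join(parts)


def retrieve(query: StructuredQuery, vector_store, k: int = 5):
    return vector_store.similarity_search(build_retrieval_query(query), k=k)
```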
It could be possible to replace semantic routing with this step, as the legal topic could automatically be used to switch between different prompt templates (see the sketch after the implementation details below).
Implementation details:
Create a mapping between 'Legal Topic' categories and corresponding prompt templates
Could include examples in the prompt to mimic current semantic routing logic.
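A possible shape for the mapping (the topic names, template paths and select_prompt_template helper are placeholders; the real topic list would come from the existing semantic routes):

```python
# Sketch: topic-to-template routing that could replace semantic routing.
# Topic names and template paths are placeholders, not the current Caddy set.
PROMPT_TEMPLATES = {
    "housing": "prompts/housing.txt",
    "benefits": "prompts/benefits.txt",
    "immigration": "prompts/immigration.txt",
}
DEFAULT_TEMPLATE = "prompts/general.txt"


def select_prompt_template(legal_topic: str) -> str:
    """Pick a prompt template from the extracted 'Legal Topic', with a fallback."""
    return PROMPT_TEMPLATES.get(legal_topic.strip().lower(), DEFAULT_TEMPLATE)
```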
Other important fields could be fed into the final Caddy prompt to improve the output. For example, “Previous attempts to answer question” could be explicitly ignored, preventing negative reasoning (see the sketch after the implementation details below).
Implementation details:
Integrate 'Previous attempts to answer question' into the prompt construction process, using it to guide the LLM away from repeated or ineffective approaches
Incorporate 'Personal information' to tailor responses to the user's specific context
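Sketch of how those fields could be folded into prompt construction, again reusing the hypothetical StructuredQuery (build_final_prompt and the instruction wording are illustrative, not the current Caddy prompt):

```python
# Sketch: fold the remaining fields into the final Caddy prompt.
# build_final_prompt and the instruction wording are illustrative only.
def build_final_prompt(query: StructuredQuery, retrieved_context: str) -> str:
    previous = "\n".join(f"- {a}" for a in query.previous_attempts) or "None"
    personal = ", ".join(f"{k}: {v}" for k, v in query.personal_information.items()) or "None"
    questions = "\n".join(query.specific_questions)
    return (
        f"Context:\n{retrieved_context}\n\n"
        f"Questions:\n{questions}\n\n"
        f"Client details (tailor the answer to these): {personal}\n"
        f"Approaches already tried (do not repeat these): {previous}\n"
    )
```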
Benefits:
Storing a more structured version of queries would make downstream analysis easier; for example, it would be easy to filter on “Legal Topic” and check approval rates for different areas (see the sketch after this list).
Ability to improve retrieval by ignoring unnecessary information.
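For instance, once each stored query carries a legal topic, per-topic approval rates reduce to a simple group-by (the column names below are assumptions about how we would store this, not an existing schema):

```python
# Sketch: with a 'legal_topic' column stored per query, approval rates per
# area become a one-line group-by (column names are assumptions).
import pandas as pd

queries = pd.DataFrame(
    [
        {"legal_topic": "housing", "approved": True},
        {"legal_topic": "housing", "approved": False},
        {"legal_topic": "benefits", "approved": True},
    ]
)
print(queries.groupby("legal_topic")["approved"].mean())
```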