ministryofjustice / analytical-platform

Analytical Platform • This repository is defined and managed in Terraform
https://docs.analytical-platform.service.justice.gov.uk
MIT License

Provide storage mechanism for applications #5097

Open townxelliot opened 3 weeks ago

townxelliot commented 3 weeks ago

Describe the feature request.

Applications in the Cloud Platform could potentially use resources in the Analytical Platform, for example S3 buckets and Bedrock models. However, one common need for such applications is long-term storage: an application may need somewhere to store user preferences, other data supplied by users, application configuration and so on.

The data storage required may be relational (PostgreSQL) or NoSQL (DynamoDB), depending on the application.

At present, the Analytical Platform does not provide a robust solution for this.

Describe the context.

An application under development in MoJ R&D stores user interactions with the system so that they can be triaged by back office staff. Specifically, the user follows a workflow like:

  1. User asks a question in natural language about HR policies.
  2. The API uses a search engine over HR documents, combined with an LLM, to supply an answer.
  3. The user may reject or accept the answer.
  4. If the answer is rejected, the user may rephrase the question and submit it to the back office team, along with an email address the answer can be sent to.
  5. The back office HR team uses an admin interface to supply answers manually where necessary, sending them to the email addresses provided.

It is useful to store each interaction (initial question, generated answer, whether the user accepted the answer, rephrased question sent to the back office team etc.) so that we can analyse the kinds of questions users ask (to tune the LLM), see how often generated answers are accepted (to measure the effectiveness of the solution), and enable back office staff to respond manually to questions whose generated answers are unacceptable.
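To make the storage need concrete, here is a rough sketch of the kind of record we would want to persist and the two queries we would run over it. It uses Python's built-in `sqlite3` purely as a stand-in for whatever relational store the platform might offer; the table name, column names and function names are all hypothetical and not taken from our actual application.

```python
import sqlite3
from typing import Optional

# Hypothetical schema: one row per user interaction with the Q&A tool.
SCHEMA = """
CREATE TABLE IF NOT EXISTS interactions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    question TEXT NOT NULL,
    generated_answer TEXT NOT NULL,
    accepted INTEGER,            -- NULL until the user responds, then 0 or 1
    rephrased_question TEXT,     -- only set when the answer was rejected
    contact_email TEXT,          -- where a manual answer should be sent
    manual_answer TEXT           -- filled in later by back office staff
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; ':memory:' is for illustration only."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_interaction(conn: sqlite3.Connection, question: str,
                       generated_answer: str) -> int:
    """Store the initial question and the generated answer."""
    cur = conn.execute(
        "INSERT INTO interactions (question, generated_answer) VALUES (?, ?)",
        (question, generated_answer),
    )
    conn.commit()
    return cur.lastrowid


def record_feedback(conn: sqlite3.Connection, interaction_id: int,
                    accepted: bool,
                    rephrased_question: Optional[str] = None,
                    contact_email: Optional[str] = None) -> None:
    """Store whether the user accepted the answer, plus any follow-up."""
    conn.execute(
        "UPDATE interactions SET accepted = ?, rephrased_question = ?, "
        "contact_email = ? WHERE id = ?",
        (int(accepted), rephrased_question, contact_email, interaction_id),
    )
    conn.commit()


def acceptance_rate(conn: sqlite3.Connection) -> Optional[float]:
    """Fraction of answered interactions that were accepted (None if none)."""
    row = conn.execute(
        "SELECT AVG(accepted) FROM interactions WHERE accepted IS NOT NULL"
    ).fetchone()
    return row[0]


def pending_for_back_office(conn: sqlite3.Connection) -> list:
    """Rejected answers that still need a manual reply."""
    return conn.execute(
        "SELECT id, rephrased_question, contact_email FROM interactions "
        "WHERE accepted = 0 AND manual_answer IS NULL "
        "AND contact_email IS NOT NULL ORDER BY id"
    ).fetchall()
```

The point of the sketch is that both queries only make sense if the rows survive a redeployment, which is what we currently cannot guarantee.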

Value / Purpose

At present, we are not able to access resources from the Cloud Platform and resources from the Analytical Platform using the same service account. The most important part of our current application is Bedrock (for the LLM), so we really need to use an Analytical Platform service account. But by choosing this platform, we lose the ability to store data across redeployments. This limits our ability to run a series of demos and use the data from each to inform future iterations.

User Types

Stakeholders in R&D experiments (R&D team, end users, back office staff)

sheilz81 commented 3 weeks ago

Hiya, my team would also benefit from this, as we are also working on a tool where the user needs to check data and feed back whether or not they need to investigate further. We are currently using Iceberg but finding it very slow.