amosproj / amos2024ss06-health-ai-framework

Ailixir is an application that utilises LLMs and custom user input to generate AI agent prototypes specialised in fields such as health, economics, and physics. The prototypes enable the user, who is an entrepreneur-developer, to compare the results produced by different LLMs.
MIT License

Configure data storage setup for acquired data #224

Closed tubamos closed 4 days ago

tubamos commented 1 week ago

Notes

This task is closely related to:

Domain

app backend

Description

This task starts with an evaluation of what is technically possible and then implements the best solution for our use case.

Research and find a way to store the acquired data that allows: a) embeddings generation and b) the provision of the specific sources that the agent used to produce its answers to the user.

There might be no need to store the raw data in a DB; keeping plain JSON files in a directory on HPC or GCP may be enough to fulfill these two requirements.

If this is not enough, a DB solution should be configured (a MongoDB instance on GCP or Google Firestore).
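
As a rough illustration of the plain-JSON option, the sketch below writes each acquired document to its own JSON file and later reads the files back as text chunks that keep their source metadata, which is what both requirements (embeddings generation and source provision) depend on. The function names, the record fields (`id`, `source_url`, `title`, `text`), and the character-based chunking are assumptions made for this example; they are not taken from the project.

```python
import hashlib
import json
from pathlib import Path


def save_record(data_dir: Path, source_url: str, title: str, text: str) -> Path:
    """Write one acquired document as a JSON file named after a hash of its URL."""
    # Hypothetical ID scheme: a short, stable hash of the source URL.
    doc_id = hashlib.sha256(source_url.encode("utf-8")).hexdigest()[:16]
    record = {"id": doc_id, "source_url": source_url, "title": title, "text": text}
    data_dir.mkdir(parents=True, exist_ok=True)
    path = data_dir / f"{doc_id}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def load_chunks(data_dir: Path, chunk_size: int = 500):
    """Yield fixed-size text chunks, each carrying the metadata needed to cite its source."""
    for path in sorted(data_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        text = record["text"]
        for start in range(0, len(text), chunk_size):
            yield {
                "doc_id": record["id"],
                "source_url": record["source_url"],
                "title": record["title"],
                "chunk_index": start // chunk_size,
                "text": text[start:start + chunk_size],
            }
```

Each chunk produced by `load_chunks` could then be passed to an embedding model, with `source_url` and `chunk_index` stored next to the resulting vector so that an answer can be traced back to the documents it came from.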

User Story

As a developer in this team, I need the large volume of acquired data saved at a centralised location so that I can use it to generate embeddings and to accurately track the sources of the results the agent provides to the end user.

Acceptance Criteria

Definition of Done

manikg08 commented 1 week ago

A MongoDB instance has been created on GCP, and some of the JSON files have been saved there. In addition, we found a new approach that makes the MongoDB instance unnecessary: the embeddings of the scraped data are stored directly in Astra DB.
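
To illustrate the idea of keeping embeddings together with their source metadata, here is a minimal in-memory sketch. It is only a stand-in for a vector database: the class `InMemoryVectorStore` and its methods are hypothetical and do not reflect the Astra DB API or the project's actual code. The embeddings in the usage example are toy two-dimensional vectors.

```python
import math


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database; NOT the Astra DB API."""

    def __init__(self):
        self._rows = []  # each row is (embedding, metadata dict)

    def add(self, embedding, metadata):
        """Store an embedding next to the metadata that identifies its source."""
        self._rows.append((list(embedding), dict(metadata)))

    def search(self, query, k=3):
        """Return the metadata of the k rows most similar to `query` (cosine similarity)."""

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self._rows, key=lambda row: cosine(query, row[0]), reverse=True)
        return [meta for _, meta in ranked[:k]]


store = InMemoryVectorStore()
store.add([1.0, 0.0], {"source_url": "https://example.org/a", "chunk_index": 0})
store.add([0.0, 1.0], {"source_url": "https://example.org/b", "chunk_index": 0})
top_sources = store.search([0.9, 0.1], k=1)
```

Because the metadata travels with each vector, a similarity search returns the sources directly, so the raw JSON does not have to be looked up in a separate database to cite them.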