Closed by tubamos 4 days ago
A MongoDB instance was created on GCP, and some of the JSON files were saved there. Since then, a new approach has been found that makes the MongoDB instance unnecessary: the embeddings of the scraped data are stored directly in Astra DB.
Notes
This task is closely related to:
226
Domain
app backend
Description
This task starts with an evaluation of what is technically possible and then implements the best solution for our use case.
Research and find a way to store the acquired data that allows: a) embeddings generation and b) provision of the specific sources that the agent used to produce its answers to the user.
It may not be necessary to store the raw data in a DB: plain JSON files in a directory on HPC or GCP might be enough to fulfill these two requirements.
If that is not enough, a DB solution should be configured (a MongoDB instance on GCP or Google Firestore).
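To make the plain-JSON option concrete, here is a minimal sketch in Python using only the standard library. It assumes one JSON file per scraped document, and the field names (`id`, `source_url`, `text`), the function names, and the fixed-size character chunking are all hypothetical choices for illustration, not the project's actual schema or pipeline. The point it shows is that each chunk handed to an embedding model can carry its source along, which would cover both requirements a) and b) without a database.

```python
import json
import tempfile
from pathlib import Path


def save_document(data_dir: Path, doc_id: str, url: str, text: str) -> Path:
    """Write one scraped document as a plain JSON file, keeping its source URL."""
    data_dir.mkdir(parents=True, exist_ok=True)
    path = data_dir / f"{doc_id}.json"
    # Hypothetical record layout: the real scraper output may differ.
    record = {"id": doc_id, "source_url": url, "text": text}
    path.write_text(json.dumps(record, ensure_ascii=False), encoding="utf-8")
    return path


def load_chunks(data_dir: Path, chunk_size: int = 200) -> list[dict]:
    """Split every stored document into chunks that remember their origin."""
    chunks = []
    for path in sorted(data_dir.glob("*.json")):
        record = json.loads(path.read_text(encoding="utf-8"))
        text = record["text"]
        # Naive fixed-size character chunking, chosen only to keep the sketch short.
        for start in range(0, len(text), chunk_size):
            chunks.append({
                "doc_id": record["id"],
                "source_url": record["source_url"],
                "offset": start,
                "text": text[start:start + chunk_size],
            })
    return chunks


def sources_for(chunks: list[dict]) -> list[str]:
    """Return the distinct source URLs behind a set of chunks, in first-seen order."""
    return list(dict.fromkeys(chunk["source_url"] for chunk in chunks))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "scraped"
        save_document(data_dir, "doc-1", "https://example.org/page-1", "lorem ipsum " * 40)
        save_document(data_dir, "doc-2", "https://example.org/page-2", "short note")
        all_chunks = load_chunks(data_dir)
        print(len(all_chunks), "chunks from", sources_for(all_chunks))
```

Each chunk's `text` would be passed to whatever embedding model is chosen, and the remaining fields would be stored next to the vector as metadata. When the agent retrieves chunks to answer a question, `sources_for` applied to those chunks gives the list of sources to show the user.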
User Story
As a developer on this team, I need a large amount of acquired data saved in a centralised location so that I can use it to generate embeddings and to accurately track the sources of the results the agent provides to the end user.
Acceptance Criteria
Definition of Done