MuleSoft-AI-Chain-Project / mac-vectors

MuleChain Vectors Connector
https://mulechain.ai/docs/mulechain-vectors/connector-overview
MIT License

[Suggestion] Adding different options for uploading files to a Vector Database #12

Open v-j-jpg opened 1 week ago

v-j-jpg commented 1 week ago

Hi everyone,

I've set up the MAC Vectors connector and deployed it to CloudHub 2.0. While uploading files, I noticed that the connector can only read files from a local file path. To work around this, I had to write my files to the /tmp folder in CloudHub and delete them after the upload. Example of the flow below:

image

It would be nice if the connector could upload files directly from, for example, an Azure or AWS bucket or Google Drive. Writing to the /tmp folder has its limitations: you don't know how much free space is actually available, and files can sometimes be large, so you risk crashing the pod when it runs out of disk space.
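Until a direct-upload option exists, the staging step can at least guard against a full disk. Below is a minimal sketch of that idea in Python (it is not Mule configuration and not part of the connector), assuming the payload size is known before writing. `stage_file` and the `headroom_bytes` safety margin are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def stage_file(
    data: bytes,
    headroom_bytes: int = 16 * 1024 * 1024,  # hypothetical safety margin
    directory: Optional[str] = None,
) -> Iterator[str]:
    """Write data to a temp file, yield its path, and delete it afterwards."""
    target_dir = directory or tempfile.gettempdir()

    # Refuse to write if the payload plus the margin would not fit.
    free = shutil.disk_usage(target_dir).free
    if len(data) + headroom_bytes > free:
        raise OSError(
            f"not enough free space in {target_dir}: "
            f"need {len(data) + headroom_bytes} bytes, have {free}"
        )

    fd, path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        # Clean up even if the upload step raised an exception.
        if os.path.exists(path):
            os.remove(path)
```

A caller would wrap the upload in `with stage_file(payload) as path:` and hand `path` to whatever reads from the local file system. The check only reduces the risk: free space can still change between the check and the write.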

If the files are already in an S3 bucket, the flow would look like this:

image

This approach uses far more resources than loading the files directly.
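To illustrate what loading directly could mean, independent of the connector: the source object is read as a stream and passed on in fixed-size chunks, so memory use stays bounded and nothing is written to /tmp. This is a sketch in Python under stated assumptions, not the connector's API. `iter_chunks` is a hypothetical helper, and the source can be any object with a `read(size)` method.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


# Stand-in for a remote object body; an in-memory buffer keeps the sketch runnable.
source = io.BytesIO(b"abcdefg")
chunks = list(iter_chunks(source, chunk_size=3))
```

With boto3, for example, the `Body` returned by `get_object` exposes a compatible `read` method, so it could take the place of `source` here. Whether the ingestion side can accept a stream rather than a file path is exactly what this suggestion asks for.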

amirkhan-ak-sf commented 6 days ago

@v-j-jpg, @alick888 has implemented the S3 bucket file drop for MAC Vectors on the "add/S3" branch.

Would you mind testing it?