dgarnitz / vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
https://www.getvectorflow.com/
Apache License 2.0
670 stars, 47 forks

added support for docx files to s3 endpoint #59

Closed by kpriver55 1 year ago

kpriver55 commented 1 year ago

WHAT

Added support for .docx files to the S3 endpoint of the API.
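
For readers unfamiliar with the format, here is a minimal sketch of how text can be pulled out of a .docx payload once it has been downloaded from a pre-signed URL. It is not the code in this PR: the helper names `extract_docx_text` and `is_docx` are hypothetical, and it uses only the Python standard library rather than whatever parser the API actually relies on. It assumes the file is a well-formed Office Open XML document with its body in `word/document.xml`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def is_docx(filename: str) -> bool:
    """Hypothetical helper: decide by extension whether a file is a .docx."""
    return filename.lower().endswith(".docx")


def extract_docx_text(data: bytes) -> str:
    """Hypothetical helper: return paragraph text from raw .docx bytes.

    A .docx file is a zip archive; the main body lives in word/document.xml.
    Each <w:p> element is a paragraph and each <w:t> element is a text run.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # Join all text runs inside the paragraph in document order.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)
```

This sketch ignores headers, footers, and footnotes, and may repeat text for paragraphs nested in text boxes, so a dedicated library is likely a better fit for production use.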

VERIFICATION

The screenshot below shows the job uploading successfully when given an AWS S3 pre-signed URL: (image: Screenshot 2023-09-26 202041)

The screenshot below shows the job created by the previous curl request completing successfully: (image: Screenshot 2023-09-26 202114)

Lastly, the screenshot below shows the API unit tests running with no exceptions: (image: Screenshot 2023-09-26 202212)