vectara / vectara-ingest

An open source framework to crawl data sources and ingest into Vectara
https://vectara.com
Apache License 2.0

enhancement: add support for Arguflow #43

Closed cdxker closed 6 months ago

cdxker commented 1 year ago

This is the Arguflow team’s submission to the “best contribution to vectara-ingest” part of the hackathon. We really admire the work that has gone into this repository, and we want to start a trend of making it compatible with more varied services.

This PR adds support for Arguflow to the crawler such that users are able to add documents/chunks to Arguflow, Vectara, or both.

Internally, we were motivated to add this support so that we can stand up more Arguflow demos by using the crawlers. However, we are PR’ing it because we think it can also offer value to Vectara users.

Arguflow supports a few things that Vectara does not, which users may want:

This repository is great, and we really admire @ofermend’s work on it especially. Excited to enhance it and bring it more into the lens of the open source AI world! We also edited the documentation in all the right places; let us know if we may have missed a few spots.

Happy to address any review comments or change requests in a timely manner.

ofermend commented 11 months ago

Hi there!  Thank you for offering the contribution.  However, the vectara-ingest project is not intended to be a general-purpose ETL tool for arbitrary destinations.  There are a variety of generic ETL tools on the market, and the reason vectara-ingest is separate is in the name: to optimize data loading into Vectara.  A couple thoughts:

eskibars commented 6 months ago

Closing this PR as won't merge