dgarnitz / vectorflow

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
https://www.getvectorflow.com/
Apache License 2.0
670 stars, 47 forks

Remove Unnecessary dependencies from API #112

Closed: dgarnitz closed this issue 5 months ago

dgarnitz commented 5 months ago

We should remove dependencies that are not being used from the API. In particular, we should 1) remove the llama hub connectors, since almost none of them are used. The only one in use is the Markdown connector, which we should swap for a much smaller markdown-to-text loader. 2) remove libraries like langchain, llama index, unstructured, langsmith, torch, and transformers as dependencies.
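For illustration, a much smaller markdown-to-text loader could be built on the standard library alone. This is only a rough sketch, not the code that landed in the linked PR: the function name `markdown_to_text` is hypothetical, and the regex rules cover only common Markdown syntax (headings, lists, links, images, emphasis, inline code, fences) rather than the full spec.

```python
import re


def markdown_to_text(markdown: str) -> str:
    """Strip common Markdown syntax and return plain text.

    Hypothetical helper for illustration; it is a naive regex pass,
    not a complete Markdown parser.
    """
    text = markdown
    # Drop fence marker lines (```lang) but keep the code between them.
    text = re.sub(r"^```.*$", "", text, flags=re.MULTILINE)
    # Images: ![alt](url) -> alt
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Links: [label](url) -> label
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Headings: remove leading '#' markers.
    text = re.sub(r"^ {0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    # Block quotes: remove the leading '>'.
    text = re.sub(r"^ {0,3}>[ \t]?", "", text, flags=re.MULTILINE)
    # Bullet and numbered list markers.
    text = re.sub(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)
    # Bold, then italic. Underscore forms must not sit inside a word,
    # so identifiers like snake_case_name are left alone.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"(?<!\w)__(.+?)__(?!\w)", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    text = re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"\1", text)
    # Inline code: `code` -> code
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Collapse runs of blank lines left behind by the removals.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    sample = "# Title\n\nSome **bold** text with a [link](https://example.com).\n"
    print(markdown_to_text(sample))
```

A loader like this adds no third-party packages, which is the point of the proposal; the trade-off is that unusual Markdown (tables, nested emphasis, HTML blocks) passes through unprocessed.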

dgarnitz commented 5 months ago

Resolved by https://github.com/dgarnitz/vectorflow/pull/114