stanfordnlp / dspy

DSPy: The framework for programming—not prompting—foundation models
https://dspy-docs.vercel.app/
MIT License
16.68k stars 1.29k forks

Implementing Lance Retrieval Plugin #202

Open haydenbspence opened 10 months ago

haydenbspence commented 10 months ago

Pinecone is a great retrieval option, but it does not meet some major needs for researchers.

Lance has the advantage that it can be used as a flat file, much like DuckDB, SQLite, or Parquet. This matters for applications where security is a priority, such as healthcare research.

Lance also integrates with many tools (Pandas/Polars, PyArrow), which makes it ideal for switching between languages and systems (e.g., R-Arrow and PyArrow can share dataframes with zero copy). Lance is the default database for anything-llm, primarily because it requires zero setup and can be deployed without credentials or other configuration. It's actively maintained and was noted by Wes McKinney, the author of Pandas, in his blog post on the future of composable data systems, which gives me confidence it has staying power in the ecosystem.

okhat commented 7 months ago

I think someone merged this recently, unless I'm mixing things up.

GFarnon commented 7 months ago

Hi @okhat - is this right? I wanted to test this, but couldn't find anything.