Pinecone is a great retrieval option, but it does not meet some major needs of researchers.
Lance has the advantage that it can be used as a local, file-based store, much like DuckDB, SQLite, or Parquet. This matters for applications where security is a priority, such as healthcare research.
Lance also integrates well with the wider data ecosystem (Pandas/Polars, PyArrow), which makes it well suited to switching between languages and systems (e.g., R-Arrow and PyArrow can share dataframes with zero copy). Lance is the default database for anything-llm, primarily because it requires zero setup and can be deployed without credentials or other configuration. It is actively maintained and was noted by Wes McKinney, the author of Pandas, in his blog post on the future of composable data systems, which gives me confidence that it has staying power in the ecosystem.