DeployQL / LintDB

Vector Database with support for late interaction and token level embeddings.
https://www.lintdb.com/
Apache License 2.0

A user should get back the document they indexed #21

Closed: mtbarta closed this issue 6 months ago

mtbarta commented 6 months ago

As a user, I want to store document data, not just the id, so that it can be retrieved and used downstream.

results = index.search(...)
# [{"id": 123, "score": 0.98, "document": {"title": "the fox jumped over..", "content": "the fox jumped over..."}}, ...]

Acceptance Criteria

Documents can be flexible structures.

Consider what a document's schema might look like. Here are some docs from Milvus for reference: https://milvus.io/docs/multi-vector-search.md

We could support a fixed set of fields within the document and incrementally add support for more fields.

We should also consider that flexible documents with dynamic schemas will be harder to optimize.

Document fields could be used as filters.

This interacts with #9. A field could be indexed, stored, or both.
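
To make the indexed/stored distinction concrete, here is a rough sketch of how a per-field schema might work. Everything in it (FieldSpec, SCHEMA, split_document, the "category" field) is hypothetical and only for illustration; none of it is LintDB's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical per-field settings; not part of LintDB's API."""
    name: str
    indexed: bool = False  # field can be used as a filter
    stored: bool = False   # field is returned with search results


# Assumed example schema: "title" is indexed and stored,
# "content" is stored only, "category" is indexed only.
SCHEMA = [
    FieldSpec("title", indexed=True, stored=True),
    FieldSpec("content", stored=True),
    FieldSpec("category", indexed=True),
]


def split_document(doc, schema):
    """Split a document into the part kept for filtering and the part returned to the user."""
    indexed = {f.name: doc[f.name] for f in schema if f.indexed and f.name in doc}
    stored = {f.name: doc[f.name] for f in schema if f.stored and f.name in doc}
    return indexed, stored


doc = {
    "title": "the fox",
    "content": "the fox jumped over...",
    "category": "animals",
}
indexed, stored = split_document(doc, SCHEMA)
```

With a schema like this, only the `stored` part would come back under `document` in the search results, while the `indexed` part would be what a filter from #9 runs against. Declaring fields up front also sidesteps some of the optimization concerns with fully dynamic schemas.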