jina-ai / annlite

⚡ A fast embedded library for approximate nearest neighbor search
Apache License 2.0

refactor: remove docarray dependency #229

Open JoanFM opened 1 year ago

numb3r3 commented 1 year ago

Maybe we can define a simple pydantic model as the basic data structure:

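from typing import Any, Dict, List, Union

import numpy as np
from pydantic import BaseModel, Field, validator  # pydantic v1-style API

class Document(BaseModel):
    id: str  # required: no default
    embedding: Union[np.ndarray, List[float]]  # required: no default
    meta_data: Dict[str, Any] = Field(default_factory=dict)

    class Config:
        arbitrary_types_allowed = True  # lets pydantic accept np.ndarray

    @validator("embedding")
    def validate_embedding(cls, v):
        # validate the shape of the embedding
        ...
        return v

class Query(BaseModel):
    ...

class Result(BaseModel):
    ...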
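
The shape check left elided in the validator could be as small as the sketch below. It is only an illustration: the function name `check_embedding_dim` and the `expected_dim` parameter are hypothetical and not part of annlite, and it uses plain Python so it does not depend on any particular pydantic version.

```python
from typing import Sequence


def check_embedding_dim(embedding: Sequence[float], expected_dim: int) -> Sequence[float]:
    """Return the embedding unchanged if it has exactly `expected_dim` entries.

    Hypothetical helper: raises ValueError on a length mismatch.
    """
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected an embedding of length {expected_dim}, got {len(embedding)}"
        )
    return embedding
```

A validator on the model could call such a helper and return its result, so that malformed vectors are rejected before they reach the index.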

Anyway, this idea can go in a new PR instead.

JoanFM commented 1 year ago

@numb3r3 I still want to make one change: allow setting the key from which the embedding is extracted, rather than always relying on embedding.

Since my idea is to expose this via DocArray, I will not use Pydantic here, because the validation would come from DocArray.

The idea is that this library remains focused only on ANN search, nothing else.
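
To make the configurable key concrete, here is a minimal sketch using plain dictionaries. The names `extract_embeddings` and `embedding_key`, and the `text_vector` field, are assumptions for illustration only; they are not annlite or DocArray APIs.

```python
from typing import Any, List, Mapping, Sequence


def extract_embeddings(
    docs: Sequence[Mapping[str, Any]],
    embedding_key: str = "embedding",
) -> List[Any]:
    """Collect the vector stored under `embedding_key` from each document.

    Hypothetical helper: raises KeyError if a document lacks the field.
    """
    vectors = []
    for i, doc in enumerate(docs):
        if embedding_key not in doc:
            raise KeyError(f"document {i} has no field {embedding_key!r}")
        vectors.append(doc[embedding_key])
    return vectors


# Documents that keep their vectors under a non-default field name.
docs = [
    {"id": "a", "text_vector": [0.1, 0.2]},
    {"id": "b", "text_vector": [0.3, 0.4]},
]
vectors = extract_embeddings(docs, embedding_key="text_vector")
```

With a design along these lines the index only needs to know which field holds the vector; validating the document structure stays with the caller.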