tensorchord / pgvecto.rs-py

PGVecto.rs Python library
Apache License 2.0

[new feature] filter metadata on date/datetime fields #18

Open SebG-js opened 1 week ago

SebG-js commented 1 week ago

Hello,

When I insert data into the database, I store embeddings along with metadata such as filename, extension, size, and modification date. In some cases, I want only the older files, or only the newer ones, that are close to a query embedding.

When I search documents with the Python SDK, I use the embedding plus other criteria such as the modification date.

Today the filter supports only meta_contains. It would be nice to support comparisons such as meta["date"] <= ... or meta["date"] > ...

For now, I use this workaround: generate a big meta_contains dict.

VoVAllen commented 1 week ago

Are you using SQLAlchemy or Psycopg3?

SebG-js commented 1 week ago

I use Psycopg3:

from pgvecto_rs.sdk import PGVectoRs, Record
from uuid import UUID

# doc: https://github.com/tensorchord/pgvecto.rs-py/blob/main/src/pgvecto_rs/sdk/client.py
class Client:

    def __init__(self, config: dict, table: str):
        url = f'postgresql+psycopg://{config["username"]}:{config["password"]}@{config["host"]}:{config["port"]}/{config["database"]}'
        self.client = PGVectoRs(
            db_url=url,
            collection_name=table,
            dimension=1024,
            recreate=False
        )

VoVAllen commented 1 week ago

It's hard to support every kind of filter. If you prefer that kind of interface, SQLAlchemy might be a better choice, since it fully supports such primitives.
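
For illustration, whichever driver issues the query, a date comparison on metadata comes down to a JSONB expression in the WHERE clause. Below is a minimal sketch, not the SDK's API: build_dated_search is a hypothetical helper, and it assumes the table has id, text, meta (JSONB) and embedding columns, that meta["date"] holds an ISO-8601 timestamp string, and that `<->` is the distance operator you want.

```python
from datetime import datetime, timezone
from typing import Sequence, Tuple


def build_dated_search(
    table: str,
    embedding: Sequence[float],
    modified_before: datetime,
    top_k: int = 5,
) -> Tuple[str, tuple]:
    """Return (sql, params) for a nearest-neighbour query limited by date.

    Hypothetical helper: the column names (id, text, meta, embedding) and
    the "date" metadata key are assumptions, not confirmed by the SDK.
    """
    # Table names cannot be bound as parameters, so validate before inlining.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    # Vector values are passed as a text literal such as "[0.1,0.2]".
    vector_literal = "[" + ",".join(str(float(x)) for x in embedding) + "]"

    sql = (
        f"SELECT id, text, meta FROM {table} "
        # meta->>'date' extracts the value as text; the cast makes it comparable.
        "WHERE (meta->>'date')::timestamptz <= %s "
        "ORDER BY embedding <-> %s::vector "
        "LIMIT %s"
    )
    return sql, (modified_before, vector_literal, top_k)


sql, params = build_dated_search(
    "collection_docs",  # hypothetical table name
    [0.1, 0.2],
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    top_k=3,
)
```

The resulting pair can be passed to a Psycopg3 cursor as cur.execute(sql, params). Swapping `<=` for `>` gives the "newer files" variant.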