Closed: frieda-huang closed this 2 weeks ago
Hi @frieda-huang, for arrays with SQLAlchemy, you'll need to call register_vector on the underlying adapter. For Psycopg 2, use:
from pgvector.psycopg2 import register_vector

with engine.connect() as connection:
    register_vector(connection.connection.dbapi_connection, globally=True, arrays=True)
Added a test case in the commit above.
Thank you for the quick reply! I'm still getting the same error despite calling register_vector_async. I'm using Psycopg 3:
async def add(self, vector_embedding: List[npt.NDArray], page: Page) -> Embedding:
    conn = await psycopg.AsyncConnection.connect(dbname=DBNAME, autocommit=True)
    async with conn:
        await register_vector_async(conn)
    embedding = Embedding(
        vector_embedding=vector_embedding,
        page=page,
        last_modified=get_now(),
        created_at=get_now(),
    )
    self.session.add(embedding)
    return embedding
Error:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/sqlalchemy/sql/sqltypes.py", line 3144, in <genexpr>
return collection_callable(itemproc(x) for x in arr)
^^^^^^^^^^^
File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/pgvector/sqlalchemy/halfvec.py", line 33, in process
return HalfVector._from_db(value)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/pgvector/utils/halfvec.py", line 71, in _from_db
return cls.from_text(value)
^^^^^^^^^^^^^^^^^^^^
File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/pgvector/utils/halfvec.py", line 36, in from_text
return cls([float(v) for v in value[1:-1].split(',')])
^^^^^^^^
ValueError: could not convert string to float: ''
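One plausible reading of this traceback, sketched in plain Python: if the driver has no array loader registered for halfvec, the column may come back as one raw string instead of a list of elements, and the per-item processor then runs on single characters. The names `from_text` and `process_array` below are hypothetical stand-ins that only mirror the two expressions visible in the traceback, and the raw string format is an assumption about PostgreSQL's text output for arrays, not something taken from the libraries' source.

```python
def from_text(value: str) -> list[float]:
    # Mirrors the expression in the traceback: drop the first and last
    # character (the brackets), then convert each comma-separated item.
    return [float(v) for v in value[1:-1].split(',')]


def process_array(arr, itemproc=from_text):
    # Mirrors the generator in the traceback: apply itemproc to each element.
    return list(itemproc(x) for x in arr)


# Case 1: the array was decoded, so each element is one vector in text form.
decoded = ["[1,2,3]", "[4,5,6]"]
parsed = process_array(decoded)

# Case 2 (assumed): the array arrives as a single undecoded string.
# Iterating over a string yields characters, so the first "element" is "{",
# and "{"[1:-1] is the empty string.
raw = '{"[1,2,3]","[4,5,6]"}'
try:
    process_array(raw)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(parsed)
print(error_message)
```

If this reading is right, the stored values are not malformed; the error comes from how the result is decoded on the connection that runs the query.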
For Psycopg 3, you'll need to call it on the connection used for the session (self.session.connection()), rather than on a new connection.
The best way to do this would be to use the connect event right after you define the engine. For Psycopg 3:
from pgvector.psycopg import register_vector
from sqlalchemy import event

@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    register_vector(dbapi_connection)
Is there a way we can do it using async? I'm using it along with FastAPI. Having tried multiple approaches including adding the register_vector logic in FastAPI's lifespan, still got the same error :(
If you're using create_async_engine, you'll want to use:
from pgvector.psycopg import register_vector_async
from sqlalchemy import event

@event.listens_for(engine.sync_engine, "connect")
def connect(dbapi_connection, connection_record):
    dbapi_connection.run_async(register_vector_async)
Thank you! It works now!
Hi! I keep getting ValueError: could not convert string to float: '' due to the values in my halfvec column being malformed. This is what my halfvec column looks like:
I also made sure to convert each vector using np.array before upsert:
vector_embedding = [np.array(e) for e in embeddings]
Interestingly, the other field
vector_embedding: Mapped[np.array] = mapped_column(HALFVEC(VECT_DIM))
in a separate table works just fine. Am I missing something when using array halfvec?