Closed chrispy-snps closed 1 month ago
Hi @chrispy-snps, HALFVEC
is available in 0.3.0+.
@ankane - thanks for the confirmation! This helped me find the solution.
In the SQLAlchemy page here:
https://github.com/pgvector/pgvector-python/?tab=readme-ov-file#sqlalchemy
it says the following:
Also supports
HALFVEC
,BIT
, andSPARSEVEC
I used my IDE's case completion to find (what I thought was) a type with similar camel casing (HalfVector
) to the example in the docs:
sqlalchemy.Column("vec", pgvector.sqlalchemy.HalfVector(4)),
and this does find a class, but for the vector class and not the type class, and thus an error occurs:
...rest of trace omitted...
ValueError: expected ndim to be 1
When I changed my type to all-capitals (HALFVEC
), then it works:
import sqlalchemy
import pgvector.sqlalchemy
import random
engine = sqlalchemy.create_engine("...")
with engine.begin() as conn:
conn.execute(sqlalchemy.text("CREATE EXTENSION IF NOT EXISTS vector"))
conn.execute(sqlalchemy.text("DROP TABLE IF EXISTS mytable"))
metadata = sqlalchemy.MetaData()
mytable = sqlalchemy.Table(
"mytable",
metadata,
sqlalchemy.Column("id", sqlalchemy.Integer, primary_key=True),
sqlalchemy.Column("vec", pgvector.sqlalchemy.HALFVEC(4)),
# ^^^^^^^
)
metadata.create_all(engine)
with engine.begin() as conn:
conn.execute(
mytable.insert(),
[
{
"id": i,
"vec": [random.random()] * 4,
}
for i in range(3)
],
)
Interestingly both Vector(4)
and VECTOR(4)
seem to work as SQLAlchemy type specifications,
The majority of our vector store database size (80%!) comes from the embedding vectors, not the document data.
We would like to use the
HALFVEC
vector type to reduce database size. However, currently thepgvector.sqlalchemy
package supports only theVECTOR
vector type. It would be great if additional SQLAlchemy types were provided for other vector types.For others interested in this issue, you can create document-only and vector-only versions of your table, then do:
to see the actual database storage requirements for each table.