pgvector / pgvector-python

pgvector support for Python
MIT License

Migrating From `VectorField` -> `HalfVectorField` #73

Closed: rsomani95 closed 5 months ago

rsomani95 commented 5 months ago

Is migrating from vector to half-vector fields supported? I'm on the main branch at commit 8be15b29e90082b01a173503624659437035cde5, and this is how my models are set up:

import numpy as np
from django.db.models import Model
from pgvector.django import VectorField, HalfVectorField

class CustomVectorField(VectorField):
# class CustomVectorField(HalfVectorField):
    def to_python(self, data):
        # Convert plain lists to float32 arrays; defer everything else to the parent field
        if isinstance(data, list):
            return np.array(data, dtype=np.float32)
        else:
            return super().to_python(data)

class Segment(Model):
    embedding = CustomVectorField(dimensions=640, null=True)

I changed the above CustomVectorField to inherit from HalfVectorField instead of VectorField, expecting to be able to migrate the embeddings, but running makemigrations throws the following error:

Exception: Don't know how to convert the Django field Segment.embedding

I'm curious if I need to run custom SQL to do this migration, or is this not supported by pgvector yet?

ankane commented 5 months ago

Hi @rsomani95, you can change the type of a column in SQL with:

ALTER TABLE segment ALTER COLUMN embedding TYPE halfvec(640);
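
Since the model lives in a Django project, the statement above could be kept alongside its inverse so the change is reversible. The sketch below only builds the two SQL strings; the helper name `alter_vector_type_sql` is hypothetical, and the table name `segment` is taken from the statement above (Django's default table name is usually `<app_label>_<model>`, so it may need adjusting).

```python
def alter_vector_type_sql(table, column, dimensions, to_half=True):
    """Return (forward_sql, reverse_sql) for switching a column between
    vector and halfvec. Hypothetical helper; names are assumptions."""
    dimensions = int(dimensions)  # reject non-numeric input early
    new_type, old_type = ("halfvec", "vector") if to_half else ("vector", "halfvec")
    forward = (
        f'ALTER TABLE "{table}" ALTER COLUMN "{column}" '
        f"TYPE {new_type}({dimensions});"
    )
    reverse = (
        f'ALTER TABLE "{table}" ALTER COLUMN "{column}" '
        f"TYPE {old_type}({dimensions});"
    )
    return forward, reverse


# Assumed table/column names, matching the SQL in the comment above.
forward_sql, reverse_sql = alter_vector_type_sql("segment", "embedding", 640)
print(forward_sql)
print(reverse_sql)
```

In a hand-written Django migration, these strings could be passed as `migrations.RunSQL(forward_sql, reverse_sql=reverse_sql)`. Reversing restores the column type only; precision lost in the conversion to half precision is not recovered.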

However, if you have a lot of rows and can't tolerate downtime, you'll want to migrate in a safer way (which uses a new column instead).
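
One way the new-column approach might look is sketched below. This is an assumption about the general pattern (add a `halfvec` column, backfill in batches, then swap), not a procedure confirmed in this thread. The helper name `new_column_migration_sql`, the temporary column suffix `_half`, the integer primary key `id`, and the batch size are all hypothetical.

```python
def new_column_migration_sql(table, column, dimensions, max_id,
                             batch_size=10_000, pk="id"):
    """Build the SQL steps for a new-column migration from vector to halfvec.

    Assumes an integer primary key `pk` with values in 1..max_id.
    Returns a list of statements to run in order.
    """
    tmp = f"{column}_half"  # hypothetical name for the temporary column
    steps = [f'ALTER TABLE "{table}" ADD COLUMN "{tmp}" halfvec({dimensions});']

    # Backfill in primary-key ranges so each UPDATE touches a bounded number of rows.
    for start in range(1, max_id + 1, batch_size):
        end = min(start + batch_size - 1, max_id)
        steps.append(
            f'UPDATE "{table}" SET "{tmp}" = "{column}"::halfvec({dimensions}) '
            f'WHERE "{pk}" BETWEEN {start} AND {end};'
        )

    # Swap the columns once the backfill is complete.
    steps.append(f'ALTER TABLE "{table}" DROP COLUMN "{column}";')
    steps.append(f'ALTER TABLE "{table}" RENAME COLUMN "{tmp}" TO "{column}";')
    return steps


# Assumed values: 25,000 rows in a table named "segment".
steps = new_column_migration_sql("segment", "embedding", 640, max_id=25_000)
for statement in steps:
    print(statement)
```

Rows written while the backfill is running would also need their new column populated (for example by writing to both columns in application code), any index on the old column would have to be rebuilt on the new one, and the final drop and rename would typically run in a single transaction.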