CircleCI-Public / cimg-postgres

MIT License

Add pgvector support #112

Open kendagriff opened 11 months ago

kendagriff commented 11 months ago

Description

Adds support for pgvector, a Postgres extension for vector embeddings that is essential when working with GPT4 and isn't included natively in contrib.

Reasons

I haven't been able to find an effective way to add pgvector support to CircleCI without a custom image. This PR brings pgvector support out of the box. pgvector is useful when working with LLMs, e.g. GPT4: it adds embeddings and similarity queries to Postgres.

An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.

https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
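
To make the "distance measures relatedness" idea concrete, here is a minimal Python sketch of two common distance measures. The three-dimensional vectors and their names (`cat`, `kitten`, `car`) are made up for illustration; real embeddings have hundreds or thousands of dimensions. As far as I know, pgvector exposes the same two measures in SQL as the `<->` (Euclidean) and `<=>` (cosine distance) operators, but this snippet doesn't touch Postgres at all.

```python
import math


def euclidean_distance(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", invented purely for this example.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

# Related items should be closer together than unrelated ones.
print(euclidean_distance(cat, kitten) < euclidean_distance(cat, car))
print(cosine_distance(cat, kitten) < cosine_distance(cat, car))
```

With these toy values, `cat` ends up much closer to `kitten` than to `car` under both measures, which is the "small distance, high relatedness" behaviour the quote describes.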

NOTE: I added the installation steps for pgvector to the initial RUN as apt-get purge removes clang, which is necessary for make install.


kendagriff commented 11 months ago

@ryanbourdais Any interest in this?

kendagriff commented 10 months ago

@JalexChen: Bumping this again to see if there's any interest.

BrandonMathis commented 5 months ago

I would be quite interested in seeing this merged. I've started doing a lot of work with storing vector embeddings in Postgres and currently use this Docker image in my CircleCI builds.

Any tips on how I can switch to an image built with the code in this PR to test it out? I currently have this in my CircleCI config file:

      - image: cimg/postgres:12.15
        environment:
          POSTGRES_USER: 'user'
          POSTGRES_DB: 'test'
          POSTGRES_PASSWORD: ''