Zipstack / unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
https://unstract.com
GNU Affero General Public License v3.0
352 stars, 29 forks

feat: Update PostgreSQL image to include pgvector extension #426

Open mutaiib opened 5 days ago

mutaiib commented 5 days ago

What

Update PostgreSQL image to include pgvector extension.

Why

The current image under docker/docker-compose-dev-essentials:services:db does not come with the pgvector extension available, which prevents using PostgreSQL as the vector database.

How

Replaced the postgres:15.6 image with pgvector/pgvector:pg15.
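As a rough sketch of what this change looks like in the compose file (assuming the service is keyed as `db` under `services`, as referenced above; every other key of the service is omitted here and left unchanged):

```yaml
# Illustrative fragment only, not the full service definition.
services:
  db:
    # previously: image: postgres:15.6
    image: pgvector/pgvector:pg15
```

Note that the image only ships the extension. It typically still has to be enabled in each database that needs it, for example with `CREATE EXTENSION IF NOT EXISTS vector;`.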

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No, since the new image only adds pgvector and still runs on a supported version of PostgreSQL.

Database Migrations

N/A

Env Config

N/A

Relevant Docs

pgvector GitHub

Related Issues or PRs

N/A

Dependencies Versions

N/A

Notes on Testing

N/A

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

CLAassistant commented 5 days ago

CLA assistant check
All committers have signed the CLA.

hari-kuriakose commented 3 days ago

@mutaiib Thanks for the contribution!

This could definitely help when the user wants to use the same PostgreSQL instance both for Unstract backend metadata storage and as the Vector DB.

However, I wanted to confirm one thing. After changing the default Unstract DB to the pgvector-provided PostgreSQL instance:

  • Did the app run successfully?
  • Were you able to add the same instance as a Vector DB too?

NOTE: Later we could use the new pgvectorscale extension, which is said to be more efficient and performant. That should become possible once LlamaIndex supports it.

mutaibsha commented 3 days ago

> @mutaiib Thanks for the contribution!
>
> This could definitely help when the user wants to use the same PostgreSQL instance both for Unstract backend metadata storage and as the Vector DB.
>
> However, wanted to confirm one thing though. After changing default Unstract db to pgvector provided PostgreSQL instance:
>
>   • Did the app run successfully?
>   • Were you able to add the same instance as a Vector DB too?
>
> NOTE: Later we could use new pgvectorscale extension which is said to be more efficient and performant. It should become possible once LlamaIndex supports it.

Yup, both. The application ran successfully, and I also tested it with the same PostgreSQL instance added as the Vector DB.

BTW, I'll take a look at pgvectorscale.