pgvector / pgvector-node

pgvector support for Node.js, Deno, and Bun (and TypeScript)
MIT License

pgvector-node

Supports node-postgres, Knex.js, Objection.js, Kysely, Sequelize, pg-promise, Prisma, Postgres.js, Slonik, TypeORM, MikroORM, and Drizzle ORM

Installation

Run:

npm install pgvector

Then follow the instructions for your database library in the sections below.

node-postgres

Enable the extension

await client.query('CREATE EXTENSION IF NOT EXISTS vector');

Register the types for a client

import pgvector from 'pgvector/pg';

await pgvector.registerTypes(client);

or a pool

pool.on('connect', async function (client) {
  await pgvector.registerTypes(client);
});

Create a table

await client.query('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))');

Insert a vector

await client.query('INSERT INTO items (embedding) VALUES ($1)', [pgvector.toSql([1, 2, 3])]);
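
For orientation, the parameter sent with this insert is the text form of a vector, for example `[1,2,3]`. The sketch below imitates that serialization in plain JavaScript. `toVectorLiteral` is a hypothetical helper written only for illustration, not the library's implementation; application code should keep using `pgvector.toSql`.

```javascript
// Hypothetical helper (not part of pgvector-node): builds the text
// representation of a vector value, e.g. '[1,2,3]'.
function toVectorLiteral(values) {
  if (!Array.isArray(values)) {
    throw new TypeError('expected an array of numbers');
  }
  const numbers = values.map((v) => {
    const n = Number(v);
    // Reject NaN and +/-Infinity so the literal only contains finite numbers
    if (!Number.isFinite(n)) {
      throw new RangeError(`vector elements must be finite numbers, got ${v}`);
    }
    return n;
  });
  return `[${numbers.join(',')}]`;
}

console.log(toVectorLiteral([1, 2, 3]));    // [1,2,3]
console.log(toVectorLiteral([0.5, -1, 2])); // [0.5,-1,2]
```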

Get the nearest neighbors to a vector

const result = await client.query('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', [pgvector.toSql([1, 2, 3])]);

Add an approximate index

await client.query('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)');
// or
await client.query('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)');

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
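
As a rough sketch of how the operator classes pair with query operators: in pgvector, `<->` is L2 distance, `<#>` is negative inner product, and `<=>` is cosine distance, and an index is only used when its operator class matches the operator in the query. The helpers below (`createIndexSql`, `nearestSql`) are hypothetical names introduced for illustration and simply build SQL strings.

```javascript
// Operator class (for CREATE INDEX) and distance operator (for ORDER BY)
// for each metric mentioned above.
const METRICS = new Map([
  ['l2', { opclass: 'vector_l2_ops', operator: '<->' }],
  ['innerProduct', { opclass: 'vector_ip_ops', operator: '<#>' }],
  ['cosine', { opclass: 'vector_cosine_ops', operator: '<=>' }]
]);

function lookupMetric(metric) {
  const entry = METRICS.get(metric);
  if (!entry) throw new Error(`unknown metric: ${metric}`);
  return entry;
}

// Hypothetical helpers: table and column names are interpolated directly,
// so pass only trusted identifiers.
function createIndexSql(table, column, metric, method = 'hnsw') {
  const { opclass } = lookupMetric(metric);
  return `CREATE INDEX ON ${table} USING ${method} (${column} ${opclass})`;
}

function nearestSql(table, column, metric, limit = 5) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const { operator } = lookupMetric(metric);
  // $1 is the query vector, passed as a bound parameter
  return `SELECT * FROM ${table} ORDER BY ${column} ${operator} $1 LIMIT ${limit}`;
}

console.log(createIndexSql('items', 'embedding', 'cosine'));
console.log(nearestSql('items', 'embedding', 'cosine'));
```

With node-postgres, the generated strings would be passed to `client.query`, with `pgvector.toSql([1, 2, 3])` as the `$1` parameter of the second one.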

See a full example

Knex.js

Import the library

import pgvector from 'pgvector/knex';

Enable the extension

await knex.schema.createExtensionIfNotExists('vector');

Create a table

await knex.schema.createTable('items', (table) => {
  table.increments('id');
  table.vector('embedding', 3);
});

Insert vectors

const newItems = [
  {embedding: pgvector.toSql([1, 2, 3])},
  {embedding: pgvector.toSql([4, 5, 6])}
];
await knex('items').insert(newItems);

Get the nearest neighbors to a vector

const items = await knex('items')
  .orderBy(knex.l2Distance('embedding', [1, 2, 3]))
  .limit(5);

Also supports maxInnerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance
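
To make the numeric distance functions concrete, here is a plain JavaScript sketch of what the database computes for L2, inner product, cosine, and L1. The function names are hypothetical and the code is a reference calculation only; in queries the work is done by PostgreSQL. `maxInnerProduct` orders by the negative inner product (the `<#>` operator), so larger inner products sort first.

```javascript
// Reference calculations for illustration; not part of pgvector-node.
function checkSameLength(a, b) {
  if (a.length !== b.length) {
    throw new RangeError('vectors must have the same number of dimensions');
  }
}

function dot(a, b) {
  checkSameLength(a, b);
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Euclidean (L2) distance
function l2(a, b) {
  checkSameLength(a, b);
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Negative inner product, so that ascending order means "most similar first"
function negativeInnerProduct(a, b) {
  return -dot(a, b);
}

// Cosine distance = 1 - cosine similarity
function cosine(a, b) {
  return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Taxicab (L1) distance
function l1(a, b) {
  checkSameLength(a, b);
  return a.reduce((sum, x, i) => sum + Math.abs(x - b[i]), 0);
}

console.log(l2([1, 2, 3], [4, 5, 6]));
console.log(negativeInnerProduct([1, 2, 3], [4, 5, 6])); // -32
console.log(cosine([1, 0], [0, 1]));                     // 1
console.log(l1([1, 2, 3], [4, 5, 6]));                   // 9
```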

Add an approximate index

await knex.schema.alterTable('items', function (table) {
  table.index(knex.raw('embedding vector_l2_ops'), 'index_name', 'hnsw');
});

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Objection.js

Import the library

import pgvector from 'pgvector/objection';

Enable the extension

await knex.schema.createExtensionIfNotExists('vector');

Create a table

await knex.schema.createTable('items', (table) => {
  table.increments('id');
  table.vector('embedding', 3);
});

Insert vectors

const newItems = [
  {embedding: pgvector.toSql([1, 2, 3])},
  {embedding: pgvector.toSql([4, 5, 6])}
];
await Item.query().insert(newItems);

Get the nearest neighbors to a vector

import { l2Distance } from 'pgvector/objection';

const items = await Item.query()
  .orderBy(l2Distance('embedding', [1, 2, 3]))
  .limit(5);

Also supports maxInnerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance

Add an approximate index

await knex.schema.alterTable('items', function (table) {
  table.index(knex.raw('embedding vector_l2_ops'), 'index_name', 'hnsw');
});

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Kysely

Enable the extension

import { sql } from 'kysely';
await sql`CREATE EXTENSION IF NOT EXISTS vector`.execute(db);

Create a table

await db.schema.createTable('items')
  .addColumn('id', 'serial', (cb) => cb.primaryKey())
  .addColumn('embedding', sql`vector(3)`)
  .execute();

Insert vectors

import pgvector from 'pgvector/kysely';

const newItems = [
  {embedding: pgvector.toSql([1, 2, 3])},
  {embedding: pgvector.toSql([4, 5, 6])}
];
await db.insertInto('items').values(newItems).execute();

Get the nearest neighbors to a vector

import { l2Distance } from 'pgvector/kysely';

const items = await db.selectFrom('items')
  .selectAll()
  .orderBy(l2Distance('embedding', [1, 2, 3]))
  .limit(5)
  .execute();

Also supports maxInnerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance

Get items within a certain distance

const items = await db.selectFrom('items')
  .selectAll()
  .where(l2Distance('embedding', [1, 2, 3]), '<', 5)
  .execute();

Add an approximate index

await db.schema.createIndex('index_name')
  .on('items')
  .using('hnsw')
  .expression(sql`embedding vector_l2_ops`)
  .execute();

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Sequelize

Enable the extension

await sequelize.query('CREATE EXTENSION IF NOT EXISTS vector');

Register the types

import { Sequelize, DataTypes } from 'sequelize';
import pgvector from 'pgvector/sequelize';

pgvector.registerTypes(Sequelize);

Add a vector field

const Item = sequelize.define('Item', {
  embedding: {
    type: DataTypes.VECTOR(3)
  }
}, ...);

Insert a vector

await Item.create({embedding: [1, 2, 3]});

Get the nearest neighbors to a vector

import { l2Distance } from 'pgvector/sequelize';

const items = await Item.findAll({
  order: l2Distance('embedding', [1, 1, 1], sequelize),
  limit: 5
});

Also supports maxInnerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance

Add an approximate index

const Item = sequelize.define('Item', ..., {
  indexes: [
    {
      fields: ['embedding'],
      using: 'hnsw',
      operator: 'vector_l2_ops'
    }
  ]
});

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

pg-promise

Enable the extension

await db.none('CREATE EXTENSION IF NOT EXISTS vector');

Register the types

import pgpromise from 'pg-promise';
import pgvector from 'pgvector/pg-promise';

const initOptions = {
  async connect(e) {
    await pgvector.registerTypes(e.client);
  }
};
const pgp = pgpromise(initOptions);

Create a table

await db.none('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))');

Insert a vector

await db.none('INSERT INTO items (embedding) VALUES ($1)', [pgvector.toSql([1, 2, 3])]);

Get the nearest neighbors to a vector

const result = await db.any('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', [pgvector.toSql([1, 2, 3])]);

Add an approximate index

await db.none('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)');
// or
await db.none('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)');

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance
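
The `lists = 100` setting in the IVFFlat example above is a tuning parameter. The pgvector documentation suggests starting from rows / 1000 for tables of up to 1M rows and sqrt(rows) above that. The sketch below encodes that starting point; `suggestedLists` is a hypothetical helper, and the rounding and the minimum of 1 are choices made for this example.

```javascript
// Hypothetical helper: a starting value for the IVFFlat `lists` option,
// following the rows / 1000 (up to 1M rows) and sqrt(rows) (above 1M) guidance.
function suggestedLists(rowCount) {
  if (!Number.isInteger(rowCount) || rowCount < 0) {
    throw new RangeError('rowCount must be a non-negative integer');
  }
  const raw = rowCount <= 1000000 ? rowCount / 1000 : Math.sqrt(rowCount);
  // Rounding and the floor of 1 are assumptions of this sketch
  return Math.max(1, Math.round(raw));
}

console.log(suggestedLists(100000));  // 100
console.log(suggestedLists(4000000)); // 2000
```

The result is only a first guess; the index should be created after the table has data, and the value tuned against measured recall and query speed.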

See a full example

Prisma

Note: prisma migrate dev does not support pgvector indexes

Import the library

import pgvector from 'pgvector';

Add the extension to the schema

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["postgresqlExtensions"]
}

datasource db {
  provider   = "postgresql"
  url        = env("DATABASE_URL")
  extensions = [vector]
}

Add a vector column to the schema

model Item {
  id        Int                       @id @default(autoincrement())
  embedding Unsupported("vector(3)")?
}

Insert a vector

const embedding = pgvector.toSql([1, 2, 3]);
await prisma.$executeRaw`INSERT INTO items (embedding) VALUES (${embedding}::vector)`;

Get the nearest neighbors to a vector

const embedding = pgvector.toSql([1, 2, 3]);
const items = await prisma.$queryRaw`SELECT id, embedding::text FROM items ORDER BY embedding <-> ${embedding}::vector LIMIT 5`;

See a full example (and the schema)
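
The Prisma query above selects `embedding::text`, so each row carries the vector as a string such as `[1,2,3]`. The sketch below shows one way to turn that text back into an array of numbers. `parseVectorText` is a hypothetical helper written for illustration and assumes the plain `vector` text format.

```javascript
// Hypothetical helper: parses the text form of a vector ('[1,2,3]')
// into an array of numbers.
function parseVectorText(text) {
  const trimmed = String(text).trim();
  if (!trimmed.startsWith('[') || !trimmed.endsWith(']')) {
    throw new SyntaxError(`not a vector literal: ${text}`);
  }
  const inner = trimmed.slice(1, -1).trim();
  if (inner === '') return [];
  return inner.split(',').map((part) => {
    const n = Number(part);
    // Number('') is 0, so empty elements are rejected explicitly
    if (part.trim() === '' || Number.isNaN(n)) {
      throw new SyntaxError(`invalid vector element: "${part}"`);
    }
    return n;
  });
}

console.log(parseVectorText('[1,2,3]'));      // [ 1, 2, 3 ]
console.log(parseVectorText('[0.5,-1,2e3]')); // [ 0.5, -1, 2000 ]
```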

Postgres.js

Import the library

import pgvector from 'pgvector';

Enable the extension

await sql`CREATE EXTENSION IF NOT EXISTS vector`;

Create a table

await sql`CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))`;

Insert vectors

const newItems = [
  {embedding: pgvector.toSql([1, 2, 3])},
  {embedding: pgvector.toSql([4, 5, 6])}
];
await sql`INSERT INTO items ${ sql(newItems, 'embedding') }`;

Get the nearest neighbors to a vector

const embedding = pgvector.toSql([1, 2, 3]);
const items = await sql`SELECT * FROM items ORDER BY embedding <-> ${ embedding } LIMIT 5`;

Add an approximate index

await sql`CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)`;
// or
await sql`CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)`;

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

Slonik

Import the library

import pgvector from 'pgvector';

Enable the extension

await pool.query(sql.unsafe`CREATE EXTENSION IF NOT EXISTS vector`);

Create a table

await pool.query(sql.unsafe`CREATE TABLE items (id serial PRIMARY KEY, embedding vector(3))`);

Insert a vector

const embedding = pgvector.toSql([1, 2, 3]);
await pool.query(sql.unsafe`INSERT INTO items (embedding) VALUES (${embedding})`);

Get the nearest neighbors to a vector

const embedding = pgvector.toSql([1, 2, 3]);
const items = await pool.query(sql.unsafe`SELECT * FROM items ORDER BY embedding <-> ${embedding} LIMIT 5`);

Add an approximate index

await pool.query(sql.unsafe`CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)`);
// or
await pool.query(sql.unsafe`CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)`);

Use vector_ip_ops for inner product and vector_cosine_ops for cosine distance

See a full example

TypeORM

Import the library

import pgvector from 'pgvector';

Enable the extension

await AppDataSource.query('CREATE EXTENSION IF NOT EXISTS vector');

Create a table

await AppDataSource.query('CREATE TABLE item (id bigserial PRIMARY KEY, embedding vector(3))');

Define an entity

import { Entity, PrimaryGeneratedColumn, Column } from 'typeorm';

@Entity()
class Item {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  embedding: string;
}

Insert a vector

const itemRepository = AppDataSource.getRepository(Item);
await itemRepository.save({embedding: pgvector.toSql([1, 2, 3])});

Get the nearest neighbors to a vector

const items = await itemRepository
  .createQueryBuilder('item')
  .orderBy('embedding <-> :embedding')
  .setParameters({embedding: pgvector.toSql([1, 2, 3])})
  .limit(5)
  .getMany();

See a full example

MikroORM

Enable the extension

await em.execute('CREATE EXTENSION IF NOT EXISTS vector');

Define an entity

import { Entity, PrimaryKey, Property } from '@mikro-orm/core';
import { VectorType } from 'pgvector/mikro-orm';

@Entity()
class Item {
  @PrimaryKey()
  id: number;

  @Property({type: VectorType})
  embedding: number[];
}

Insert a vector

const item = em.create(Item, {embedding: [1, 2, 3]});
await em.persist(item).flush(); // em.create alone does not write to the database

Get the nearest neighbors to a vector

import { l2Distance } from 'pgvector/mikro-orm';

const items = await em.createQueryBuilder(Item)
  .orderBy({[l2Distance('embedding', [1, 2, 3])]: 'ASC'})
  .limit(5)
  .getResult();

Also supports maxInnerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance
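
Unlike the numeric distances, `hammingDistance` and `jaccardDistance` apply to binary vectors (the `bit` type). As a rough illustration of what they measure, the sketch below computes both on bit strings such as `'101'` in plain JavaScript. The function names are hypothetical, and returning 1 when neither input has a set bit is a choice made for this example.

```javascript
// Reference calculations on bit strings; not part of pgvector-node.
function checkBits(a, b) {
  if (a.length !== b.length) {
    throw new RangeError('bit strings must have the same length');
  }
  if (!/^[01]*$/.test(a) || !/^[01]*$/.test(b)) {
    throw new SyntaxError('bit strings may only contain 0 and 1');
  }
}

// Hamming distance: number of positions where the bits differ
function hamming(a, b) {
  checkBits(a, b);
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) count++;
  }
  return count;
}

// Jaccard distance: 1 - |intersection| / |union| of the set bits
function jaccard(a, b) {
  checkBits(a, b);
  let intersection = 0;
  let union = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] === '1' && b[i] === '1') intersection++;
    if (a[i] === '1' || b[i] === '1') union++;
  }
  // With no set bits at all the ratio is undefined; this sketch returns 1
  return union === 0 ? 1 : 1 - intersection / union;
}

console.log(hamming('101', '100')); // 1
console.log(jaccard('110', '011'));
```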

See a full example

Drizzle ORM

Drizzle ORM 0.31.0+ has built-in support for pgvector

Enable the extension

await client`CREATE EXTENSION IF NOT EXISTS vector`;

Add a vector field

import { pgTable, serial, vector } from 'drizzle-orm/pg-core';

const items = pgTable('items', {
  id: serial('id').primaryKey(),
  embedding: vector('embedding', {dimensions: 3})
});

Also supports halfvec, bit, and sparsevec
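
Of these types, `sparsevec` has the least obvious text form: pgvector writes it as `{index:value,...}/dimensions`, with indices starting at 1 and zero elements omitted. The sketch below converts a dense array into that text form. `toSparseVectorLiteral` is a hypothetical helper for illustration only and is not a Drizzle ORM or pgvector-node API.

```javascript
// Hypothetical helper: builds the sparsevec text form from a dense array,
// e.g. [1, 0, 2, 0, 3] -> '{1:1,3:2,5:3}/5'.
function toSparseVectorLiteral(dense) {
  if (!Array.isArray(dense)) {
    throw new TypeError('expected an array of numbers');
  }
  const entries = [];
  dense.forEach((value, i) => {
    if (!Number.isFinite(value)) {
      throw new RangeError(`elements must be finite numbers, got ${value}`);
    }
    // Only non-zero elements are stored; indices are 1-based
    if (value !== 0) entries.push(`${i + 1}:${value}`);
  });
  return `{${entries.join(',')}}/${dense.length}`;
}

console.log(toSparseVectorLiteral([1, 0, 2, 0, 3])); // {1:1,3:2,5:3}/5
```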

Insert vectors

const newItems = [
  {embedding: [1, 2, 3]},
  {embedding: [4, 5, 6]}
];
await db.insert(items).values(newItems);

Get the nearest neighbors to a vector

import { l2Distance } from 'drizzle-orm';

const allItems = await db.select()
  .from(items)
  .orderBy(l2Distance(items.embedding, [1, 2, 3]))
  .limit(5);

Also supports innerProduct, cosineDistance, l1Distance, hammingDistance, and jaccardDistance

See a full example

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/pgvector/pgvector-node.git
cd pgvector-node
npm install
createdb pgvector_node_test
npx prisma migrate dev
npm test

To run an example:

cd examples/loading
npm install
createdb pgvector_example
node example.js