redis / node-redis

Redis Node.js client
https://redis.js.org/
MIT License

Struggling to invoke KNN vector similarity search #2351

Status: Closed (curran closed this 1 year ago)

curran commented 1 year ago

It would be great to have a small working example for vector similarity search similar to the search-hashes example.

I'm struggling to figure out how to encode floating point arrays in the correct way to store in Redis, and also how to represent the vector to search by in the query.

Here's some wreckage showing various things I attempted:

import fs from 'fs';
import redis from 'redis';
// This is an array of 512 floating point numbers
import { embedding } from './sampleEmbedding.js';
const { createClient, SchemaFieldTypes, commandOptions, print } = redis;

// See
// https://redis.js.org/
const client = createClient();
client.on('error', (err) => console.log('Redis Client Error', err));
await client.connect();

const idx = 'idx:vector';

// See
// https://redis.io/docs/stack/search/reference/vectors/
// From https://github.com/redis/node-redis/blob/master/examples/search-hashes.js#L10
try {
  await client.sendCommand(['FT.DROPINDEX', idx]);
  await client.sendCommand([
    'FT.CREATE',
    idx,
    'SCHEMA',
    'vector',
    'VECTOR',
    'HNSW',
    '6',
    'TYPE',
    'FLOAT32',
    'DIM',
    '512',
    'DISTANCE_METRIC',
    'COSINE',
  ]);
} catch (e) {
  if (e.message === 'Index already exists') {
    console.log('Index exists already, skipped creation.');
  } else {
    // Something went wrong, perhaps RediSearch isn't installed...
    console.error(e);
    process.exit(1);
  }
}
const id = '84790q8758493d';
await client.hSet(
  id,
  'vector',
  Buffer.from(new Float32Array(embedding).buffer) // No idea if this is correct or not
);
const result = await client.hGetAll(id);

// It outputs a crazy looking string with strange characters. Expected I think?
console.log(result.vector);

// From
// https://stackoverflow.com/questions/40031688/javascript-arraybuffer-to-hex
function bufferToHex(buffer) {
  return [...new Uint8Array(buffer)]
    .map((b) => '\\x' + b.toString(16).padStart(2, '0'))
    .join('');
}

// Trying to construct the BLOB string
// that Redis expects in its commands (no idea if this is correct)
const blob = (array) => bufferToHex(new Float32Array(array).buffer);

const results = await client.sendCommand([
  'FT.SEARCH',
  idx,
  '*=>[KNN 10 @vec $BLOB]',
  'PARAMS',
  '2',
  'BLOB',
  blob(embedding), // I feel like this is wrong - there must be a better way?
  'DIALECT',
  '2',
]);

// Prints [ 0 ], disappointingly. Should contain a single document.
console.log(results);

await client.disconnect();

Any guidance would be greatly appreciated. Thanks in advance!
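
On the encoding question, here is a minimal sketch. It assumes RediSearch reads a FLOAT32 vector field as the raw bytes of the floats (4 bytes per value, in the machine's byte order, which is little-endian on most hardware). The helper names `float32Buffer` and `float32FromBuffer` are made up for illustration and are not part of node-redis.

```javascript
// Hypothetical helper: pack numbers into the raw bytes of a Float32Array
// (4 bytes per value, in the platform's byte order).
function float32Buffer(values) {
  const floats = new Float32Array(values);
  return Buffer.from(floats.buffer, floats.byteOffset, floats.byteLength);
}

// Hypothetical helper: decode such a buffer back into plain numbers.
function float32FromBuffer(buf) {
  // Copy the bytes first so the Float32Array view starts at an aligned offset.
  const bytes = new Uint8Array(buf);
  return Array.from(new Float32Array(bytes.buffer));
}

const encoded = float32Buffer([1, 0.5, -2]);
console.log(encoded.length); // 12, i.e. 3 values * 4 bytes
console.log(float32FromBuffer(encoded)); // [ 1, 0.5, -2 ]
```

The `hSet` call in the snippet above already passes a buffer of this shape, so the storage side looks plausible. Binary data read back as a string will print as odd characters. If I recall correctly, node-redis v4 can return raw buffers via `commandOptions({ returnBuffers: true })`.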

curran commented 1 year ago

I've taken a stab at the example, but I'm at a loss for how to get it to work from here: https://github.com/redis/node-redis/pull/2352/files

curran commented 1 year ago

The Python tests for vector similarity search may be useful in constructing a working JS example: https://github.com/RediSearch/RediSearch/blob/06e36d48946ea08bd0d8b76394a4e82eeb919d78/tests/pytests/test_vecsim.py

curran commented 1 year ago

Hey, I got it to work! See https://github.com/redis/node-redis/issues/2351

Also, this is not a "bug", it's a documentation request. I'd remove the "bug" label if I could, but it looks like I can't.
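
For anyone landing here, two things in the original snippet look likely to explain the empty result. The query references `@vec` while the indexed field is named `vector`, and the query vector is passed as a `\x..` hex-escaped string rather than raw bytes. Below is a rough sketch of building the search arguments. It assumes node-redis v4's `client.ft.search(index, query, options)` accepts `PARAMS`, `SORTBY` and `DIALECT` options and Buffer-valued params. `buildKnnSearch` and the `score` alias are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: build the query string and options for a KNN search.
// Assumes the vector field was indexed as FLOAT32.
function buildKnnSearch(field, k, vector) {
  const floats = new Float32Array(vector);
  return {
    // The field after "@" must match the field name used in the index schema.
    query: `*=>[KNN ${k} @${field} $BLOB AS score]`,
    options: {
      // Raw float bytes, not a hex-escaped string.
      PARAMS: {
        BLOB: Buffer.from(floats.buffer, floats.byteOffset, floats.byteLength),
      },
      SORTBY: 'score',
      DIALECT: 2,
    },
  };
}

const { query, options } = buildKnnSearch('vector', 10, [0.1, 0.2, 0.3]);
console.log(query); // *=>[KNN 10 @vector $BLOB AS score]

// With a connected client (assumed, not run here):
// const results = await client.ft.search('idx:vector', query, options);
```

The query vector must have the same dimension as the index (512 in the snippet above), otherwise the search is expected to fail or return nothing.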

leibale commented 1 year ago

Yeah, we definitely need a "documentation issue" template (and, to be honest, the two existing templates need an update as well).

Edit: opened an issue about that: #2353

chayim commented 1 year ago

@dvora-h @sazzad16 @vladvildanov @shacharPash you all may want to take a look at this as well, for your clients.