unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0

Bug: Error in native callback #357

Open Candoyeya opened 4 months ago

Candoyeya commented 4 months ago

Describe the bug

I am working on a Generative AI project with the following stack:

NodeJS API, TypeScript, @langchain/community, HuggingFaceTransformersEmbeddings, OpenAIEmbeddings

Recently LangChain made an integration to be able to use USearch as a VectorStore.

So I decided to integrate it by following the example in the LangChain docs:

https://js.langchain.com/docs/integrations/vectorstores/usearch

During the first tests I got the following error:

Error: `dimensions`, `connectivity`, `expansion_add`, and `expansion_search` must be non-negative integers, with `dimensions` being positive.
     at new Index (/private/var/www/html/generative-ai/node_modules/usearch/javascript/usearch.js:162:19)
     at USearch.addVectors (/private/var/www/html/generative-ai/node_modules/@langchain/community/dist/vectorstores/usearch.cjs:111:27)
     at USearch.addDocuments (/private/var/www/html/generative-ai/node_modules/@langchain/community/dist/vectorstores/usearch.cjs:84:21)
     at processTicksAndRejections (node:internal/process/task_queues:95:5)

I therefore had to patch the usearch.js file in the javascript folder so that it accepts values of type BigInt, since the validation on line 161 rejected them.

This is the patch I applied:

if ((typeof dimensionsOrConfigs.dimensions !== 'bigint' && (!Number.isInteger(dimensionsOrConfigs.dimensions) || dimensionsOrConfigs.dimensions <= 0)) ||
     (typeof dimensionsOrConfigs.connectivity !== 'bigint' && (!Number.isInteger(dimensionsOrConfigs.connectivity) || dimensionsOrConfigs.connectivity < 0)) ||
     (typeof expansion_add !== 'bigint' && (!Number.isInteger(expansion_add) || expansion_add < 0)) ||
     (typeof expansion_search !== 'bigint' && (!Number.isInteger(expansion_search) || expansion_search < 0))) {
     throw new Error("`dimensions`, `connectivity`, `expansion_add`, and `expansion_search` must be non-negative integers, with `dimensions` being positive.");
}
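For anyone wondering why the original check fails: `Number.isInteger` returns `false` for every BigInt, so a value like `16n` is rejected even though it is a valid integer. The patch above accepts any BigInt without checking its sign. A minimal sketch of a stricter check follows; the helper names `isNonNegativeInteger` and `isPositiveInteger` are hypothetical and are not part of usearch.

```javascript
// Hypothetical helpers for illustration; not part of the usearch package.

// Accepts plain integers and BigInts, and applies the sign check to both.
function isNonNegativeInteger(value) {
  if (typeof value === "bigint") return value >= 0n;
  return Number.isInteger(value) && value >= 0;
}

// `dimensions` must additionally be strictly positive.
// Comparing a BigInt with a Number via `>` is allowed in JavaScript.
function isPositiveInteger(value) {
  return isNonNegativeInteger(value) && value > 0;
}

console.log(Number.isInteger(16n));      // false
console.log(isNonNegativeInteger(16n));  // true
console.log(isNonNegativeInteger(-1n));  // false
console.log(isPositiveInteger(0));       // false
console.log(isPositiveInteger(1536));    // true
```

With helpers like these, the four conditions in the patch could each be reduced to a single call, with `isPositiveInteger` used for `dimensions`.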

With the index values now passing validation, I get a new error, raised on line 177:

this._compiledIndex = new compiled.CompiledIndex(dimensions, metric, quantization, connectivity, expansion_add, expansion_search, multi);

The error is the following:

Error: Error in native callback
     at new Index (/private/var/www/html/generative-ai/node_modules/usearch/javascript/usearch.js:188:31)
     at USearch.addVectors (/private/var/www/html/generative-ai/node_modules/@langchain/community/dist/vectorstores/usearch.cjs:120:27)
     at USearch.addDocuments (/private/var/www/html/generative-ai/node_modules/@langchain/community/dist/vectorstores/usearch.cjs:85:21)
     at processTicksAndRejections (node:internal/process/task_queues:95:5)
     at async Function.fromDocuments (/private/var/www/html/generative-ai/node_modules/@langchain/community/dist/vectorstores/usearch.cjs:224:9)
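The thread does not confirm what causes this second error. One thing that may be worth trying, on the unverified assumption that the native constructor expects plain numbers rather than BigInts, is to normalize the configuration values before they reach it. The sketch below is only an illustration: `toSafeNumber` is a hypothetical helper, and the sample values `1536n` and `16n` are made up.

```javascript
// Hypothetical helper for illustration; not part of usearch or LangChain.
// Converts a BigInt to a Number when it fits safely, and validates plain numbers.
function toSafeNumber(value, name) {
  if (typeof value === "bigint") {
    if (value < 0n || value > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError(`${name} is out of range: ${value}`);
    }
    return Number(value);
  }
  if (!Number.isInteger(value) || value < 0) {
    throw new TypeError(`${name} must be a non-negative integer`);
  }
  return value;
}

// Sample configuration with BigInt values (assumed for the example).
const config = { dimensions: 1536n, connectivity: 16n };
const normalized = {
  dimensions: toSafeNumber(config.dimensions, "dimensions"),
  connectivity: toSafeNumber(config.connectivity, "connectivity"),
};
console.log(normalized); // { dimensions: 1536, connectivity: 16 }
```

If the error persists with plain numbers, the cause lies elsewhere and this normalization can be dropped.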

I hope this serves as a guide for anyone else struggling with this integration.

Steps to reproduce

In a new NodeJS environment with TypeScript.

Implement the following example

yarn add usearch
yarn add @langchain/openai @langchain/community

Create a new index from a loader

import { USearch } from "@langchain/community/vectorstores/usearch";
import { OpenAIEmbeddings } from "@langchain/openai";
import { TextLoader } from "langchain/document_loaders/fs/text";

// Create docs with a loader
const loader = new TextLoader("src/document_loaders/example_data/example.txt");
const docs = await loader.load();

// Load the docs into the vector store
const vectorStore = await USearch.fromDocuments(docs, new OpenAIEmbeddings());

// Search for the most similar document
const resultOne = await vectorStore.similaritySearch("hello world", 1);
console.log(resultOne);

Expected behavior

Allow the creation of an index

USearch version

2.9.0

Operating System

Mac OS 14.1.1 (23B81)

Hardware architecture

x86

Which interface are you using?

Other bindings

Contact Details

telematik_4@hotmail.com

ashvardanian commented 4 months ago

Thank you, @Candoyeya! I am surprised our tests don't cover that. Can you please open a PR for the first patch and I will take it from there 🤗

Candoyeya commented 4 months ago

Sure, I'll work on it 👍

Candoyeya commented 4 months ago

New PR -> https://github.com/unum-cloud/usearch/pull/359

ashvardanian commented 3 months ago

@Candoyeya, can you please help me with this:

import { TextLoader } from "langchain/document_loaders/fs/text";

Where do I import it from?


I've also had trouble with the yarn installation, so I assembled a demo repo at ashvardanian/usearch-langchain that uses only npm and overrides the USearch version with the most recent one.