Closed mayooear closed 1 year ago
also getting this, looking into it
actually i lied - i had a different error. the errors are pretty opaque so it took me a while to debug, but i never ran into this issue. could you try with a dummy docs value like [new Document({ pageContent: "foo" })]?
Hmm, my "docs" are derived from textSplitter.createDocuments([text]), so in the console.log they are already in the required format shown above.
Upon further investigation, the error partially matches the upsertRaw function in Pinecone's TypeScript client, shown below:
async upsertRaw(requestParameters: UpsertOperationRequest, initOverrides?: RequestInit | runtime.InitOverrideFunction): Promise<runtime.ApiResponse<UpsertResponse>> {
  if (requestParameters.upsertRequest === null || requestParameters.upsertRequest === undefined) {
    throw new runtime.RequiredError('upsertRequest','Required parameter requestParameters.upsertRequest was null or undefined when calling upsert.');
  }
  // ... (rest of the function not shown)
It appears that langchain failed to pass requestParameters, which consists of the vectors and namespace, to the function.
Debugging further, I noticed there may be an error when the uuid is generated in the addVectors function in langchain's pinecone.ts:
async addVectors(
  vectors: number[][],
  documents: Document[],
  ids?: string[]
): Promise<void> {
  const documentIds = ids == null ? documents.map(() => uuidv4()) : ids;
  await this.pineconeClient.upsert({
    upsertRequest: {
      vectors: vectors.map((values, idx) => ({
        id: documentIds[idx],
        metadata: {
          ...documents[idx].metadata,
          [this.textKey]: documents[idx].pageContent,
        },
        values,
      })),
      namespace: this.namespace,
    },
  });
}
//error
Exception has occurred: TypeError: Cannot assign to read only property 'name' of function 'function generateUUID(value, namespace, buf, offset) {
var _namespace;
I tested index.upsert separately without langchain and it works.
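For anyone hitting the same opaque error, a minimal sketch of building and validating the upsert payload before it reaches the client may help narrow things down. It mirrors the shape of the addVectors snippet above. `buildUpsertRequest`, `Doc`, and the explicit `ids` argument are hypothetical names used for illustration only; they are not part of langchain or the Pinecone client.

```typescript
// Hypothetical stand-in for langchain's Document type.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

interface UpsertVector {
  id: string;
  values: number[];
  metadata: Record<string, unknown>;
}

interface UpsertRequest {
  vectors: UpsertVector[];
  namespace?: string;
}

// Builds the payload in the same shape as the addVectors snippet above,
// failing early with a descriptive message when the inputs do not line up.
function buildUpsertRequest(
  vectors: number[][],
  documents: Doc[],
  ids: string[],
  textKey: string,
  namespace?: string
): UpsertRequest {
  if (vectors.length === 0) {
    throw new Error("No vectors to upsert");
  }
  if (vectors.length !== documents.length || vectors.length !== ids.length) {
    throw new Error(
      `Length mismatch: ${vectors.length} vectors, ${documents.length} documents, ${ids.length} ids`
    );
  }
  return {
    vectors: vectors.map((values, idx) => ({
      id: ids[idx],
      values,
      metadata: {
        ...documents[idx].metadata,
        [textKey]: documents[idx].pageContent,
      },
    })),
    namespace,
  };
}
```

Logging the returned object just before calling upsert would show whether the request is actually empty or whether the failure happens later in the client.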
What errors do you see on your end?
@mayooear could you confirm which version of @pinecone-database/pinecone you are using in your project? We only support the most recent version, 0.0.9, released 2 days ago.
I have added some tests in #166 but couldn't reproduce your issue. Do you want to have a look at the tests and let me know if you can spot what you're doing differently?
Yes, I'm on 0.0.9 for pinecone and langchain 0.0.11. I also got a similar error trying to run a VectorDBQA call.
Looking at your tests, the syntax of operations is different from the langchainjs docs:
const pinecone = new PineconeClient();
await pinecone.init({
  environment: "us-west1-gcp",
  apiKey: "apiKey",
});
const index = pinecone.Index("my-index");
// this is the cause of the error //
const vectorStore = await PineconeStore.fromDocuments(
  index,
  docs,
  new OpenAIEmbeddings()
);
Whereas your tests:
Ah, I think the issue is the version of langchain. Version 0.0.11 was using a different library for the pinecone client, and we've recently changed to use the official pinecone client. If you update langchain the issue should go away. If not, let me know.
(FYI the syntax in the tests is equivalent to yours)
Yeh, I upgraded and it crashed my test app. I spent a couple of hours debugging, here's what I've found so far:
esm, to failure to recognize .ts extensions and so on. Perhaps you can advise on the appropriate tsconfig settings, but I tried to use the exact ones used in the repo's examples section.
The most recent version is now ESM only. In order to work with it your project needs to have "type": "module" in its package.json.
I'd recommend also changing your tsconfig to have
"target": "ES2020",
"module": "nodenext",
like here https://github.com/hwchase17/langchainjs/blob/main/examples/tsconfig.json
Other than that, if you're using Node 18 or 19 it should work without additional changes. If you're using Node 16 check the instructions here https://hwchase17.github.io/langchainjs/docs/getting-started/#installation
Let me know if that works
Thanks. I rebuilt the repo from scratch using your specs and using the latest version of langchain (with pinecone 0.0.8). I installed all packages using yarn.
await PineconeStore.fromDocuments works as expected now.
I attempted the VectorDBQA chain method, which failed. As shown below, it threw an error regarding res.metadata when the similarity search function ran:
error:
error TypeError: Cannot destructure 'res.metadata' as it is undefined.
at PineconeStore.similaritySearchVectorWithScore
code:
const model = new OpenAI({});
/* Initialize Pinecone client */
const pinecone = new PineconeClient();
// initialize the vectorstore to store embeddings
await pinecone.init({
  environment: `${process.env.PINECONE_ENVIRONMENT}`,
  apiKey: `${process.env.PINECONE_API_KEY}`,
});
// retrieve API operations for index created in pinecone dashboard
const index = pinecone.Index("index");
console.log("index", index);
try {
  /* Create the vectorstore */
  const vectorStore = await PineconeStore.fromExistingIndex(
    index,
    new OpenAIEmbeddings(),
    "text",
    "test"
  );
  console.log("vectorstore", vectorStore);
  // error is thrown here
  const resultOne = await vectorStore.similaritySearch("president", 3);
  console.log("resultsOne", resultOne);
} catch (error) {
  console.log("error", error);
}
I have tested separately that pinecone's query function works as expected and returns the metadata text. However, the error is thrown once I try to use the similaritySearch function, which abstracts index.query.
Whilst debugging, I found the value of result within the similaritySearch function. This explains the undefined metadata error, but the cause is unknown for now.
code:
[
  {
    id: "id",
    score: 0,
    values: [],
    metadata: undefined,
  },
]
Regardless, this shows that Pinecone can return matches with undefined metadata which breaks the function.
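As a rough illustration of what handling such a match could look like, here is a minimal sketch that skips matches whose metadata is undefined instead of destructuring them blindly. The `Match` and `ScoredDoc` types and the `matchesToDocs` helper are hypothetical, modeled only on the debug output above; this is not langchain's actual implementation.

```typescript
// Hypothetical shape of a single match, based on the debug output above.
interface Match {
  id: string;
  score?: number;
  values?: number[];
  metadata?: Record<string, unknown>;
}

// Hypothetical result type: a document plus its similarity score.
interface ScoredDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
  score: number;
}

// Converts matches to documents, skipping any match without metadata.
function matchesToDocs(
  matches: Match[] | undefined,
  textKey: string
): ScoredDoc[] {
  const docs: ScoredDoc[] = [];
  for (const match of matches ?? []) {
    if (match.metadata === undefined) {
      continue; // nothing to build a document from
    }
    // Separate the stored text from the rest of the metadata.
    const { [textKey]: text, ...metadata } = match.metadata;
    docs.push({
      pageContent: typeof text === "string" ? text : "",
      metadata,
      score: match.score ?? 0,
    });
  }
  return docs;
}
```

Silently skipping is only one option. Throwing an error that names the offending match id would make the underlying problem easier to spot.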
@mayooear ah interesting, thanks a lot for debugging! At the very least, we should definitely be a bit more defensive when handling the response from pinecone.
I found and fixed the problem. The namespace property is missing in the pineconeClient.query function. Pinecone vectors have namespaces that return the metadata.
Shall I go ahead and make a pull request for "defensive handling" and fixing this bug?
@mayooear yes thank you!
@nfcampos upon further investigation and many tests, I discovered there may be a core issue with the pinecone api docs and client types.
Essentially, when a new vector is created without a namespace it doesn't seem possible to query or fetch it. Their api docs say "The Query operation searches a namespace, using a query vector. It retrieves the ids of the most similar items in a namespace, along with their similarity scores."
And yet, the namespace field is optional. It appears that it should be required.
But this also means that the user would have to create namespaces for new vectors.
I can make another pull request to make namespaces required via Pinecone.ts, but I just wanted to get your thoughts/feedback first.
@mayooear thanks for looking into this more. From reading the pinecone docs I think namespace is optional, see "When you don't specify a namespace name for an operation, Pinecone uses the default namespace name of "" (the empty string)." in https://docs.pinecone.io/docs/namespaces. I'm going to merge your PR now
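To make the default-namespace behaviour concrete, here is a minimal sketch of a query payload builder that treats a missing namespace as the empty string, matching the docs quoted above. `buildQueryRequest` and `QueryRequest` are hypothetical names, and the field names are assumed to mirror Pinecone's query request; neither is taken from langchain.

```typescript
// Hypothetical request shape; field names are assumed to mirror
// Pinecone's query request.
interface QueryRequest {
  vector: number[];
  topK: number;
  includeMetadata: boolean;
  namespace: string;
}

// Builds a query payload, falling back to the default namespace ""
// when the caller does not specify one.
function buildQueryRequest(
  vector: number[],
  topK: number,
  namespace?: string
): QueryRequest {
  return {
    vector,
    topK,
    includeMetadata: true,
    namespace: namespace ?? "",
  };
}
```

The point of the sketch is that the same namespace value has to be used for both upsert and query. Vectors written to "test" would not be found by a query against "".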
Thanks for clarifying!
As per the docs and latest Pinecone library, the code below should work. However, the function PineconeStore.fromDocuments throws the error below. It appears there is an issue passing the vectors to Pinecone.
code:
Error log: error PineconeClient: Error calling upsert: PineconeClient: Error calling upsertRaw: RequiredError: Required parameter requestParameters.upsertRequest was null or undefined when calling upsert.