PineconeVectorStore always returns 'Empty Response'

owencraston commented 5 months ago

What is the issue?

I am trying to use pinecone db along with llama index to create a RAG application. I was following your example closely but no matter what query I pass into my BaseQueryEngine I get the same response...

Response {
  metadata: {},
  response: 'Empty Response',
  sourceNodes: [
    Document {
      id_: '9dab9a39-c8fd-4c1c-a51a-33c5277c64fd',
      metadata: [Object],
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: {},
      hash: 'hasqQpeIQjqCSkn5snCmwKG2Ty2fM1hw4IAebcnoyO8=',
      text: '',
      embedding: [Array],
      textTemplate: '',
      metadataSeparator: '\n'
    },
    Document {
      id_: 'f76c78cd-84bb-4b81-826c-feb802171347',
      metadata: [Object],
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: {},
      hash: 'zzoar6NL5v1WZM0i05r/9iT7cR3REO+AWtGeNvxcwxk=',
      text: '',
      embedding: [Array],
      textTemplate: '',
      metadataSeparator: '\n'
    }
  ]
}

Here is the code for my next.js application.

import {
  PineconeVectorStore,
  serviceContextFromDefaults,
  SimpleNodeParser, SentenceSplitter, VectorStoreIndex,
  BaseQueryEngine,
} from "llamaindex";

async function initializeChatEngine(): Promise<BaseQueryEngine> {
  const pineconeVectorStore = new PineconeVectorStore();
  console.log("Pinecone Vector Store initialized successfully. ", await pineconeVectorStore.index());
  const serviceContext = serviceContextFromDefaults({
    nodeParser: new SimpleNodeParser({
      textSplitter: new SentenceSplitter(),
      includeMetadata: true,
      includePrevNextRel: true,
    }),
  });
  const index = await VectorStoreIndex.fromVectorStore(pineconeVectorStore, serviceContext);

  return index.asQueryEngine();
}

export async function getChatEngine(): Promise<BaseQueryEngine> {
  return initializeChatEngine();
}

this function then gets called here...

import { z } from "zod";
import { createTRPCRouter, protectedProcedure } from "~/server/api/trpc";
import { Role } from "~/types/message";
import { getChatEngine } from "~/server/ai/chat/chatEngine";

// Define the expected input for the chatbotResponse mutation
const chatbotInput = z.object({
  query: z.string(),
  history: z.array(
    z.object({
      role: z.string(),
      content: z.string(),
    }),
  ),
});

export const chatRouter = createTRPCRouter({
  chatbotResponse: protectedProcedure
    .input(chatbotInput)
    .mutation(async ({ input }) => {
      const question: string = input.query;
      const chatEngine = await getChatEngine();
      const res = await chatEngine.query({
        query: question,
      });
      console.log(res);
      return { role: Role.AI, content: res.response };
    }),
});

I have confirmed that the pineconeVectorStore is correct as the log output properly describes the index I am trying to use.

System information

"llamaindex": "0.1.11",
"next": "^13.5.6"
"@pinecone-database/pinecone": "^1.1.2"
"typescript": "^5.3.3"
node -v v21.2.0

Python side

All of the data inside the pinecone index is generated in a different codebase using Llama index python inside a jupyter lab. Whats weird is that when I run

query_engine = index.as_query_engine()
response = query_engine.query("...")

it works perfectly.

Here is my code for the python data processing side.

!pip install llama-index datasets pinecone-client openai transformers pypdf python-dotenv google

from dotenv import load_dotenv
import os
from pinecone import Pinecone
from pinecone import PodSpec
from llama_index.vector_stores import PineconeVectorStore
# Load environment variables from .env file
load_dotenv()

# find API key in console at app.pinecone.io
pinecone_api_key = os.getenv("PINECONE_API_KEY")
pinecone_env = os.getenv("PINECONE_ENVIRONMENT")
# pinecone_index = os.getenv("PINECONE_INDEX")
pinecone_index_name = "party-policies"

# Instantiate class instance with your API key
pc = Pinecone(api_key=pinecone_api_key)

# Create your index (can skip this step if your index already exists)
if pinecone_index_name not in pc.list_indexes().names():
    pc.create_index(
        name=pinecone_index_name,
        dimension=1536,
        metric="euclidean",
        spec=PodSpec(
            environment='us-east-1-aws', 
            pod_type='p1.x1'
        ),
    )
    print(f"creating index: '{pinecone_index_name}'")
else:
    print(f"'{pinecone_index_name}' already exists. Here is the index description: '{pc.describe_index(pinecone_index_name)}'")

# Initialize your index 
pinecone_index = pc.Index(pinecone_index_name)

from llama_index import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore
from IPython.display import Markdown, display

# load documents
path = "./documents/"
documents = SimpleDirectoryReader(path).load_data()
print(f"Loaded {len(documents)} docs")

if "OPENAI_API_KEY" not in os.environ:
    raise EnvironmentError(f"Environment variable OPENAI_API_KEY is not set")

vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

query_engine = index.as_query_engine()
response = query_engine.query("who has the most policies about the environment?")
display(Markdown(f"<b>{response}</b>"))

Output: The New Democratic Party of Canada has the most policies about the environment.

Any ideas as to how I can fix this?

P.S. I also tried implementing a ContextChatEngine and using pinecone as the retriever but it would give answers that indicated it did not use the pinecone index at all.

owencraston commented 5 months ago

@himself65 @marcusschiesser any ideas on why this is happening?

marcusschiesser commented 5 months ago

@thucpn can you have a look at this, when you're working on https://github.com/run-llama/LlamaIndexTS/pull/297/files ?

thucpn commented 5 months ago

Hi @owencraston, I have tried using your code in nextjs app but I cannot reproduce your issue. The chatEngine work well from my side.

Response {
  metadata: {},
  response: '1. The letter must be rectangular, with four square corners and parallel opposite sides.\n' +
    '2. The letter must not be more than 11-1/2 inches long, or more than 6-1/8 inches high, or more than 1/4-inch thick.\n' +
    '3. The letter must not weigh more than 3.5 ounces.',
  sourceNodes: [
    Document {
      id_: '6e868990-8dd3-4d8b-9797-068aaf599adc',
      metadata: [Object],
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: {},
      hash: '9PQCMV/81JPkyI/LG4jp1i4BvUN965AeAtUzEz4/7ZQ=',
      text: '......'
    },
    Document {
      id_: 'ed769176-d2f3-4fc6-b092-a4b782aa5f2a',
      metadata: [Object],
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: {},
      hash: '/MVSwhojWt0eFu6FjPqM3s0rLnrx7Op2iKeb+emJSEQ=',
      text: '......'
    }
  ]
}

My system information:

"llamaindex": "0.1.11",
"next": "^14.0.3"
"@pinecone-database/pinecone": "^1.1.3"

You can have a look at my PR here to see how that chat engine work in typescript template

owencraston commented 5 months ago

@thucpn do you think there is an issue with populating the vector db in one codebase and then trying to use it in another?

owencraston commented 5 months ago

@thucpn Thank you. I have made progress based on the example PR you gave. Here is my updated code...

import {
  PineconeVectorStore,
  serviceContextFromDefaults,
  ContextChatEngine,
  VectorStoreIndex,
} from "llamaindex";

const CHUNK_SIZE = 512;
const CHUNK_OVERLAP = 20;

let chatEngine: ContextChatEngine | undefined;

async function getDataSource() {
  const serviceContext = serviceContextFromDefaults({
    chunkSize: CHUNK_SIZE,
    chunkOverlap: CHUNK_OVERLAP,
  });
  const store = new PineconeVectorStore();
  return await VectorStoreIndex.fromVectorStore(store, serviceContext);
}

async function initializeChatEngine() {
  if (!chatEngine) {
    const index = await getDataSource();
    const retriever = index.asRetriever({similarityTopK: 5});
    chatEngine = new ContextChatEngine({
      retriever,
    });
  }
  return chatEngine;
}

export async function getChatEngine() {
  return initializeChatEngine();
}

The issue I am having now is that the response the ContextChatEngine is not relevant to the context data inside the Pinecone vector store. Its giving generic responses that chat gpt would give. Contrast this with the python code above and the answers are all relevant.

Before using pinecone I was creating the store in my next.js app and storing it in memory. This resulted in great responses but the initial load took way too long and timed out my deploys (I am in a serverless environment). This is what lead me to my current approach of populating the pinecone vector db through the jupyter lab and then referencing the database in my web app.

This is my old code.

// getIndex.ts
import {
  VectorStoreIndex,
  SimpleDirectoryReader,
  serviceContextFromDefaults,
  SentenceSplitter,
  SimpleNodeParser,
} from "llamaindex";
import path from 'path';

let vectorStoreIndex: VectorStoreIndex;

async function initializeVectorStoreIndex(): Promise<VectorStoreIndex> {
  console.log('initializeVectorStoreIndex called');
  const docPath = "src/server/documents/";
  const systemDirectoryPath = path.join(process.cwd(), docPath);
  console.log('systemDirectoryPath', systemDirectoryPath);
  // Create a SimpleDirectoryReader to read all files in the directory
  const reader = new SimpleDirectoryReader();
  const documents = await reader.loadData({directoryPath: systemDirectoryPath});

  // Create a ServiceContext with default settings
  const serviceContext = serviceContextFromDefaults({
    nodeParser: new SimpleNodeParser({
      textSplitter: new SentenceSplitter(),
      includeMetadata: true,
      includePrevNextRel: true,
    }),
  });

  // Create a VectorStoreIndex from the documents
  return VectorStoreIndex.fromDocuments(documents, {
    serviceContext,
  });
}

export async function getVectorStoreIndex(): Promise<VectorStoreIndex> {
  if (!vectorStoreIndex) {
    vectorStoreIndex = await initializeVectorStoreIndex();
  }
  return vectorStoreIndex;
}

// chatEngine.ts
import {SimpleChatHistory, ContextChatEngine} from "llamaindex";
import {getVectorStoreIndex} from "~/server/ai/index/getIndex";

const chatHistory = new SimpleChatHistory();
let chatEngine: ContextChatEngine | undefined;

async function initializeChatEngine(): Promise<ContextChatEngine> {
  if (!chatEngine) {
    const vectorStoreIndex = await getVectorStoreIndex();
    chatEngine = new ContextChatEngine({
      retriever: vectorStoreIndex?.asRetriever(),
      chatHistory: chatHistory.messages,
    });
  }
  return chatEngine;
}

export async function getChatEngine(): Promise<ContextChatEngine> {
  return initializeChatEngine();
}

marcusschiesser commented 5 months ago

@owencraston I just released a new version 0.1.12. PineconeVectorStore was updated and is using the latest dependencies from Pinecone. You might wanna have a look into that.

of populating the pinecone vector db through the jupyter lab and then referencing the database in my web app In Juptyer lab, you're using Python, I assume? Currently, we don't guarantee the exchangeability of data created with Python and TS. Better try ingesting the data using TS too.

owencraston commented 4 months ago

@marcusschiesser unfortunately the update did not solve the issue.

owencraston commented 4 months ago

@marcusschiesser @thucpn something is definitely up. I re populated my pinecone index using typescript. Here is the code for that...

async function initializeVectorStoreIndex(): Promise<VectorStoreIndex> {
  try {
    console.log('initializeVectorStoreIndex called');
    const docPath = "src/server/documents/";
    const systemDirectoryPath = path.join(process.cwd(), docPath);
    console.log('systemDirectoryPath', systemDirectoryPath);
    // Create a SimpleDirectoryReader to read all files in the directory
    const reader = new SimpleDirectoryReader();
    const documents = await reader.loadData({directoryPath: systemDirectoryPath});

    const pcvs = new PineconeVectorStore();
    const ctx = await storageContextFromDefaults({ vectorStore: pcvs });

    console.debug("  - creating vector store");
    const index = await VectorStoreIndex.fromDocuments(documents, {
      storageContext: ctx,
    });
    return index;
  } catch(e) {
    console.log('initializeVectorStoreIndex failed with error:', e)
    throw e;
  }
}

^ I based this code off of your example.

Despite recreating the index in typescript, the answers to my questions are not using the Pinecone vector store as context at all. If I revert back to using the BaseQueryEngine the response is once again Empty response.

Questions

do you think this could be related to the actual source data I am using? I am using unstructured PDFs without any pre processing as my documents.
Do you guys have a working example that is using pinecone with custom embeddings? I so I can study that to see where the issue lies.
Perhaps pinecone is the issue. Everything worked perfecty when I was using the simple VectorStoreIndex. If so are there other vector databases that you know are working well?
If it is helpful I can try and create a test environment to help reproduce the issue?

david1542 commented 4 months ago

I'm also experiencing the exact same problem. I populated pinecone in a Python code and I'm trying to query pinecone from a TypeScript code. I'm also experiencing "Empty Response" with 2 empty documents.

Here is the code that populates the Pinecone DB:

pc = Pinecone(api_key=pinecone_api_key)
pc_index = pc.Index(name=organization_id)
vector_store = PineconeVectorStore(pinecone_index=pc_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
embed_model = OpenAIEmbedding(
    api_key=openai_api_key,
    model="text-embedding-3-large",
    dimensions=768,
)

VectorStoreIndex.from_documents(
    total_documents, # documents are retrieved using llama-hub loaders
    vector_store=pc_index,
    show_progress=True,
    embed_model=embed_model,
    storage_context=storage_context,
)

And this is the code that tries to query the DB:

const embedModel = new OpenAIEmbedding({
  apiKey: process.env.OPENAI_API_KEY as string,
  model: "text-embedding-3-large",
  dimensions: 768,
});
const pcvs = new PineconeVectorStore({
  indexName: "my-index",
});
const ctx = serviceContextFromDefaults({ embedModel });
const index = await VectorStoreIndex.fromVectorStore(pcvs, ctx);
const queryEngine = await index.asQueryEngine();
const answer = await queryEngine.query({ query: "User could not pay" });
console.log(answer.response); // Empty response;

marcusschiesser commented 4 months ago

@david1542 Currently, it's not supported to interchange data between Python and TS. We will deal with that in #564.

@owencraston Seems like you also had problems with only using TS. Sorry for the trouble, can you please test just using the retriever? Something like this:

const index = initializeVectorStoreIndex();
const retriever = index.asRetriever({ similarityTopK: 3 });
const results = await retriever.retrieve(
    query
);

Does the retrieve function return related nodes to your query?

david1542 commented 4 months ago

@marcusschiesser Thanks for the quick reply. Do you know why it is a problem? Querying with the plain Pinecone node.js library works correctly.

owencraston commented 4 months ago

@marcusschiesser following the code snippet you provided, it was able to retrieve the top 3 nodes.

[
  {
    node: TextNode {
      id_: '63630185-6597-4284-aa51-19d43c799b02',
      metadata: {},
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: [Object],
      hash: 'BGdSSSo++TU0xuctgSEXM9d4UaAoEJ1XXKlmxLRgBKg=',
      text: '',
      textTemplate: '',
      metadataSeparator: '\n',
      type: 'TEXT'
    },
    score: 0.777653277
  },
  {
    node: TextNode {
      id_: '09ee5cd8-2e8b-4d00-8c1a-a5e9353eeaf7',
      metadata: {},
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: [Object],
      hash: 'BGdSSSo++TU0xuctgSEXM9d4UaAoEJ1XXKlmxLRgBKg=',
      text: '',
      textTemplate: '',
      metadataSeparator: '\n',
      type: 'TEXT'
    },
    score: 0.773671627
  },
  {
    node: TextNode {
      id_: '9b1df84e-c804-4f39-b559-eda02c7fc2ea',
      metadata: {},
      excludedEmbedMetadataKeys: [],
      excludedLlmMetadataKeys: [],
      relationships: [Object],
      hash: 'BGdSSSo++TU0xuctgSEXM9d4UaAoEJ1XXKlmxLRgBKg=',
      text: '',
      textTemplate: '',
      metadataSeparator: '\n',
      type: 'TEXT'
    },
    score: 0.770207644
  }
]

marcusschiesser commented 4 months ago

@owencraston I see that the text of these nodes is empty, which might lead to your error. How did you load your data? Using the initializeVectorStoreIndex function mentioned above (i.e. using Typescript)? Did you use a clean pinecone index?

owencraston commented 4 months ago

yes, I used the above code. I had a few source PDFs which as I mentioned were not well structured. The index was created from scratch each time in testing.

marcusschiesser commented 4 months ago

This issue is fixed in the new 0.1.19 release. Please have a try and reopen if necessary

run-llama / LlamaIndexTS