Embeddings are generated for all the documents during index construction. At query time, only the query text is embedded, and then the query plus the relevant nodes are sent to the LLM to generate a response.
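For example, with the default `GPTVectorStoreIndex` the two steps look roughly like this (a minimal sketch against the 0.6-era API; the `"data"` path, `similarity_top_k`, and the query string are placeholders):

```python
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# Embeddings for every node/chunk are computed here, during index construction
index = GPTVectorStoreIndex.from_documents(documents)

# At query time, only the query string is embedded; the top-k most similar
# nodes are retrieved and sent to the LLM together with the query
query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("sample query question?")
```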
Hi @jerryjliu, @logan-markewich, thanks for the response.
In the llama_index documentation here, it says that for the list index, embeddings are generated during query() and not during index construction.
My goal is actually to generate the embeddings during index construction, on the assumption that it will reduce inference time at query time.
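If I read the docs right, the list index only computes embeddings lazily at query time, when the retriever runs in embedding mode, e.g. (a sketch based on my reading of the docs):

```python
# Embeddings for the list index's nodes would only be computed here, lazily,
# when the retriever runs in embedding mode (as I understand the docs)
query_engine = new_index.as_query_engine(retriever_mode="embedding")
response = query_engine.query("sample query question?")
```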
Please have a look and let me know your thoughts.
@suraj-gade ah, I missed that you were using a list index. You'd likely be more interested in using GPTVectorStoreIndex.
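Something like the following should give you construction-time embeddings, and persisting the index means they are only computed once (a sketch, reusing the `service_context` you already build for FastChat; the `./storage` path is a placeholder):

```python
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)

documents = SimpleDirectoryReader("data").load_data()

# Embeddings are generated here, during construction (service_context is the
# one from your existing FastChat setup)
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

# Persist the index so the embeddings aren't recomputed on the next run
index.storage_context.persist(persist_dir="./storage")

# Later: reload the index without re-embedding the documents
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)

query_engine = index.as_query_engine()
response = query_engine.query("sample query question?")
```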
Hi,
Below is the code I am using to run inference on a FastChat LLM.
Here, the "data" folder contains my full input text as PDFs. I am using a llama_index and langchain pipeline to build an index over it, fetch the relevant chunks, and build a prompt with that context to query the FastChat model, as shown in the code.
I want to understand when llama_index generates the embeddings for the input text from the "data" folder. Is it at index construction time, i.e. `new_index = GPTListIndex.from_documents(documents, service_context=service_context)`, where embeddings would be generated for all the nodes/chunks of the documents, or at query time, i.e. `query_engine.query("sample query question?")`, when the chunk/node whose embedding is most similar to the input prompt is fetched? Please help me understand at what point llama_index generates the embeddings.