run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Excessive content returned by the chat engine. How can I get it to not match so much? #13965

Closed: yuyu990116 closed this issue 4 days ago

yuyu990116 commented 3 months ago

Question

```python
chat_engine = index.as_chat_engine(
    chat_mode='condense_plus_context',
    use_async=True,
    system_prompt=SYSTEM_PROMPT,
)
```

The returned content is too long. How can I make it return only the top 1 or 2 matches?

logan-markewich commented 3 months ago

It's already the top 2 by default? 👀
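
For a vector index, you can cap the number of retrieved nodes by passing `similarity_top_k` through `as_chat_engine` (a minimal sketch, assuming a `VectorStoreIndex`; the kwarg should be forwarded to the underlying retriever):

```python
# Sketch: assumes `index` is a VectorStoreIndex and that as_chat_engine
# forwards retriever kwargs like similarity_top_k to index.as_retriever().
chat_engine = index.as_chat_engine(
    chat_mode='condense_plus_context',
    similarity_top_k=2,  # retrieve only the 2 best-matching nodes
    use_async=True,
    system_prompt=SYSTEM_PROMPT,
)
```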

yuyu990116 commented 3 months ago

> It's already the top 2 by default? 👀

But with `verbose=True` I saw 8 matched results (each followed by `file_path: / /`).

logan-markewich commented 3 months ago

Ah. I guess I assumed you used a vector index, but what kind of index did you create?

yuyu990116 commented 3 months ago

> Ah. I guess I assumed you used a vector index, but what kind of index did you create?

`DocumentSummaryIndex`

And this is how I used it:

```python
storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
index = load_index_from_storage(storage_context)
```

This is how I created the index:

```python
splitter = SentenceSplitter(chunk_size=1024)
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", use_async=True
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    docs,
    llm=OurLLM(),  # custom local model
    transformations=[splitter],
    response_synthesizer=response_synthesizer,
    show_progress=True,
)
doc_summary_index.storage_context.persist(persist_dir)
```

logan-markewich commented 3 months ago

@yuyu990116 the document summary index works by generating a summary of each document and using those summaries to decide which documents to send to the LLM.

This will send ALL nodes associated with the chosen documents to the LLM.

There isn't really a way to limit the number of nodes sent with this index, since it's pulling all nodes from the selected documents.
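
If you mainly want less context, one workaround (a sketch, not something I've tested here; it assumes the embedding-based retriever for `DocumentSummaryIndex`) is to cap how many documents get selected. All nodes of each chosen document are still sent, but you can restrict retrieval to a single document:

```python
# Sketch: limit how many *documents* the index selects (not nodes).
from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.core.indices.document_summary import (
    DocumentSummaryIndexEmbeddingRetriever,
)

# Select only the single best-matching document by summary embedding;
# every node of that document is then sent to the LLM.
retriever = DocumentSummaryIndexEmbeddingRetriever(
    doc_summary_index,
    similarity_top_k=1,
)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=retriever,
    llm=OurLLM(),  # the custom local model from above
    system_prompt=SYSTEM_PROMPT,
)
```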