Closed: qiisziilbash closed this issue 6 months ago
Hey there @qiisziilbash! Fancy seeing you here again. 🚀 Looks like you've stumbled upon another quirky challenge in the LlamaIndex universe. Hold tight, I'm diving into the abyss of code to fetch some answers for you. Will circle back shortly!
@qiisziilbash this module is called simple summarize because it's very simple :)
If you need to summarize across a variable number of nodes, I recommend using tree_summarize.
After reviewing the details of the bug you've reported and considering the feedback from the maintainer, it seems the issue you're encountering with the truncate method is indeed a limitation of the simple_summarize response mode when handling a large number of nodes. The behavior you're experiencing, where information is dramatically dropped or the call even crashes due to a negative chunk size, is a consequence of the simplistic approach taken by this summarization method.
The maintainer, Logan Markewich, suggests using the tree_summarize response mode as an alternative. This mode is designed to handle summarization across a variable number of nodes more effectively. It offers a more sophisticated approach to managing large numbers of retrieved nodes, potentially avoiding the issues you're facing with simple_summarize.
Given this, it might be beneficial to explore tree_summarize for your use case. This involves adjusting your query engine configuration to use tree_summarize instead of simple_summarize. Here's a rough idea of how you might adjust your code:
# get_response_synthesizer lives under llama_index.core in 0.10+ releases
from llama_index.response_synthesizers import get_response_synthesizer

query_engine = index.as_query_engine(
    similarity_top_k=similarity_top_k,
    response_synthesizer=get_response_synthesizer(
        response_mode="tree_summarize",
    ),
)
This change should help in managing large numbers of nodes more gracefully, potentially avoiding the truncation issue you've encountered. If you have further questions or need assistance with implementing this solution, feel free to reach out. Dosu is here to help!
Totally understand; wouldn't it be simpler to just not truncate the chunks list though :D ? And not produce absurd results :)
@qiisziilbash probably these two steps should just be swapped in the code
truncated_chunks = self._service_context.prompt_helper.truncate(
    prompt=text_qa_template,
    text_chunks=text_chunks,
)
node_text = "\n".join(truncated_chunks)
Merge first, then truncate
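A rough sketch of that swap, written against the snippet above (an illustration of the suggestion, not the actual patch):
# Join the retrieved chunks into one combined text first, then truncate that
# single chunk, instead of truncating every chunk individually.
node_text = "\n".join(text_chunks)
truncated_chunks = self._service_context.prompt_helper.truncate(
    prompt=text_qa_template,
    text_chunks=[node_text],
)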
agreed
I agree that it makes more sense to merge first and then truncate, but the merged solution introduced another problem: truncated_chunks now contains a list with a single string in it instead of just the string, so when you pass that on to the LLM class, the context_str gets formatted like ['{truncated_str}'].
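A minimal sketch of the unwrapping being asked for here (assuming the synthesizer ultimately hands context_str to the LLM via a predict call; the exact call site may differ):
# prompt_helper.truncate() returns a list, so pull the single string back out
# before formatting the prompt; otherwise context_str renders as "['...']".
node_text = "\n".join(truncated_chunks)   # or truncated_chunks[0]
response = self._service_context.llm.predict(
    text_qa_template,
    context_str=node_text,
)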
Bug Description
When using a query engine whose response_synthesizer uses the simple_summarize response mode with too many nodes, the truncate method truncates each node to so few tokens that essentially no information is preserved (and if the number of nodes is big enough, it crashes due to a negative chunk size). A better behavior would be to reduce the number of nodes so that the remaining chunks fit into the context window, or a combination of this and the existing behavior.
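A rough illustration of the arithmetic (hypothetical numbers, not the library's exact accounting): the available context is divided across every retrieved chunk, so the per-chunk budget collapses as the node count grows.
# Hypothetical figures to show how the per-chunk budget shrinks toward zero.
context_window = 4096
prompt_and_answer_budget = 1500          # template, query, and reserved output tokens
available = context_window - prompt_and_answer_budget

for num_nodes in (5, 50, 500):
    per_chunk = available // num_nodes   # each node gets an equal slice
    print(num_nodes, per_chunk)          # 5 -> 519, 50 -> 51, 500 -> 5

# With per-chunk padding subtracted as well, the slice can become negative,
# which is the crash described above.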
Version
latest
Steps to Reproduce
Make a query engine over an index with sufficiently large chunks (using the simple_summarize response mode) and ask a question; a rough reproduction sketch follows.
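A minimal reproduction sketch, assuming 0.9.x-style imports (the import paths, the ./data directory, and the chunk/top-k values are placeholders and may need adjusting for your setup):
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# Large chunks plus a high top-k force many big nodes into a single
# simple_summarize call, which then truncates each node aggressively.
service_context = ServiceContext.from_defaults(chunk_size=2048)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine(
    similarity_top_k=20,
    response_mode="simple_summarize",
)
print(query_engine.query("What is this corpus about?"))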
Relevant Logs/Tracebacks
No response