Might be useful to dig into the different kinds of response modes that llama-index has: https://gpt-index.readthedocs.io/en/stable/core_modules/query_modules/response_synthesizers/usage_pattern.html
Maybe "simple_summarize" is something to explore which truncates all text chunks to fit into a single LLM prompt. I think in this case, we can make set chunk_overlap_ratio=0
as we're going to fit it all in a single call.
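A minimal sketch of what that might look like, assuming the legacy `ServiceContext` / `PromptHelper` API (the `"data"` directory and the query string are placeholders):

```python
from llama_index import (
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
)

# No chunk overlap needed, since simple_summarize truncates everything
# into a single LLM call anyway
prompt_helper = PromptHelper(chunk_overlap_ratio=0.0)
service_context = ServiceContext.from_defaults(prompt_helper=prompt_helper)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# simple_summarize fits all retrieved text into one prompt, so no refine step
query_engine = index.as_query_engine(response_mode="simple_summarize")
response = query_engine.query("What does the document say about X?")
```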
Could try adding "Never say thank you, that you are happy to help, that you are an AI agent, etc. Just answer directly." to the system prompt.
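For example, assuming `ServiceContext.from_defaults` accepts a `system_prompt` argument (as in recent llama-index versions):

```python
from llama_index import ServiceContext

# Discourage pleasantries and meta-commentary in every completion
service_context = ServiceContext.from_defaults(
    system_prompt=(
        "Never say thank you, that you are happy to help, "
        "that you are an AI agent, etc. Just answer directly."
    ),
)
```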
This is a good idea!
From playing around with different models, it feels like better (typically larger) models tend not to do this. The default prompt from llama-index does say to return the original answer if the new context is not useful, and it seems like larger models follow that instruction without adding a thanks.
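For reference, that default refine prompt can be printed directly; a sketch, assuming the legacy `llama_index.prompts.default_prompts` module layout:

```python
# Print the default refine prompt template llama-index uses; it tells the
# model to return the original answer when the new context is not useful
from llama_index.prompts.default_prompts import DEFAULT_REFINE_PROMPT_TMPL

print(DEFAULT_REFINE_PROMPT_TMPL)
```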
By default, llama-index will try to get the LLM to refine its answer by providing more context along with its previous answer. When it does this, it sometimes thanks you for the extra context, e.g. starting its response with "Thank you for providing additional context!" or "Thank you for providing more context!".
A related issue is that sometimes the additional context provided during refinement is not useful, and the LLM will mention that it was not useful and that the original answer stands.
This could be confusing for users (as they don't know a refinement is happening), since llama-index does this on the fly. Either we figure out a way to strip these kinds of acknowledgements from the response automatically, or we make sure we never trigger a refine in the first place. The latter is probably easier to do (maybe increasing the `chunk_size_limit` argument in the `ServiceContext` will ensure this), but it could hurt answer quality if refining turns out to be genuinely beneficial compared to just using a larger chunk size limit to begin with. A sketch of both options follows.
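A rough sketch of both options (the regex pattern and the `chunk_size_limit` value here are illustrative guesses, not tested settings):

```python
import re

from llama_index import ServiceContext

# Option 1: strip acknowledgement prefixes from the response text after the fact
ACK_PREFIX = re.compile(
    r"^\s*thank you for providing (additional|more) context!?\s*",
    re.IGNORECASE,
)

def strip_acknowledgement(text: str) -> str:
    """Remove a leading 'Thank you for providing ... context!' if present."""
    return ACK_PREFIX.sub("", text, count=1)

# Option 2: raise chunk_size_limit so the retrieved context fits into a
# single LLM call and no refine step is triggered
service_context = ServiceContext.from_defaults(chunk_size_limit=4096)
```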