"developer-mode" to access quickly the results of document retrieval to have better picture of vector distance and possible vector clustering problem. E.g. how to make sure questions about a specific ROOT class provide the url of the class reference and not some other script that uses the class a lot.
Do not provide context if score is too low.
Investigate token length problem and chunk length.
Ideas to implement a smart memory management for prolonged conversations. Various options from langchain. Architecture with short and long term memory.