[Closed] vkehfdl1 closed this issue 9 months ago
I am going to use the DSTC11 Track 5 dataset to evaluate and train a classifier for this task. I have already uploaded the preprocessed dataset to the NomaDamas Hugging Face datasets.
This task is commonly called Knowledge-Seeking Turn Detection. The baseline uses a simple BERT-based classifier, so domain-specific tasks may need different models.
There are two tasks in this issue. The first is Knowledge-Seeking Turn Detection, which reads only the dialogue history between the AI and the user. The second is also Knowledge-Seeking Turn Detection, but it can additionally access the currently retrieved knowledge, so the decision can be made from not only the dialogue history but also the knowledge contents. This matters because a practical RAG system needs to detect search turns, and if the needed knowledge is already available, you don't need to search again even when the user's last question is clearly a knowledge-seeking turn.
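To make the two task variants concrete, here is a minimal sketch of how the classifier input could differ between them. `Turn` and `build_classifier_input` are hypothetical names, and the `[SEP]`-joined pair input is just the usual BERT-style convention, not the actual DSTC11 baseline code.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Turn:
    speaker: str  # "user" or "assistant"
    text: str

def build_classifier_input(history: List[Turn], knowledge: Optional[str] = None) -> str:
    """Build the text fed to a turn-detection classifier.

    Task 1 sees only the dialogue history; Task 2 also sees the retrieved
    knowledge, so it can learn to skip searching when that knowledge already
    covers the user's question.
    """
    dialogue = " ".join(f"{t.speaker}: {t.text}" for t in history)
    if knowledge is None:
        return dialogue                      # Task 1: dialogue history only
    return f"{dialogue} [SEP] {knowledge}"   # Task 2: history + knowledge

history = [Turn("user", "What time does the pool open?")]
print(build_classifier_input(history))
print(build_classifier_input(history, knowledge="The pool opens at 7 AM daily."))
```

Either string could then be fed to a BERT-style binary classifier that outputs "search" or "don't search".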
In conclusion, DSTC11-Track5 is not the dataset I wanted for the latter task. I don't know yet how to find a dataset for that task, but I will keep looking. Dialogue + knowledge ...
I think it would be more efficient to just generate the LLM's answer from the given passages first and evaluate it, and only re-search for knowledge and regenerate the answer if the answer does not meet a quality threshold. That way, you call the LLM only once when the answer is already good. In contrast, if you try to detect knowledge-seeking turns up front, you always have to run that module, even when it would be fine to generate the answer directly. Also, detection always produces some incorrect results and needs data for fine-tuning the detector. So maybe we can just use the 'Check Facts and Try Again' framework?
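The loop above could be sketched like this. `generate`, `evaluate`, and `research` are hypothetical stand-ins for the real LLM call, answer-quality check (e.g. a fact-checking model), and retriever; the threshold and retry count are assumed tuning knobs.

```python
from typing import Callable, List, Tuple

def answer_with_retry(
    question: str,
    passages: List[str],
    generate: Callable[[str, List[str]], str],
    evaluate: Callable[[str, str], float],
    research: Callable[[str], List[str]],
    threshold: float = 0.7,
    max_retries: int = 2,
) -> Tuple[str, int]:
    """Generate first, re-search only when quality is too low.

    Returns the final answer and the number of LLM calls made.
    """
    calls = 0
    answer = ""
    for _ in range(max_retries + 1):
        answer = generate(question, passages)
        calls += 1
        if evaluate(question, answer) >= threshold:
            return answer, calls       # good enough: one LLM call in the best case
        passages = research(question)  # quality too low: fetch new passages
    return answer, calls               # give up after max_retries re-searches

# Toy stubs: the first generation fails the check, the second passes.
attempts = []
gen = lambda q, p: attempts.append(p) or f"answer from {p[0]}"
ev = lambda q, a: 0.9 if "fresh" in a else 0.1
re_search = lambda q: ["fresh passage"]

answer, calls = answer_with_retry("q?", ["stale passage"], gen, ev, re_search)
print(answer, calls)  # → "answer from fresh passage" 2
```

The upside is that the detector module disappears entirely; the downside is that a bad first answer costs an extra LLM call.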
But I don't know which system achieves higher accuracy on real data... Maybe I should implement all of them and benchmark them?
In #342, an RL model can be used to decide the next move. I think there might be alternative methods for deciding whether to search for other passages or use the already retrieved ones.
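One such non-RL alternative, as a minimal sketch: decide from the retrieval scores themselves. If even the best retrieved passage scores below a threshold, the retrieved set probably does not answer the question, so search again; otherwise reuse it. `should_research` and the threshold value are assumptions for illustration, not anything from #342.

```python
from typing import List, Tuple

def should_research(scored_passages: List[Tuple[str, float]],
                    min_score: float = 0.5) -> bool:
    """Return True if a new search is warranted, based on retrieval scores."""
    if not scored_passages:
        return True  # nothing retrieved yet: must search
    best = max(score for _, score in scored_passages)
    return best < min_score

print(should_research([("pool hours info", 0.82)]))  # → False: reuse passages
print(should_research([("unrelated text", 0.21)]))   # → True: search again
```

This is much cheaper than an RL policy, but it only works when the retriever's scores are calibrated enough that a fixed threshold is meaningful.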