In train.py test() process, almost every output for pred_answer is a single character like 'e'

sunnweiwei / AmbigPrompt

Answering Ambiguous Questions via Iterative Prompting


In train.py test() process, almost every output for pred_answer is a single character like 'e' #3

Open Wu-tn opened 5 months ago

Wu-tn commented 5 months ago

Hi, I ran into a problem in the test() process of train.py: almost every output for pred_answer is a single character like 'e'.

Wu-tn commented 5 months ago

Hi, I found another problem: in inference_dense.py, faiss returns almost the same 100 passages for every question in train.json. I followed your steps in train_dense.py, downloaded Luyu/co-condenser-wiki from Hugging Face, and trained it on the wikipedia-nq data from https://github.com/luyug/Dense. Which step did I get wrong?

sunnweiwei commented 4 months ago

Hi! It's strange that the retrieval results are the same. Maybe you could try using this model (https://huggingface.co/Luyu/co-condenser-marco-retriever) to run dense retrieval inference and see if the results are normal?
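
To confirm how similar the retrieved lists really are, one option is to measure the average overlap between the passage IDs returned for different questions. The snippet below is only a minimal sketch: the `mean_pairwise_jaccard` helper and the toy IDs are hypothetical and not part of this repository, and it assumes the retrieval output has already been loaded into a dict that maps each question to its list of passage IDs. A value near 1 would suggest the query embeddings have collapsed; a value near 0 suggests the lists differ per question.

```python
from itertools import combinations


def mean_pairwise_jaccard(retrieved):
    """Average Jaccard overlap between the passage-ID sets of every query pair.

    `retrieved` maps a query ID to the list of passage IDs returned for it.
    (Hypothetical helper for debugging; not part of the AmbigPrompt code.)
    """
    sets = [set(ids) for ids in retrieved.values()]
    if len(sets) < 2:
        raise ValueError("need at least two queries to compare")
    scores = []
    for a, b in combinations(sets, 2):
        union = a | b
        # Two empty result lists share nothing, so count them as 0 overlap.
        scores.append(len(a & b) / len(union) if union else 0.0)
    return sum(scores) / len(scores)


# Toy data with made-up passage IDs, only to illustrate the two cases.
collapsed = {"q1": [1, 2, 3, 4], "q2": [1, 2, 3, 4], "q3": [1, 2, 3, 5]}
healthy = {"q1": [1, 2, 3, 4], "q2": [5, 6, 7, 8], "q3": [9, 10, 11, 1]}

collapsed_score = mean_pairwise_jaccard(collapsed)
healthy_score = mean_pairwise_jaccard(healthy)
print(f"collapsed: {collapsed_score:.3f}, healthy: {healthy_score:.3f}")
```

The comparison is quadratic in the number of questions, so for a full train.json it is probably enough to run it on a random sample of a few hundred questions.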

Wu-tn commented 4 months ago

Hi, is it necessary to train the pre-trained co-condenser model on the wikipedia-nq dataset, or can I use it directly to encode the corpus and the queries?

sunnweiwei commented 4 months ago

This model (https://huggingface.co/Luyu/co-condenser-marco-retriever) has already been trained on MS MARCO, so it can be used directly to encode the corpus and the queries.

Wu-tn commented 4 months ago

Thanks, I will try it!!!

Wu-tn commented 4 months ago

By the way, would it be possible to provide 9.pt for download?