IBM / multidoc2dial

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
Apache License 2.0

Error in running converter #1

Closed sutakori closed 2 years ago

sutakori commented 2 years ago

Hi, thank you for your impressive dataset. I encountered a problem when running the DPR converter.

I downloaded the checkpoint "checkpoint.retriever.single.nq.bert-base-encoder" from the official DPR repo and got a missing key error when running run_converter.sh. It seems that the only difference between "convert_dpr_original_checkpoint_to_pytorch.py" in this repo and the version in Hugging Face DPR is this line:

key = key.replace("bert_model.encoder", "bert_model")

and that line causes the conversion to fail.

So is this line actually unnecessary, or am I using the script incorrectly?

sivasankalpp commented 2 years ago

Hi @sutakori, thank you for pointing this out! We added this line because we used a modified version of the DPR code to fine-tune the DPR encoder. You are right that the line should be removed when converting the checkpoint.retriever.single.nq.bert-base-encoder checkpoint. We have added the fix.
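
For readers who hit the same error, here is a minimal sketch of why that one `replace` call can produce a missing key error. It uses plain Python lists instead of real checkpoints. The function `convert_keys`, the `strip_extra_encoder` flag, and all key names are illustrative assumptions, not the repo's actual code. In particular, the "wrapped" key layout for the modified DPR fine-tuning code is a guess at what would make the extra line necessary.

```python
def convert_keys(saved_keys, prefix="question_model.", strip_extra_encoder=False):
    """Rename checkpoint keys for a target model (hypothetical helper)."""
    converted = []
    for key in saved_keys:
        if not key.startswith(prefix):
            continue  # skip weights that belong to another sub-model
        key = key[len(prefix):]
        if not key.startswith("encode_proj."):
            key = "bert_model." + key
        if strip_extra_encoder:
            # The line discussed in this issue.
            key = key.replace("bert_model.encoder", "bert_model")
        converted.append(key)
    return converted


# Keys the target model is assumed to expect (illustrative subset).
EXPECTED = {
    "bert_model.embeddings.word_embeddings.weight",
    "bert_model.encoder.layer.0.attention.self.query.weight",
}


def missing_keys(converted):
    """Return the expected keys that the converted checkpoint does not provide."""
    return sorted(EXPECTED - set(converted))


# Assumed layout of an official-style checkpoint: no extra nesting level.
official_keys = [
    "question_model.embeddings.word_embeddings.weight",
    "question_model.encoder.layer.0.attention.self.query.weight",
]

# Assumed layout of a checkpoint saved by modified code that wraps the
# BERT model in one more "encoder." level.
wrapped_keys = [
    "question_model.encoder.embeddings.word_embeddings.weight",
    "question_model.encoder.encoder.layer.0.attention.self.query.weight",
]

print(missing_keys(convert_keys(official_keys)))
print(missing_keys(convert_keys(official_keys, strip_extra_encoder=True)))
print(missing_keys(convert_keys(wrapped_keys, strip_extra_encoder=True)))
```

Under these assumptions, the official-style keys convert cleanly only without the extra `replace`, while the wrapped keys convert cleanly only with it. That matches the explanation above: the line suits checkpoints produced by the modified fine-tuning code, not the original DPR release.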