Open luciusssss opened 1 year ago
Thanks for your interest in our work! Unfortunately, I forgot to save my checkpoints after my Tencent internship ended. I retrained the model yesterday, and here are my training logs:
Beam Retrieval with beam size 1 on MuSiQue (almost the same as the performance reported in the paper).
Beam Retrieval with beam size 1 on HotpotQA (not fully trained yet; training will take four more days).
I think it is quite convenient to train Beam Retrieval from scratch. Let me know if there are any other questions.
Thanks for your reply! I will try to train it from scratch.
Hi @canghongjian ! Were you able to retrain your model? Would it be possible for you to upload the checkpoints?
Sure. I will upload the checkpoints later.
Hi all, I have uploaded the large version of Beam Retrieval on HotpotQA (the version we submitted to the leaderboard): hotpot_beam1_retr.pt hotpot_beam2_retr.pt
Hopefully they are of some help.
Hi @canghongjian, thanks for open-sourcing the model weights and code. However, when I try to run inference for HotpotQA, I get the following error:
OSError: model/deberta-v3-large is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`
So I replaced 'model/deberta-v3-large' with 'microsoft/deberta-v3-large' in test_model_tmp.py, but then I get the following error:
RuntimeError: Error(s) in loading state_dict for Retriever:
Unexpected key(s) in state_dict: "encoder.embeddings.position_ids".
Please let me know if anything is incorrect. Thanks!
Hi @sauravjoshi23, sorry for the late reply. 'model/deberta-v3-large' is a placeholder; replace it with the actual path to your local model files or with a remote model identifier. The second error seems to be caused by the version of the transformers library. We use transformers 4.28.0, which you can install with: pip install transformers==4.28.0
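If pinning the transformers version is not an option, one possible workaround (not something the authors describe here) is to drop the keys the current model does not expect before loading the checkpoint. Below is a minimal sketch, assuming the checkpoint is a plain state_dict mapping of parameter names to values. `drop_unexpected_keys` is a hypothetical helper that is not part of the Beam Retrieval repo, and the toy dictionaries only stand in for real tensors.

```python
from collections import OrderedDict


def drop_unexpected_keys(state_dict, expected_keys):
    """Split a checkpoint into entries the model expects and keys it does not.

    Hypothetical helper for illustration; works on any mapping, so it can be
    tried without PyTorch installed.
    """
    expected = set(expected_keys)
    # Preserve the checkpoint's original key order for the entries we keep.
    kept = OrderedDict((k, v) for k, v in state_dict.items() if k in expected)
    dropped = [k for k in state_dict if k not in expected]
    return kept, dropped


# Toy stand-ins: a real checkpoint would map names to tensors.
checkpoint = OrderedDict([
    ("encoder.embeddings.position_ids", [0, 1, 2]),
    ("encoder.embeddings.word_embeddings.weight", [[0.1, 0.2]]),
])
# Assumed key list; with a real model this would come from the model itself.
model_keys = ["encoder.embeddings.word_embeddings.weight"]

filtered, dropped = drop_unexpected_keys(checkpoint, model_keys)
print("dropped:", dropped)
```

With PyTorch, `expected_keys` would typically be `model.state_dict().keys()` and the filtered mapping would be passed to `model.load_state_dict(filtered)`. Calling `load_state_dict(..., strict=False)` is another option, but it also silences genuinely missing keys, so printing the dropped keys first is the more cautious route.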
Hello! Thank you for sharing the checkpoint. Could you provide the code or algorithm you used to process the dataset? From what I saw in your code and paper, the dataset is processed at the passage level, but the evaluation for the HotpotQA leaderboard is at the sentence level. Also, what threshold did you set?
@Barianc This part of the code is in qa/datasets.py, and you should train a reader to handle this information. We use a predefined number of hops for each dataset and did not use a threshold when we submitted the results to the leaderboard.
Hi! Thanks for open-sourcing the code for Beam Retrieval. Do you have any plans to share the model checkpoints? Thanks a lot!