canghongjian / beam_retriever

[NAACL 2024] End-to-End Beam Retrieval for Multi-Hop Question Answering
https://arxiv.org/abs/2308.08973
Apache License 2.0
81 stars 8 forks

Request for checkpoints #1

Open luciusssss opened 1 year ago

luciusssss commented 1 year ago

Hi! Thanks for open-sourcing the code for Beam Retrieval. Do you have any plans to share the model checkpoints? Thanks a lot!

canghongjian commented 1 year ago

Thanks for your interest in our work! Unfortunately, I forgot to save my checkpoints after my Tencent internship ended. I retrained the model yesterday, and here are my training logs:

  1. Beam Retrieval with beam size 1 on MuSiQue (almost the same as the performance reported in the paper).

    [image: training log]
  2. Beam Retrieval with beam size 1 on HotpotQA (not fully trained yet; it will take four more days).

    [image: training log]

I think it is quite convenient to train Beam Retrieval from scratch. Let me know if there are any other questions.

luciusssss commented 1 year ago

Thanks for your reply! I will try to train it from scratch.

schlabrendorff commented 1 year ago

Hi @canghongjian ! Were you able to retrain your model? Would it be possible for you to upload the checkpoints?

canghongjian commented 1 year ago

Sure. I will upload the checkpoints later.

canghongjian commented 7 months ago

Hi all, I have uploaded the large version of Beam Retrieval on HotpotQA (the version we submitted to the leaderboard): hotpot_beam1_retr.pt and hotpot_beam2_retr.pt

Hopefully they are of some help.

sauravjoshi23 commented 7 months ago

Hi @canghongjian, thanks for open-sourcing the model weights and code. However, when I try to run inference for HotpotQA, I get the following error:

OSError: model/deberta-v3-large is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

Hence, I replaced 'model/deberta-v3-large' with 'microsoft/deberta-v3-large' in test_model_tmp.py. But then I get the error below:

RuntimeError: Error(s) in loading state_dict for Retriever:
    Unexpected key(s) in state_dict: "encoder.embeddings.position_ids". 

Please let me know if anything is incorrect. Thanks!

canghongjian commented 7 months ago

Hi @sauravjoshi23, sorry for the late reply. 'model/deberta-v3-large' is a placeholder, which you can replace with the actual path to your local model files or with a remote model identifier. As for the second error, it seems to be caused by the version of the transformers library. We use transformers 4.28.0 (pip install transformers==4.28.0).
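
For readers who cannot pin transformers to 4.28.0, one possible workaround (not the maintainer's suggestion, and untested against the released checkpoints) is to drop the key named in the error before loading the weights. The sketch below uses a plain dictionary as a stand-in for the checkpoint. The helper name drop_stale_keys is hypothetical, and it assumes the .pt file holds a flat state dict.

```python
from collections import OrderedDict

# Key reported as unexpected in the RuntimeError quoted in this thread.
STALE_KEYS = ("encoder.embeddings.position_ids",)


def drop_stale_keys(state_dict, stale_keys=STALE_KEYS):
    """Return a copy of state_dict without the listed keys (order preserved)."""
    return OrderedDict(
        (key, value) for key, value in state_dict.items() if key not in stale_keys
    )


# Toy stand-in for a loaded checkpoint; real values would be tensors.
checkpoint = OrderedDict([
    ("encoder.embeddings.position_ids", [0, 1, 2]),
    ("encoder.embeddings.word_embeddings.weight", [[0.1, 0.2]]),
])
cleaned = drop_stale_keys(checkpoint)
print(list(cleaned))

# With PyTorch this might be used roughly as follows
# (assumption: the file stores a flat state dict for the Retriever):
#   state = torch.load("hotpot_beam1_retr.pt", map_location="cpu")
#   model.load_state_dict(drop_stale_keys(state))
```

Installing the transformers version the maintainer names remains the safer route, since filtering keys only hides the mismatch that the error reports.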

Barianc commented 2 months ago

Hello! Thank you for sharing the checkpoint. Could you provide the code or algorithm you used to process the dataset? From what I saw in your code and paper, the dataset is processed at the passage level, but the evaluation for the HotpotQA leaderboard is at the sentence level. Also, what threshold did you set?

canghongjian commented 2 months ago

@Barianc This part of the code is in qa/datasets.py, and you should train a reader to handle this information. We use a predefined number of hops for each dataset and did not use a threshold when submitting the results to the leaderboard.
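
To illustrate what retrieving with a predefined number of hops and no threshold can look like, here is a minimal sketch of beam search over passage chains. It is not the code from qa/datasets.py or the repository's retriever. The function beam_retrieve, the scorer toy_score, and the relevance values are hypothetical, and the real model scores chains with an encoder rather than a sum.

```python
def beam_retrieve(passages, score_chain, num_hops, beam_size=1):
    """Select `num_hops` passages by beam search; no score threshold is applied.

    score_chain(chain) -> float is a hypothetical scorer over a tuple of
    passage indices.
    """
    num_hops = min(num_hops, len(passages))  # cannot pick more than exist
    beams = [()]  # start from a single empty chain
    for _ in range(num_hops):
        candidates = []
        for chain in beams:
            for idx in range(len(passages)):
                if idx not in chain:  # each passage is used at most once
                    candidates.append(chain + (idx,))
        # Keep only the best `beam_size` partial chains for the next hop.
        candidates.sort(key=score_chain, reverse=True)
        beams = candidates[:beam_size]
    return beams[0]


# Toy data: one made-up relevance value per passage.
passages = ["p0", "p1", "p2", "p3"]
relevance = [0.1, 0.9, 0.3, 0.7]


def toy_score(chain):
    """Hypothetical scorer: sum of per-passage relevance values."""
    return sum(relevance[i] for i in chain)


best = beam_retrieve(passages, toy_score, num_hops=2, beam_size=1)
print(best)  # (1, 3)
```

Because the number of hops is fixed in advance, the search always returns exactly that many passages, so no cutoff on the scores is needed.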