Not able to reproduce the results for the GTR-Base model

google-research / t5x_retrieval

Apache License 2.0


Not able to reproduce the results for the GTR-Base model #2

Open lhbonifacio opened 1 year ago

lhbonifacio commented 1 year ago

Hi,

I recently fine-tuned a GTR-Base model following the recommended steps, using the BEIR MS Marco dataset. However, after fine-tuning as instructed, the results on some datasets are very far from those reported in the paper.

I have evaluated 6 datasets from the BEIR benchmark, and on all of them the results are below those reported in the paper.

Has anyone else noticed or reported the same issue?

Thanks

nijianmo commented 1 year ago

Hi, as mentioned in the paper, we are using the training data from https://github.com/PaddlePaddle/RocketQA/tree/main/research/RocketQA_NAACL2021#download-data, which is different from the BEIR MS Marco dataset. You may want to try this training data instead.

jeffzwang commented 1 year ago

Follow-up question: are the released weights the ones that achieve the reported BEIR results?