capreolus-ir / capreolus

A toolkit for end-to-end neural ad hoc retrieval
https://capreolus.ai
Apache License 2.0

Robust04 train and test data scores #168

Closed (Pourbahman closed this issue 3 years ago)

Pourbahman commented 3 years ago

Dear Andrew,

Thanks for your awesome work. I need scores like those in https://zenodo.org/record/3974431/files/robust04.PARADE.runs.tar.gz from @canjiali's repository. I have a Quadro 6000 GPU, so I would have to run your implementation on that GPU.

Could you please tell me whether I can obtain those scores for both the training and test data using your implementation?

If so, could you please explain how to do it with your implementation?

Thanks in advance, Kind Regards, Zahra

andrewyates commented 3 years ago

Hi Zahra,

You can find instructions on obtaining runs on the test data here: https://github.com/capreolus-ir/capreolus/blob/master/docs/reproduction/PARADE.md. This does not support running evaluation on the training data out of the box; you would need to make your own modifications.

However, I think it is unlikely that you will be able to run the code on your GPU. A TPU or GPU(s) with 48GB of RAM is really needed to run this with reasonable hyperparameters.

Andrew

Pourbahman commented 3 years ago

Thanks Andrew.

  1. If I had several GPUs that together provide 48GB of GPU RAM (I think 2 Quadro 6000s are enough, aren't they?), would it be possible to run your code on them?

  2. If so, how can I change your implementation to split the run across those GPUs?

andrewyates commented 3 years ago

Yes, if you have multiple GPUs you can use the TensorFlow implementation of PARADE described in the reproduction doc above. That will automatically use all available GPUs. I think you're right that 2x Quadro 6000 will be enough, though.
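
To confirm which GPUs TensorFlow will actually pick up before starting a run, a quick check like the following can help. This is only a sketch, assuming TensorFlow 2.x; the helper name `visible_gpu_names` is hypothetical and not part of Capreolus, and the check says nothing about whether the GPUs have enough memory.

```python
# Minimal sketch: list the GPUs that TensorFlow can see on this machine.
# Assumption: TensorFlow 2.x. If it is not installed, an empty list is returned.
try:
    import tensorflow as tf
except ImportError:
    tf = None


def visible_gpu_names():
    """Return the names of the GPUs visible to TensorFlow (empty list if none)."""
    if tf is None:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    names = visible_gpu_names()
    print(f"TensorFlow sees {len(names)} GPU(s): {names}")
```

If the list is shorter than expected, the driver setup or the `CUDA_VISIBLE_DEVICES` environment variable is worth checking first.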

Pourbahman commented 2 years ago

Dear Andrew,

I checked https://github.com/capreolus-ir/capreolus/blob/master/docs/reproduction/PARADE.md, but I think I asked my question in the wrong way. Let me try to ask it more clearly.

I need the ranked lists for the test data and the training data produced by running your code, like the ranked lists that can be generated with the BM25 approach. Could you please guide me on how to generate them?

Thanks in advance, Kind Regards
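
For reference, ranked lists like those produced by BM25 are commonly stored in the TREC run format, one line per query-document pair: `qid Q0 docid rank score tag`. Below is a minimal sketch of writing such lines, assuming the scores are already available as a nested dictionary. The function name `to_trec_run`, the run tag, the document IDs, and the score values are all hypothetical and not taken from Capreolus.

```python
def to_trec_run(scores, run_tag="my_run"):
    """Turn {qid: {docid: score}} into TREC run lines, best document first."""
    lines = []
    for qid in sorted(scores):
        # Sort the documents of this query by descending score.
        ranked = sorted(scores[qid].items(), key=lambda item: item[1], reverse=True)
        for rank, (docid, score) in enumerate(ranked, start=1):
            lines.append(f"{qid} Q0 {docid} {rank} {score:.4f} {run_tag}")
    return lines


# Made-up scores for a single topic, for illustration only.
example_scores = {"301": {"DOC-A": 12.5, "DOC-B": 14.1, "DOC-C": 9.0}}

if __name__ == "__main__":
    for line in to_trec_run(example_scores, run_tag="example"):
        print(line)
```

The same function would work for training and test queries alike, as long as scores for both sets are available.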