Closed · joshdevins closed this 3 years ago

I'm running `evaluate_anserini_docT5query.py` and it's currently only able to utilize a single GPU. I'm using GCP instances with 4x V100 GPUs, and I'd like to finish more experiments in less time by using more GPUs. Is there a simple way to parallelize batches or configure the use of multiple GPUs that I'm not aware of?
Hi @joshdevins,
Nice question! Unfortunately, I don't know how to implement this. I searched online but couldn't find a direct solution for generating with multiple GPUs. I will keep searching, and if I find something relevant I will get back to you.
If you have suggestions on implementation, I would be happy to have a look at them!
Unfortunately, for now I can only suggest hosting the individual question-generation models separately on the GPUs.
Kind Regards, Nandan
Thanks @NThakur20. The obvious place for me to start would be data parallelization. The corpus is chunked right now (into chunks of 80 examples, I think?), and we could easily parallelize that loop with a multiprocessing `Pool` of size `n` (one per GPU). When we call `generate`, we can also pass a GPU/core number and CUDA can use the specified GPU. I can take a stab at that for the one doc2query script and any downstream changes, and maybe it's applicable to processing other datasets at scale as well.
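To make that concrete, here is a minimal sketch of the idea; `generate_for_chunk` and `chunk_corpus` are hypothetical placeholders standing in for the real loading/generation code, not BEIR functions:

```python
import multiprocessing as mp

import torch

NUM_GPUS = max(torch.cuda.device_count(), 1)  # e.g. 4 on a 4x V100 instance

def generate_for_chunk(args):
    """Hypothetical worker: generate queries for one corpus chunk on one GPU."""
    gpu_id, chunk = args
    device = torch.device(f"cuda:{gpu_id}")
    # Load (or cache) a seq2seq model on `device` and run generation here.
    return [f"query for {doc_id}" for doc_id in chunk]  # placeholder output

def chunk_corpus(doc_ids, chunk_size=80):
    """Hypothetical helper: split the corpus into fixed-size chunks."""
    return [doc_ids[i:i + chunk_size] for i in range(0, len(doc_ids), chunk_size)]

if __name__ == "__main__":
    corpus_ids = [f"doc{i}" for i in range(1000)]  # stand-in corpus
    # Assign chunks to GPUs round-robin, one worker process per GPU.
    jobs = [(i % NUM_GPUS, chunk) for i, chunk in enumerate(chunk_corpus(corpus_ids))]
    # CUDA requires the "spawn" start method in subprocesses.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=NUM_GPUS) as pool:
        results = pool.map(generate_for_chunk, jobs)
```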
So here we can pass a GPU number and make `n` models, one per GPU:
https://github.com/UKPLab/beir/blob/8d8061934e2c33374c075412da2093a5f403521a/beir/generation/models/auto_model.py#L6
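Something along these lines, say. This is only a sketch: it assumes an optional `device` argument is added to the constructor (the linked class does not take one), and the checkpoint name is the public BEIR query-generation model:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

class QGenModel:
    # Sketch: an optional `device` argument lets each worker pin its copy
    # of the model to a specific GPU ("cuda:0", "cuda:1", ...).
    def __init__(self, model_path: str, device: str = None, use_fast: bool = True):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=use_fast)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        self.model = self.model.to(self.device)

# One model instance per GPU:
models = [QGenModel("BeIR/query-gen-msmarco-t5-base-v1", device=f"cuda:{i}")
          for i in range(torch.cuda.device_count())]
```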
One last thought: I'm not familiar enough with the Transformers + PyTorch internals, but there should be a native way to do data-parallel inference since we already provide batches. For example: https://github.com/huggingface/transformers/issues/3936
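For reference, the native route would look roughly like the sketch below. The catch, and what the linked issue is about, is that `torch.nn.DataParallel` only parallelizes `forward()`, not `.generate()`, so batched generation does not trivially scale this way:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "BeIR/query-gen-msmarco-t5-base-v1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).to("cuda")

# DataParallel replicates the model and scatters each forward() batch
# across all visible GPUs, gathering the outputs on GPU 0.
model = torch.nn.DataParallel(model)

inputs = tokenizer(["an example passage ..."] * 32, padding=True,
                   truncation=True, return_tensors="pt").to("cuda")

# .generate() is not wrapped by DataParallel, so it must be called on the
# underlying module -- which runs on a single GPU again. This is the snag
# discussed in huggingface/transformers#3936.
outputs = model.module.generate(**inputs, max_length=64)
queries = tokenizer.batch_decode(outputs, skip_special_tokens=True)
```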
Thanks for the pointers @joshdevins, I will have a look at them!
Kind Regards, Nandan
Hi @joshdevins,
Finally, after having no success with `DataParallel`, I'm happy to share that I have now added support using multiprocessing to spawn a pool of worker processes that can take advantage of multiple GPUs. Hope it helps!
Here is sample code showing how to use it: https://github.com/UKPLab/beir/blob/development/examples/generation/query_gen_multi_gpu.py
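Condensed, the pattern in that example looks roughly like the sketch below. This is not the exact script: the dataset path and generation parameters are illustrative, and the pool method names follow the SentenceTransformers-style multiprocess API, so please verify them against the linked file:

```python
from beir.datasets.data_loader import GenericDataLoader
from beir.generation import QueryGenerator
from beir.generation.models import QGenModel

corpus = GenericDataLoader("datasets/nfcorpus").load_corpus()  # illustrative path

generator = QueryGenerator(model=QGenModel("BeIR/query-gen-msmarco-t5-base-v1"))

# Start one worker process per visible GPU, generate, then shut the pool down.
pool = generator.model.start_multi_process_pool()
generator.generate_multi_process(
    corpus, pool,
    output_dir="datasets/nfcorpus-gen",  # illustrative
    ques_per_passage=3,
    batch_size=64,
)
generator.model.stop_multi_process_pool(pool)
```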
I hope you are now able to utilize multiple GPUs for faster generation. The updates are still only in the development branch; if it's urgent, I would suggest cloning the development branch and working from that. I will soon push the changes to master and update the PyPI version.
Kind Regards, Nandan