facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
https://vissl.ai
MIT License

torch_xla compatibility option? #555

Open adamcatto opened 2 years ago

adamcatto commented 2 years ago

🚀 Feature

An option to train models on a TPU or TPU pod using the torch_xla package.

Motivation & Examples

Motivation: speed up training, utilize best available resources.

Example: in vissl/vissl/trainer/trainer_main.py, start by changing SelfSupervisionTrainer.setup_distributed(self, use_gpu) to something like SelfSupervisionTrainer.setup_distributed(self, device), then gate the TPU-specific setup behind device == 'TPU', or something along these lines. Relevant changes to other functions can be made afterwards.
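To make the proposal concrete, here is a minimal sketch of what the dispatch might look like. All names here (setup_distributed as a free function, the backend strings it returns) are illustrative assumptions, not VISSL's actual trainer API:

```python
# Hypothetical sketch: dispatching on a device string instead of a
# use_gpu boolean. In real VISSL this would live on
# SelfSupervisionTrainer; the names below are illustrative only.

def setup_distributed(device: str) -> str:
    """Pick a distributed backend based on the requested device type."""
    device = device.lower()
    if device == "tpu":
        # TPU path: torch_xla ships its own launch/sync helpers
        # (e.g. torch_xla.distributed.xla_multiprocessing), so we would
        # hand off to XLA-specific setup instead of torch.distributed.
        backend = "xla"
    elif device == "gpu":
        # Existing behavior: NCCL backend for multi-GPU training.
        backend = "nccl"
    elif device == "cpu":
        # CPU fallback: gloo is the usual torch.distributed choice.
        backend = "gloo"
    else:
        raise ValueError(f"Unknown device type: {device!r}")
    return backend
```

The point is just that a string-typed `device` leaves room for a third branch, whereas the current boolean `use_gpu` does not.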

(Note: I will likely start working on this; I am new to VISSL, so I figure a regular contributor might be better-equipped to handle this, but I can give it a go nonetheless.)

QuentinDuval commented 2 years ago

Hey @adamcatto,

Thanks a lot for raising the point :)

So to be fair, we did have a look at PyTorch/XLA last year to see if we could get something out of it, but did not pursue it, for a couple of reasons: PyTorch/XLA was still relatively new, and at the time we were training ConvNets, for which GPUs are actually pretty good. But now that Vision Transformers are in the codebase, it might indeed be worth looking into.

I am however, for the moment, not qualified enough with PyTorch/XLA and the TPU ecosystem to proceed with such changes. (As I understand it, running on TPU involves more than just changing the device: the data loader, the way the model is saved, how data is fetched, and even how jobs are run on GCP would all have to be integrated.) It is however part of my personal goals to play with those technologies, so that might change.
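For reference, the differences alluded to above map onto a handful of torch_xla idioms. The sketch below shows the canonical API shape (xla_device, ParallelLoader, optimizer_step, xm.save) as documented by the torch_xla project; the function signature and file name are assumptions, and it will only actually run on a machine with torch_xla and a TPU attached:

```python
# Sketch of a torch_xla training step, illustrating why TPU support is
# more than a device flag: the loop, gradient sync, and checkpointing
# all go through torch_xla helpers rather than torch.distributed.

def train_on_tpu(model, loader, optimizer, loss_fn, num_epochs=1):
    # Imported lazily so this module stays importable without torch_xla.
    import torch_xla.core.xla_model as xm
    import torch_xla.distributed.parallel_loader as pl

    device = xm.xla_device()                # the TPU core for this process
    model = model.to(device)
    para_loader = pl.ParallelLoader(loader, [device])

    for _ in range(num_epochs):
        for data, target in para_loader.per_device_loader(device):
            optimizer.zero_grad()
            loss = loss_fn(model(data), target)
            loss.backward()
            xm.optimizer_step(optimizer)    # syncs gradients across cores

    # xm.save moves tensors to CPU before writing, unlike plain torch.save.
    xm.save(model.state_dict(), "model.pt")
```

Each of these call sites is a place where VISSL's existing GPU assumptions would need a TPU branch.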

If you feel qualified on this, we can start to discuss what would need to be changed, what kind of test case you would like to move forward first, etc.

What do you think? Quentin