Shivanandroy / simpleT5

simpleT5 is built on top of PyTorch Lightning⚡️ and Transformers🤗 and lets you quickly train your T5 models.
MIT License
382 stars, 61 forks

Would this work to train flan-t5-xxl on multiple GPUs? #46

Open experimarketing opened 1 year ago

emendoza2 commented 1 year ago

Bump

Shivanandroy commented 1 year ago

@experimarketing: Right now, it only works with T5 and mT5 on a single GPU. I was away from development for a couple of months, so I haven't upgraded it to support FlanT5 and multi-GPU training.

But I will integrate it ASAP.

experimarketing commented 1 year ago

Would it work with the -xxl version? I believe model parallelism would be required to run it, as it is too large to fit on a single GPU.
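For a sense of scale, here is a rough back-of-the-envelope sketch of why the -xxl checkpoint is unlikely to fit on one GPU for full fine-tuning. It assumes roughly 11 billion parameters, fp32 weights and gradients, and an Adam-style optimizer that keeps two fp32 moments per parameter. Activations are ignored, and the 80 GB card is just an example size, so treat the numbers as a lower bound rather than a measurement.

```python
import math

# Approximate parameter count for flan-t5-xxl (assumption: about 11 billion).
PARAMS_FLAN_T5_XXL = 11_000_000_000


def weights_memory_gb(n_params, bytes_per_param=4):
    """Memory for the weights alone (fp32 by default), in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def training_memory_gb(n_params, bytes_weights=4, bytes_grads=4, bytes_optimizer=8):
    """Weights + gradients + Adam state (two fp32 moments), excluding activations."""
    return n_params * (bytes_weights + bytes_grads + bytes_optimizer) / 1e9


def min_gpus(required_gb, gpu_gb):
    """Smallest number of GPUs whose combined memory covers the requirement."""
    return math.ceil(required_gb / gpu_gb)


if __name__ == "__main__":
    weights = weights_memory_gb(PARAMS_FLAN_T5_XXL)
    training = training_memory_gb(PARAMS_FLAN_T5_XXL)
    print(f"fp32 weights only: {weights:.0f} GB")
    print(f"full fine-tuning (no activations): {training:.0f} GB")
    # 80 GB per card is a hypothetical example size, not a recommendation.
    print(f"80 GB GPUs needed at minimum: {min_gpus(training, 80)}")
```

Under these assumptions, the weights alone come to about 44 GB and full fine-tuning state to about 176 GB, which is why the model would have to be split across several devices.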

Shivanandroy commented 1 year ago

@experimarketing: I'm afraid it won't!

SomasekharDS commented 1 year ago

> @experimarketing : Right now, it only works with T5 and mT5 on single GPU. I was away from the development for a couple of months. So, I didn't upgrade it to support FlanT5 and multi GPU.
>
> But, I will integrate it ASAP.

Thanks for looking into this. Kindly let us know once it is done. One more thing: many thanks for developing this library. It has simplified working with the T5 model.