How to fit T5-11B (and T5-3B) into a v3-8 TPU

allenai / unifiedqa

UnifiedQA: Crossing Format Boundaries With a Single QA System

https://arxiv.org/abs/2005.00700

Apache License 2.0


How to fit T5-11B (and T5-3B) into a v3-8 TPU #30

Closed: Sanqiang closed this issue 3 years ago

Sanqiang commented 3 years ago

I tried to reproduce the model on an A100 (40 GB) GPU, but T5-3B/11B does not fit without model parallelism or DeepSpeed. How did you fit the large models onto the TPU? Are you using half precision (fp16) or something else?
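
For context, here is a rough back-of-the-envelope estimate of why these models strain a 40 GB card. It is only a sketch under stated assumptions: the parameter counts are the round figures implied by the model names (3e9 and 11e9), the 16 bytes per parameter for training assumes fp32 weights, gradients, and two Adam moment buffers, and activation memory is ignored entirely. None of this describes the authors' actual training setup.

```python
# Rough memory estimate for T5-3B / T5-11B on a 40 GB GPU.
# All figures are assumptions for illustration, not measured values.

# Round parameter counts implied by the model names (assumed, not exact).
PARAMS = {"T5-3B": 3e9, "T5-11B": 11e9}

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

A100_GB = 40  # memory of the GPU mentioned in the question


def weight_gb(n_params, dtype):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def adam_training_gb(n_params):
    """Weights + gradients + two Adam moments, all in fp32.

    4 + 4 + 8 = 16 bytes per parameter; activations are NOT included.
    """
    return n_params * 16 / 1e9


if __name__ == "__main__":
    for name, n in PARAMS.items():
        for dtype in BYTES_PER_PARAM:
            gb = weight_gb(n, dtype)
            verdict = "fits" if gb <= A100_GB else "does not fit"
            print(f"{name} weights in {dtype}: {gb:.0f} GB ({verdict} in {A100_GB} GB)")
        print(f"{name} fp32 Adam training state: {adam_training_gb(n):.0f} GB")
```

Under these assumptions, even the fp32 weights of T5-11B alone exceed 40 GB, and the T5-3B training state does too once gradients and optimizer buffers are counted, which matches what I am seeing.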