yaoxingcheng / TLM

ICML'2022: NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework
MIT License

What GPUs did you use for training TLM? #5

Closed hitchhicker closed 2 years ago

hitchhicker commented 2 years ago

Hello,

Great work! I am quite interested in it.

I would like to know what kind of GPUs you used for training TLM. From Table 1, I see that it was 8 GPUs for 42 hours. Are they 8 NVIDIA V100 GPUs with 32 GB, or something else?

Looking forward to your answer.

Thanks in advance.

yaoxingcheng commented 2 years ago

Hi~ Thanks for your interest.

Throughout all the experiments, we used NVIDIA A100 40GB GPUs, which differ from the V100 32GB GPUs used to train RoBERTa-Large. Note that the purpose of Table 1 is to intuitively demonstrate the difference between TLM and PLMs, so the comparison is qualitative rather than strictly fair. For a quantitative comparison, Table 2 reports computational cost in terms of FLOPs for your reference.
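
For intuition on why FLOPs make training runs on different GPUs comparable, here is a minimal sketch using the common ~6·N·D approximation for transformer training compute (Kaplan et al., 2020). This is an assumption for illustration, not the paper's exact FLOPs accounting, and the parameter and token counts below are hypothetical, not the paper's figures:

```python
# Minimal sketch: hardware-independent training-compute estimate.
# Assumption: the standard 6 * N * D approximation (Kaplan et al., 2020),
# NOT the exact accounting used in the paper's Table 2.

def training_flops(num_params: float, tokens_processed: float) -> float:
    """Approximate total training FLOPs.

    The forward pass costs roughly 2 * N FLOPs per token and the backward
    pass roughly twice that, giving about 6 * N FLOPs per token, or
    6 * N * D in total for D tokens.
    """
    return 6.0 * num_params * tokens_processed

# Hypothetical numbers, purely for illustration:
flops = training_flops(num_params=3.55e8, tokens_processed=1.0e11)
print(f"~{flops:.2e} training FLOPs")  # ~2.13e+20
```

Because such an estimate depends only on model size and tokens processed, it sidesteps differences in GPU generation, memory, and wall-clock throughput when comparing setups.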

hitchhicker commented 2 years ago

Thanks for your answer; it is very clear to me.