huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Have no GPU to train language modelling #693

Closed khaerulumam42 closed 5 years ago

khaerulumam42 commented 5 years ago

Sorry for opening this issue; it is not actually an issue with this repository.

I really appreciate that the authors created this repository; it helps us better understand how BERT works and how to apply it to several tasks.

My problem is with training: I have no GPU to train a language model. I have an Indonesian dataset (about 2 GB) that is trainable for language modelling using this repo. Could anyone help me train on this dataset? If you can help, you have my permission to open-source or use the trained model.

I hope this will lead to more models being made available and make the NLP community more interested in the latest NLP models, especially for Indonesian.

You can email me directly on khaerulumam42@gmail.com or comment below.

Thank you very much

oliverguhr commented 5 years ago

You can train a TensorFlow model using Google Colab for free. After training it, you can convert your TF model to PyTorch.
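
As a rough sketch of the conversion step, you can load an original TF BERT checkpoint with `from_tf=True` and re-save it as a PyTorch model (the file names below are placeholders; use whatever your TF training produced):

```python
# Minimal sketch: load an original TF BERT checkpoint into PyTorch.
# "bert_config.json" and "model.ckpt.index" are assumed file names.
from transformers import BertConfig, BertForPreTraining

config = BertConfig.from_json_file("bert_config.json")  # config used in TF training
model = BertForPreTraining.from_pretrained(
    "model.ckpt.index",  # TF checkpoint index file
    from_tf=True,
    config=config,
)
model.save_pretrained("./pytorch_model")  # writes pytorch_model.bin + config.json
```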

Oxi84 commented 5 years ago

Or use the $300 credit for Google Cloud that you get when you sign up, I believe.

khaerulumam42 commented 5 years ago

Thank you @oliverguhr and @Oxi84 for suggestions.

I have tried both methods. Using Google Colab with a GPU runtime, it took about 240 hours per epoch (maybe it would be faster with apex, but I think it would still take hundreds of hours), and I think it's impossible to run Google Colab for dozens of days.

I got the free trial for GCP, but unfortunately Google does not provide GPUs on the free trial. I tried training on GCP with 2 CPUs and 13 GB of RAM, and it would take about 200 thousand hours of training, which is far too long.

Maybe I should reduce the corpus size?

Thanks

Oxi84 commented 5 years ago

Or use a smaller vocabulary. I am pretty sure you can even use a TPU on Google Cloud; let someone else confirm that.
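
If you go the smaller-vocabulary route, one possible sketch (sentencepiece is an assumption here, not something this repo requires; BERT's reference implementation uses WordPiece) is to train a subword tokenizer with a reduced `vocab_size`:

```python
# Sketch: train a smaller subword vocabulary with sentencepiece.
# vocab_size=8000 is an illustrative choice; BERT-base uses ~30k.
# "corpus_id.txt" is an assumed path to the Indonesian corpus.
import sentencepiece as spm

spm.SentencePieceTrainer.Train(
    "--input=corpus_id.txt --model_prefix=id_bert "
    "--vocab_size=8000 --character_coverage=0.9995"
)
```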

oliverguhr commented 5 years ago

@Oxi84 For my classification task, I noticed that training the model with just 40 MB of data already gives pretty good results. Training with the full 1.5 GB of my dataset improves the results by just 2-3% accuracy. So you might start with a (random) subset of your data, increase its size step by step, and see if your scores improve.
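
For example, a quick way to draw a random subset of a large line-based corpus without loading it all into memory (file names and the 2% rate are placeholders):

```python
# Sketch: keep ~2% of lines from a large corpus via random sampling.
import random

random.seed(42)  # fixed seed for a reproducible subset
with open("corpus_id.txt", encoding="utf-8") as src, \
     open("corpus_id_2pct.txt", "w", encoding="utf-8") as dst:
    for line in src:
        if random.random() < 0.02:  # keep each line with 2% probability
            dst.write(line)
```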

khaerulumam42 commented 5 years ago

Oh, nice insight @oliverguhr, thank you. I will try reducing the training data and train on that.