zaidalyafeai opened this issue 4 years ago
For this task, Colab and Kaggle can only help with data preparation (maybe not even Kaggle, since its output is limited to 5 GB). Actual training needs more compute, and for an extended period: in the last Arabic BERT dataset iteration, the HDF5 files alone were around 21 GB. My suggestion is to check whether an individual or an entity/institution has access to compute (e.g. free credit on Google Cloud, or in-house/rented infrastructure that can be used during certain time periods), provided the output stays free to the community (with contribution acknowledgement granted).
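To make the Kaggle constraint concrete, here is a small back-of-the-envelope check. The 5 GB output limit and the ~21 GB HDF5 figure come from the discussion above; the shard-count arithmetic itself is just illustration.

```python
import math

# Figures from the thread: Kaggle caps notebook output at ~5 GB,
# while the last Arabic BERT data-prep run produced ~21 GB of HDF5 files.
KAGGLE_OUTPUT_LIMIT_GB = 5
DATASET_SIZE_GB = 21

def min_runs_needed(dataset_gb: float, limit_gb: float) -> int:
    """Minimum number of separate notebook runs (or output shards)
    needed to emit the full dataset under the per-run output cap."""
    return math.ceil(dataset_gb / limit_gb)

print(min_runs_needed(DATASET_SIZE_GB, KAGGLE_OUTPUT_LIMIT_GB))  # → 5
```

So even if Kaggle were used, the prepared data would have to be split across at least five runs, which is why dedicated compute makes more sense here.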
I got this from TensorFlow:
"We’re happy to invite you to use up to 5 on-demand Cloud TPU v3 devices, 5 on-demand Cloud TPU v2 devices, and 100 preemptible Cloud TPU v2 devices for free for 30 days."
I think this is enough.
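For anyone who takes up the offer, provisioning one of the granted TPUs looks roughly like the sketch below. The node name, zone, and TensorFlow version are placeholders, not values from the offer; check the project's actual quota zones before running anything.

```shell
# Sketch: create one of the 100 preemptible Cloud TPU v2 devices
# from the offer quoted above. All concrete values here are assumed.
gcloud compute tpus create bert-pretrain-tpu \
  --zone=us-central1-f \
  --accelerator-type=v2-8 \
  --version=2.3.0 \
  --preemptible

# Drop --preemptible and use --accelerator-type=v3-8 for one of the
# 5 on-demand v3 devices instead.
```

Preemptible nodes can be reclaimed at any time, so training code should checkpoint frequently enough to resume after a preemption.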
A collection of free TPU compute