Hi there.
I'm running the training on Colab (with 16 GB of GPU memory).
I'm using concat.none.jsonnet for BERT and getting a "CUDA out of memory" error at 54% of epoch 0.
I would like to know how much memory is needed to run the BERT-based training, or whether there is a way to make it work with 16 GB.
Thanks