Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Is there any strategy to free GPU memory after loading (restoring) the GPT-2 parameters into our model? I am getting an OOM error at my current batch size, and I wonder if there is any way to get around that when using massive pre-trained models.