Closed github-actions[bot] closed 3 years ago
@tttthomasssss Just for your reference, I have described the possible causes here.
Error type: GPU running out of memory when loading training data
I reproduced the error by doing the following:
I created a virtual machine on Google Cloud with the following specs:
n1-highmem-2 (2 vCPUs, 13 GB memory)
with 1 x NVIDIA Tesla P4
to match our CI set up.
I installed the latest Rasa version (rasa==2.6.3).
I cloned the repository https://github.com/RasaHQ/training-data
I ran the command: rasa train nlu -u <dataset> -c <config>
and got the error:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[252,223,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
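For a sense of scale, the size of the tensor named in the error can be worked out from its shape. This is a rough sketch: it assumes `type float` means 32-bit floats (4 bytes per element), and `tensor_bytes` is a hypothetical helper name, not part of TensorFlow or Rasa.

```python
from math import prod

# Assumption: "type float" in the error message is float32, i.e. 4 bytes per element.
BYTES_PER_FLOAT32 = 4


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the memory needed for one dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


# Shape taken from the ResourceExhaustedError above.
size = tensor_bytes((252, 223, 256))
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

This single tensor is only about 55 MiB, so it is probably not the whole story on its own: the error names the allocation that failed, which suggests the GPU memory was already nearly full when this tensor was requested.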
NVIDIA Tesla K80, which has 12 GB of memory (1 x NVIDIA Tesla P4 only has 8 GB).
training.yml file. I took the longest utterance in training.yml and translated it from German to English. I noticed that the utterance was duplicated three times, creating one very long utterance of 1374 characters (probably too large to load into GPU memory). I shortened the utterance by deleting the duplication. I then reran the training command on the virtual machine in (1) and it completed with no errors. I also ran the test: rasa test nlu -u <dataset>, which also ran with no errors.
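The manual check described above (find the longest utterance, then see whether it is the same text repeated) could be sketched roughly as follows. This is only an illustration: it assumes the utterances have already been loaded into a list of strings, it only catches exact back-to-back repetition (optionally separated by a single space), and `longest_examples` and `smallest_repeating_unit` are hypothetical helper names, not Rasa functions.

```python
def longest_examples(examples, top_n=3):
    """Return the top_n longest utterances, longest first."""
    return sorted(examples, key=len, reverse=True)[:top_n]


def smallest_repeating_unit(text):
    """Return (unit, count) if text is one unit repeated >= 2 times, else None.

    Uses the classic periodicity trick: a string s is a repetition of a
    shorter unit exactly when s occurs inside s + s at an offset strictly
    between 0 and len(s).
    """
    stripped = text.strip()
    if not stripped:
        return None
    # Try the text as-is, then with a trailing space so that
    # "foo bar foo bar" (repetitions joined by spaces) is also detected.
    for candidate in (stripped, stripped + " "):
        period = (candidate + candidate).find(candidate, 1)
        if period < len(candidate):
            return candidate[:period].strip(), len(candidate) // period
    return None


# Made-up utterances for illustration only.
examples = [
    "hello",
    "where is my order where is my order where is my order",
    "cancel my subscription",
]
for utterance in longest_examples(examples, top_n=2):
    print(len(utterance), smallest_repeating_unit(utterance))
```

Running something like this over a dataset before training would flag suspiciously long utterances early, instead of only finding them after an out-of-memory failure on the GPU.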
After the test passed on Google Cloud, I ran it in our CI framework by submitting an empty pull request: https://github.com/RasaHQ/rasa/pull/8849 and the test passed.
~@samsucik assigned as reviewer.~ No pull request to review.
Original estimate: 4 effort points == full sprint. Actual time: one and a half days of work (excluding the time it took me to get set up on Google Cloud and read the CI framework documentation).
This PR is automatically created by the Scheduled Model Regression Test workflow. Check out the GitHub Actions run here.
---
Description of Problem:
Scheduled Model Regression Test failed.
Configuration: BERT + DIET(seq) + ResponseSelector(t2t)
Dataset: Private 2
Definition of done