Open VvVLzy opened 2 years ago
Could you show a minimally reproducible script with some dummy data? @VvVLzy
Just to clarify: are you asking for the script I used for training as well as the two sets of training data?
Yes, it would help in checking where the problem is. It does not happen on my machines.
Here are the scripts and dummy data for the slow and fast training (each in the corresponding folder). The slower training (without stress) uses monty to load/parse the data file (so the data file format is a bit different). However, the parsed data fed into the trainer is of the same format, so it should not affect training speed in that regard...
Both data files consist of 1000 training examples and 100 validation.
Thanks.
I have been using two datasets to train the model based on the pre-trained one. They are pretty similar in size, one without stress and the other with stress.
I notice that, using the same device configuration, the model trains much slower on the dataset without stress. It even runs out of memory after 2 epochs when using `batch_size=32`, so I have to decrease the batch size to 16 to continue training. The training speed for the dataset with stress is ~130 ms/step with a batch size of 32. The training speed for the dataset without stress is ~270 ms/step with a batch size of 16.
I wonder what might be causing this roughly 4x slowdown per training example?
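
For reference, here is a minimal sketch of where the factor of about 4 comes from. It assumes the ~270 ms/step figure is for the dataset without stress, and that the fair comparison is time per training example (step time divided by batch size). The dictionary and function names are only illustrative and are not part of the training script.

```python
# Step timings reported in this issue (ms/step) and the batch sizes used.
with_stress = {"ms_per_step": 130, "batch_size": 32}
without_stress = {"ms_per_step": 270, "batch_size": 16}


def ms_per_example(run):
    """Normalize step time by batch size so runs with different batch sizes compare fairly."""
    return run["ms_per_step"] / run["batch_size"]


fast = ms_per_example(with_stress)      # about 4.06 ms per example
slow = ms_per_example(without_stress)   # about 16.9 ms per example
slowdown = slow / fast                  # a little over 4

print(f"with stress:    {fast:.2f} ms/example")
print(f"without stress: {slow:.2f} ms/example")
print(f"slowdown:       {slowdown:.1f}x")
```

The step time only doubles, but each step processes half as many examples, which is why the per-example slowdown comes out at roughly 4x.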