DigitalPhonetics / IMS-Toucan

Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
Apache License 2.0

How is the dataset split during training? #128

Closed · Ca-ressemble-a-du-fake closed this 1 year ago

Ca-ressemble-a-du-fake commented 1 year ago

Hi,

Every time I train a model I look at the progress and wonder how the number of steps per epoch is computed.

For example, my dataset has 98 datapoints and I train with a batch size of 32, so there should be 4 steps per epoch (3 full batches + 1 batch with the remaining 2 datapoints). But Toucan displays 3, and more generally it always shows one step fewer than what I compute.

Why is that, or where am I mistaken?

Thanks in advance for your explanation :smiley:

Flux9665 commented 1 year ago

The dataloader drops the last batch if it is not full. So it serves only the 3 full batches, then shuffles the data again and starts over, never serving the batch with just 2 elements. This matters because many components in a model are sensitive to the batch size, such as BatchNorm layers, so all batches should contain the same number of samples.
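
If you want to see this behaviour in isolation, here is a minimal sketch with a plain PyTorch DataLoader on toy tensors (not Toucan's own dataset class, so the exact flag is an assumption about the mechanism described above): with `drop_last=True` the loader length is floor(98 / 32) = 3, whereas the default `drop_last=False` would give ceil(98 / 32) = 4.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# hypothetical toy dataset with 98 datapoints, mirroring the numbers in the question
dataset = TensorDataset(torch.randn(98, 10))

# drop_last=True discards the final incomplete batch (the 2 leftover samples)
loader = DataLoader(dataset, batch_size=32, shuffle=True, drop_last=True)
print(len(loader))       # 3  -> floor(98 / 32), matches what Toucan displays

# with drop_last=False (the default) the partial batch is kept
loader_keep = DataLoader(dataset, batch_size=32, shuffle=True, drop_last=False)
print(len(loader_keep))  # 4  -> ceil(98 / 32), the number computed in the question
```
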

Ca-ressemble-a-du-fake commented 1 year ago

Ok got it, thanks for this explanation !