clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition
MIT License

Different batch sizes will yield different performance results. #130

Closed rezimitpo closed 2 years ago

rezimitpo commented 2 years ago

Different batch sizes in the inference stage lead to different results.

When I traced it, I saw that the values change in a particular convolution.

Perhaps you wrote the code that way because the results come out differently.

Is there a way to solve this?

It doesn't seem to be a problem with your code; it looks like something is wrong inside PyTorch (or cuDNN).
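For context, here is a minimal sketch (not code from this repository; the layer sizes and tensor shapes are made up) showing how the same input can produce slightly different convolution outputs depending on the batch it is processed in, and how to request deterministic cuDNN algorithms in PyTorch:

```python
# Minimal sketch (illustrative only, not repo code): compare a Conv1d output for
# the same "utterance" processed alone vs. inside a larger batch. Small differences
# can appear because cuDNN may pick a different convolution algorithm per batch shape.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Ask PyTorch/cuDNN for deterministic algorithm selection (no auto-tuning).
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

device = "cuda" if torch.cuda.is_available() else "cpu"
conv = nn.Conv1d(40, 64, kernel_size=3, padding=1).to(device).eval()

x = torch.randn(8, 40, 200, device=device)      # batch of 8 "utterances"

with torch.no_grad():
    out_batched = conv(x)[0]                    # first item, run inside the batch
    out_single = conv(x[:1])[0]                 # same item, run with batch size 1

print((out_batched - out_single).abs().max())   # often non-zero at float32 precision
```

Note that even with deterministic algorithms enabled, outputs for different batch shapes are not guaranteed to be bit-identical, since the backend may still choose a different kernel per input shape; the differences are typically at the level of float32 rounding error.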

ZhaZhaFon commented 2 years ago

> Different batch sizes in the inference stage lead to different results.

Batch size does affect the results; it normally causes a small fluctuation. As far as I know, this happens with most deep learning models.

Jungjee commented 2 years ago

There are ways to simulate large batch sizes (e.g., gradient accumulation). However, there is currently no plan to support this feature.
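For anyone who wants to try it themselves, here is a minimal sketch of gradient accumulation in a generic PyTorch training loop (the model, loss, and data below are stand-ins, not identifiers from this repository):

```python
# Minimal sketch of gradient accumulation (generic PyTorch, not repo code).
# An effective batch of accum_steps * small_batch is simulated by accumulating
# gradients over several small batches before each optimizer step.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(40, 10)                       # stand-in for a speaker model
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

small_batch, accum_steps = 32, 4                # effective batch size = 128

# Synthetic "loader": a few random mini-batches.
loader = [(torch.randn(small_batch, 40), torch.randint(0, 10, (small_batch,)))
          for _ in range(8)]

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(loader):
    loss = loss_fn(model(inputs), labels) / accum_steps  # scale so the summed gradient
    loss.backward()                                      # approximates one large-batch step
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

This only approximates a large batch: batch-norm statistics are still computed per small batch, and metric-learning losses that compare samples within a batch only see the small batch, so the effect is not identical to genuinely increasing the batch size.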