diux-dev / cluster

train on AWS

ImageNet: spikes in training loss once per epoch #43

Closed: yaroslavvb closed this issue 6 years ago

yaroslavvb commented 6 years ago

This looks like a bug in the cross-entropy calculation caused by an uneven batch size: the spike in training loss happens once per epoch.

[Image: screenshot 2018-08-02 14 26 11]

https://github.com/diux-dev/cluster/blob/master/pytorch/recipes-yaroslav-va.ipynb
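To illustrate what "uneven batch size" can mean here, below is a minimal sketch and not the code from the notebook. It assumes the dataset size is not a multiple of the batch size, so the last batch of each epoch is smaller. The helper names (`batch_sizes`, `mean_of_batch_means`, `sample_weighted_mean`) and the toy numbers are hypothetical; the `drop_last` argument only mirrors the option of the same name on PyTorch's `DataLoader`. Whether this effect is what caused the spikes in this issue is not established.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of every batch in one epoch (hypothetical helper)."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # The leftover samples form a smaller final batch unless it is dropped.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


def mean_of_batch_means(batch_means):
    """Naive epoch loss: every batch counts equally, whatever its size."""
    return sum(batch_means) / len(batch_means)


def sample_weighted_mean(batch_means, sizes):
    """Epoch loss where each batch is weighted by its number of samples."""
    total = sum(mean * size for mean, size in zip(batch_means, sizes))
    return total / sum(sizes)


# Toy numbers (assumed for illustration): 10 samples, batch size 4.
sizes = batch_sizes(num_samples=10, batch_size=4)   # last batch has 2 samples
losses = [1.0, 1.0, 4.0]                            # made-up per-batch mean losses

naive = mean_of_batch_means(losses)
weighted = sample_weighted_mean(losses, sizes)
print(sizes, naive, weighted)
```

In this toy case the small final batch pulls the naive average noticeably higher than the sample-weighted one. Passing `drop_last=True` removes the short batch entirely, which is one common way to check whether the final batch is responsible for a once-per-epoch artifact.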

bearpelican commented 6 years ago

The spikes seem to have been resolved in TensorFlow. Closing until they reappear.