Closed — mariosconsta closed this issue 1 year ago
Hello, that is a normal loss value for the train_baseline.py
script, because the loss does not have any kind of normalization.
In my own experiments, I normalize by the number of heads and get losses like the ones below for the SHA dataset.
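A minimal sketch of the normalization described above: divide the density-map MSE by the ground-truth head count. The function name and the use of flat lists in place of density maps are illustrative assumptions, not code from this repository.

```python
def head_normalized_mse(pred, target):
    """Hypothetical sketch: mean squared error between predicted and
    ground-truth density values, divided by the number of heads
    (the sum of the ground-truth map)."""
    # plain MSE over the flattened density maps
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    # head count = integral of the ground-truth density map;
    # clamp at 1 to avoid dividing by zero on empty images
    heads = max(sum(target), 1.0)
    return mse / heads

# example: two pixels, one head at each location in the ground truth
print(head_normalized_mse([2.0, 0.0], [1.0, 1.0]))  # MSE = 1.0, 2 heads -> 0.5
```

This keeps the loss roughly independent of how crowded an image is, which is why the raw un-normalized values can look as large as ~10,000.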
Yes, I've read this in another issue. I will normalize it as well and see the output, thank you!
I get these values for the loss while training the model from scratch on the JHU dataset. It started at ~10,000 and is dropping slowly, but I feel that this number is not right. In the code, train_baseline uses plain MSE as the loss, not the one proposed in the paper. Shouldn't this loss be in the range of 0 to 1?