Hello, we are trying to reproduce your setup and would like to know the final value of your training loss. I see you have two losses, one using cross entropy and one using MSE. It would help us a lot if we could see the training curve for the cross-entropy loss. Failing that, could you tell us the cross-entropy value at the end of training, and after 10k gradient updates?