appleleaves opened this issue 4 years ago
From a traditional optimization perspective, you might want to terminate the optimization once the objective (the energy) drops below some epsilon.
But in many cases this is impractical for DNNs, because the loss is not generally expected to reach zero. The most common approach is to use a validation dataset to evaluate training progress, as in the sketch below.
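For illustration, a minimal validation-based early-stopping loop might look like the following. This is a sketch only; `model`, `train_loader`, `val_loader`, `loss_fn`, and the `patience` value are placeholder assumptions, not names from this repository or the paper.

```python
import torch

# Rough sketch of early stopping on a validation set; all argument
# names here are illustrative placeholders.
def train_with_early_stopping(model, train_loader, val_loader,
                              optimizer, loss_fn,
                              max_epochs=100, patience=10):
    best_val = float("inf")
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        # Evaluate on held-out data after each epoch.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item()
                           for x, y in val_loader) / len(val_loader)
        # Stop once validation loss has plateaued for `patience` epochs.
        if val_loss < best_val:
            best_val = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return model
```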
We use a standard training strategy: train the network for a fixed number of iterations with a learning rate that decays to zero.
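A minimal sketch of that schedule, assuming PyTorch with a linear decay (the `model`, `compute_loss`, and iteration count are hypothetical, not taken from this repository):

```python
import torch

def train_fixed_iterations(model, compute_loss, total_iters=100_000, lr=1e-3):
    """Train for a fixed number of iterations while linearly annealing
    the learning rate from `lr` down to zero. `model` and `compute_loss`
    are illustrative placeholders."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer,
        lr_lambda=lambda step: max(0.0, 1.0 - step / total_iters))
    for _ in range(total_iters):
        optimizer.zero_grad()
        loss = compute_loss(model)   # task-specific loss; may be negative
        loss.backward()
        optimizer.step()
        scheduler.step()             # lr reaches 0 at the final iteration
    return model
```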
After reading your paper, I understand that the loss can be lower than 0. But given this property, how do I know when the training process has come to an end? Or how do I know that the trained model has a "small" loss?