I see that we run some number of independent starts like this
But it seems to me this just chooses the last of the independent starts, not the best. Maybe I'm missing something in our clever analysis class, but when I look at the output of some models that used independent starts, I can confirm that the model with the best validation loss is not the one being chosen for full training.
I am about to run to a meeting, but the way this once worked was that a model was only saved to the output file if it was the best so far; that saved model was then loaded for the final training.
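For reference, the behavior described above (only overwrite the checkpoint when a start beats the current best, then load that checkpoint for final training) could be sketched like this. This is a minimal illustration, not our actual code: `train_one_start` and the `checkpoint` dict are hypothetical stand-ins for whatever the training loop and output file really are.

```python
def run_independent_starts(n_starts, train_one_start, checkpoint):
    """Run n_starts independent starts and keep only the best one.

    train_one_start(seed) is a hypothetical callable returning
    (model_state, val_loss); checkpoint stands in for the output file.
    """
    best_loss = float("inf")
    for seed in range(n_starts):
        state, val_loss = train_one_start(seed)
        # Overwrite the saved model only if this start is the best so far,
        # so the checkpoint always holds the best start, not the last one.
        if val_loss < best_loss:
            best_loss = val_loss
            checkpoint["state"] = state
            checkpoint["val_loss"] = val_loss
    return checkpoint


# Toy usage: three starts with known validation losses.
def toy_train(seed):
    losses = [0.9, 0.4, 0.7]
    return f"state-{seed}", losses[seed]

best = run_independent_starts(3, toy_train, {})
```

With this pattern, the final training stage loads whatever is in the checkpoint, which by construction is the best start rather than simply the most recent one.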