Is there any way to get the acc, loss etc for each epoch of the training?

autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras

https://autonom.io

MIT License

Is there any way to get the acc, loss etc for each epoch of the training? #328

Closed: JohanMollevik closed this issue 5 years ago

JohanMollevik commented 5 years ago

Is there any way to get the acc, loss etc for each epoch of the training?

I was searching for that feature in the documentation but could not find it.

My use case is that I was running a rather long Scan, and at the end I saw clear signs of over-training (acc much higher than val_acc). I had hoped to look at the per-epoch history of those two metrics to figure out where the over-training starts, so that I can re-run those permutations later.

Does this feature exist, and if so, how do I use it? If not, should it be added to the feature roadmap?

mikkokotila commented 5 years ago

Hey, good to see you around :)

You can get it through:

scan_object.round_history

In there you'll find the Keras history object for each processed permutation.
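For the over-training question above, here is a minimal sketch of how the histories could be inspected. It assumes each entry in `scan_object.round_history` is either a Keras `History` object or the plain dict of per-epoch lists that Keras stores in `History.history`. The helper names, the `max_gap` threshold and the sample numbers are made up for illustration. The metric keys may be `accuracy` / `val_accuracy` instead of `acc` / `val_acc`, depending on the Keras version.

```python
def history_dict(entry):
    """Return the per-epoch metrics dict for one round_history entry."""
    # Assumption: an entry is a Keras History object (with a .history
    # attribute) or already a plain dict of metric name -> list of values.
    return getattr(entry, "history", entry)


def first_overfit_epoch(entry, train_key="acc", val_key="val_acc", max_gap=0.05):
    """Return the first 1-based epoch where train - val exceeds max_gap, else None."""
    metrics = history_dict(entry)
    pairs = zip(metrics[train_key], metrics[val_key])
    for epoch, (train, val) in enumerate(pairs, start=1):
        if train - val > max_gap:
            return epoch
    return None


# Hypothetical stand-in for scan_object.round_history (two permutations).
round_history = [
    {"acc": [0.60, 0.75, 0.85, 0.95], "val_acc": [0.58, 0.72, 0.74, 0.73]},
    {"acc": [0.55, 0.65, 0.70, 0.74], "val_acc": [0.54, 0.63, 0.68, 0.71]},
]

# One result per permutation, in the same order as round_history.
overfit_epochs = [first_overfit_epoch(entry) for entry in round_history]
print(overfit_epochs)
```

With a real Scan you would pass `scan_object.round_history` in place of the sample list. A fixed gap is only one possible criterion; the epoch with the lowest `val_loss` is another common choice.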

renatobellotti commented 5 years ago

That's good to know.

Is there a persistent way to do this? I'm running some Keras models on a cluster. The job needs many cores in order to speed up training. However, I don't want to have my job running for days just to have access to all histories...

Can the histories be logged somewhere?

renatobellotti commented 5 years ago

Of course, one can always save the histories at the end, but anything the library handles itself makes its users more productive.
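Saving the histories at the end could look roughly like the sketch below. This is not a Talos feature. It uses only the standard library and assumes, as before, that each `round_history` entry is a Keras `History` object or a plain dict of per-epoch lists. The function names, the file name and the sample values are hypothetical.

```python
import json
import os
import tempfile


def save_histories(round_history, path):
    """Write all per-permutation histories to a JSON file and return them."""
    serializable = []
    for entry in round_history:
        # Accept a Keras History object or a plain dict (assumption).
        metrics = getattr(entry, "history", entry)
        # Cast to float so NumPy scalars become JSON-serializable.
        serializable.append(
            {name: [float(v) for v in values] for name, values in metrics.items()}
        )
    with open(path, "w") as handle:
        json.dump(serializable, handle, indent=2)
    return serializable


def load_histories(path):
    """Read the histories back as a list of dicts."""
    with open(path) as handle:
        return json.load(handle)


# Hypothetical stand-in for scan_object.round_history.
round_history = [
    {"loss": [0.9, 0.6, 0.4], "val_loss": [0.95, 0.7, 0.65]},
    {"loss": [1.1, 0.8, 0.7], "val_loss": [1.0, 0.85, 0.8]},
]

path = os.path.join(tempfile.gettempdir(), "round_history.json")
saved = save_histories(round_history, path)
restored = load_histories(path)
```

This only helps once the Scan has finished. To get logs while a long job is still running, one option is to add a Keras callback such as `CSVLogger` to the `fit` call inside the model function, with a distinct file name per permutation.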

mikkokotila commented 5 years ago

I have been thinking about building an interactive real-time monitoring dashboard with dash-plotly, and I have been experimenting with it with good success so far. Let's see if that happens soon, but it would definitely give you access to the history for all permutations during the experiment.

mikkokotila commented 5 years ago

Can you create a new issue for:

Can the histories be logged somewhere?

I'm closing here as it's resolved.