tensorflow / swift-models

Models and examples built with Swift for TensorFlow
Apache License 2.0

Add logging callbacks for training statistics #663

Open BradLarson opened 4 years ago

BradLarson commented 4 years ago

The training loop abstraction has a visual progress bar for tracking statistics during training. In many cases, it's desirable to log or store training statistics for later analysis, or to send those statistics to different services for visualization.

Further discussion is present in PR #662, but we'd like to have callbacks such as the following:

as well as a general mechanism for writing more.

A TrainingStatistics callback exists now as a means of collecting statistics, but it is only called directly by TrainingProgress. The idea is to have a single statistics-gathering function, which avoids the overhead of materializing statistics-related tensors multiple times, and to have it report those statistics to all interested parties. For this, we'd need one TrainingStatistics instance that acts as a kind of notification center for the other callbacks.

We'd want this implementation detail to be hidden from end users, so that they only need to specify which logging or visualization callbacks they want and don't have to manage the TrainingStatistics instance themselves.
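
To make the shape of this concrete, here is a minimal sketch of the notification-center idea in plain Swift. Every name in it (StatisticsSnapshot, StatisticsLogger, StatisticsHub, InMemoryLogger, LoggedTrainingRun) is hypothetical and chosen only for illustration; none of them are existing swift-models types, and the real TrainingStatistics and training loop callback signatures are not reproduced here. The closure passed to report stands in for the expensive step of materializing tensors into scalars.

```swift
// All names below are hypothetical illustrations, not the swift-models API.

/// Scalar statistics, materialized once per reporting step.
struct StatisticsSnapshot {
    let epoch: Int
    let step: Int
    let loss: Float
}

/// Anything that consumes statistics (a file writer, a dashboard client, ...).
protocol StatisticsLogger: AnyObject {
    func log(_ snapshot: StatisticsSnapshot)
}

/// Stand-in for the single shared statistics collector.
final class StatisticsHub {
    private let loggers: [StatisticsLogger]
    private(set) var materializationCount = 0

    init(loggers: [StatisticsLogger]) { self.loggers = loggers }

    /// Runs `materialize` at most once per call, then fans the result out
    /// to every registered logger.
    func report(_ materialize: () -> StatisticsSnapshot) {
        guard !loggers.isEmpty else { return }  // nobody listening: skip the work
        let snapshot = materialize()
        materializationCount += 1
        for logger in loggers { logger.log(snapshot) }
    }
}

/// A trivial logger that keeps snapshots in memory.
final class InMemoryLogger: StatisticsLogger {
    private(set) var snapshots: [StatisticsSnapshot] = []
    func log(_ snapshot: StatisticsSnapshot) { snapshots.append(snapshot) }
}

/// Facade: callers pass only the loggers they want; the hub is created
/// and owned internally, so they never manage it themselves.
final class LoggedTrainingRun {
    private let hub: StatisticsHub

    init(loggers: [StatisticsLogger]) { hub = StatisticsHub(loggers: loggers) }

    var materializationCount: Int { hub.materializationCount }

    /// Would be driven by the training loop at the end of each step.
    func finishStep(epoch: Int, step: Int, loss: Float) {
        hub.report { StatisticsSnapshot(epoch: epoch, step: step, loss: loss) }
    }
}

// Usage: two loggers, two steps, statistics materialized once per step.
let first = InMemoryLogger()
let second = InMemoryLogger()
let run = LoggedTrainingRun(loggers: [first, second])
run.finishStep(epoch: 1, step: 1, loss: 0.75)
run.finishStep(epoch: 1, step: 2, loss: 0.5)
```

In this sketch both loggers receive every snapshot, while the materialization closure runs once per step rather than once per logger, which is the overhead the single-collector design is meant to avoid.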

vkuznet commented 4 years ago

+1