PPPLDeepLearning / plasma-python

PPPL deep learning disruption prediction package
http://tigress-web.princeton.edu/~alexeys/docs-web/html/

Saving trained models and their metadata for inference and reproducibility #41

Open felker opened 4 years ago

felker commented 4 years ago

Following the discussion at the FRNN group meeting in San Diego on Wednesday 2019-12-04, we need to start systematically saving the best trained models for:

  1. Collaboration (no need for multiple users to waste GPU hours retraining the same models)
  2. Practical inference (@mdboyer wants a Python interface, derived from performance_analysis.py, that would let a user load a trained model and feed it a set of shots for inference, without the bloated shot-list and preprocessing pipeline that has been oriented towards training during the first phase of the project. This would enable exploratory studies of proximity to disruption, UQ, clustering, etc., and is an important intermediate step towards setting up the C-based real-time inference tool in the PCS.)
  3. Reproducibility
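To make point 2 concrete, a thin inference wrapper might look like the sketch below. Everything here is hypothetical (the `FRNNPredictor` name, the constructor arguments, the file layout); the trained Keras model is stood in for by any callable, e.g. `keras_model.predict`:

```python
import numpy as np

class FRNNPredictor:
    """Hypothetical thin wrapper: hold the normalization statistics and a
    trained model, then run inference on raw (unnormalized) shot signals."""

    def __init__(self, predict_fn, means, stds, signal_names):
        self.predict_fn = predict_fn          # e.g. keras_model.predict
        self.means = np.asarray(means)
        self.stds = np.asarray(stds)
        self.signal_names = list(signal_names)

    def predict_shot(self, raw_signals):
        # raw_signals: (timesteps, n_signals) array of unnormalized data
        x = (np.asarray(raw_signals) - self.means) / self.stds
        return self.predict_fn(x[np.newaxis, ...])  # add a batch dimension

# Usage with a dummy "model" (identity on the normalized input) and
# made-up signal names:
pred = FRNNPredictor(lambda x: x, means=[1.0, 2.0], stds=[2.0, 4.0],
                     signal_names=["q95", "li"])
out = pred.predict_shot(np.array([[3.0, 6.0], [5.0, 10.0]]))
```

The point of the sketch is that loading a saved model plus its normalization statistics is all a user should need, with no dependence on the training-time shot list or preprocessing pipeline.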

As part of a broader effort to improve the reproducibility of our workflow, these models should be stored together with their associated artifacts: the trained weights, the normalization parameters, the input signal names, and plain-text metadata about each trained model.

Given the binary .h5 and .npz files, we probably don't want to use VCS to store everything, but we might want to version control the plain-text metadata about the trained models. Should it live in this repository alongside the code, or in a new repository under our GitHub organization?

Also, should we consider ONNX?

mdboyer commented 4 years ago

Initially maybe archive both ONNX and h5 since we may use either for PCS deployment.
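For reference, one common route from a Keras .h5 file to ONNX is the `tf2onnx` converter (this assumes a TensorFlow-backed Keras model; the filenames are placeholders):

```shell
pip install tf2onnx
python -m tf2onnx.convert --keras trained_model.h5 --output trained_model.onnx
```

Archiving the converted .onnx next to the original .h5 would keep both deployment options open.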

I'd advocate saving normalization as txt/h5 instead of npz to facilitate reading by PCS.
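For instance, an existing .npz normalizer could be re-exported as whitespace-delimited text for easy parsing from C in the PCS. The field names (`means`, `stds`) and filenames here are hypothetical:

```python
import numpy as np

# Hypothetical normalization parameters as they might live in an .npz file.
np.savez("normalization.npz",
         means=np.array([1.0, 2.0]),
         stds=np.array([0.5, 4.0]))

# Re-export as plain text: one row per statistic, one column per signal.
stats = np.load("normalization.npz")
np.savetxt("normalization.txt", np.vstack([stats["means"], stats["stds"]]))

# Round-trip check that the text file preserves the values.
loaded = np.loadtxt("normalization.txt")
```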

Better yet, could the normalization just be added as a layer to the model post-training so it is saved in the ONNX/H5 file? This would make implementation of the inference even simpler since the unnormalized data could be used as input to the deployed model.
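If the first layer of the network is affine, the normalization doesn't even need to be a separate layer: it can be folded into the first layer's weights after training, so the exported model takes raw inputs with no runtime change at all. A numpy sketch of the algebra (toy weights, not from an actual FRNN model):

```python
import numpy as np

# Toy first-layer weights of a trained model that expects NORMALIZED input.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))     # (out_features, n_signals)
b = rng.standard_normal(4)
mean = np.array([1.0, -2.0, 0.5])   # per-signal normalization statistics
std = np.array([2.0, 0.5, 1.5])

def model_on_normalized(x_raw):
    """Original deployment path: normalize first, then apply the layer."""
    x = (x_raw - mean) / std
    return W @ x + b

# Fold the normalization into the layer: W' = W/std (per input column),
# b' = b - W' @ mean. Then W' @ x_raw + b' == W @ ((x_raw - mean)/std) + b.
W_folded = W / std
b_folded = b - W_folded @ mean

def model_on_raw(x_raw):
    """Deployed model: takes unnormalized data directly."""
    return W_folded @ x_raw + b_folded

x_raw = rng.standard_normal(3)
```

The same trick applies to the input weights of an LSTM or a 1-D convolution, since both are affine in the input; the saved ONNX/H5 file would then contain the folded weights and need no companion normalization file.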

Text files for signal names would also be easier to use in the PCS.

I would think having some example trained models in the main repo would be useful, but maybe a larger library of models could be maintained separately?