In the redesign of crema, model components will be self-contained estimators.
## Package layout

```
crema/            # the package that you import
    RESOURCES/    # where model parameters and other pickles are located
    submodules... # where submodule and estimator code lives
training/         # where model development (training, eval) scripts live
tests/            # unit tests
docs/             # documentation
```
## Estimator API
- Each estimator will have its own `Pump` object, which handles the audio/model/jams interface.
- Each estimator will produce exactly one annotation object for one input waveform.
- Within each estimator, the corresponding model will be loaded from a pre-built resource file. These will be stored as Keras `h5` files.
- Each estimator will implement a `predict` method that maps audio to JAMS annotations. We can alias this to `__call__` for a streamlined interface.
- Each estimator may optionally implement a `transform` method that produces a dictionary of `{feature_name: features}`, where `features` is a numpy array.
- Each estimator will be independently versioned. This information may be best stored within the model `h5` resource. Each estimator will be responsible for constructing its own `annotation_metadata`.
- Estimators will be instantiated as singleton objects upon import, so that you can do the following kind of thing:
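A minimal sketch of what this interface could look like. The submodule, class name, and resource paths used here (`ChordEstimator`, `crema.chord`, `RESOURCES/chord/...`) are illustrative placeholders, not part of this design, and the model decoding step is elided:

```python
# Hypothetical estimator submodule (e.g. crema/chord.py) -- a sketch only.
import pickle

import jams
from keras.models import load_model
from pkg_resources import resource_filename


class ChordEstimator(object):
    '''Sketch of a self-contained estimator.'''

    version = '0.1.0'  # per-estimator version; may live inside the h5 resource instead

    def __init__(self):
        # Each estimator owns its Pump object (the audio/model/jams interface)
        pump_path = resource_filename(__name__, 'RESOURCES/chord/pump.pkl')
        with open(pump_path, 'rb') as fdesc:
            self.pump = pickle.load(fdesc)

        # The model is loaded from a pre-built Keras h5 resource
        model_path = resource_filename(__name__, 'RESOURCES/chord/model.h5')
        self.model = load_model(model_path)

    def transform(self, filename):
        '''Optional: map audio to a {feature_name: ndarray} dictionary.'''
        return self.pump.transform(filename)

    def predict(self, filename):
        '''Map an audio file to exactly one JAMS annotation.'''
        features = self.transform(filename)
        # ... run self.model on features and decode the output into observations ...
        ann = jams.Annotation(namespace='chord')
        ann.annotation_metadata = jams.AnnotationMetadata(version=self.version,
                                                          data_source='program')
        return ann

    # Streamlined interface: calling the estimator is the same as calling predict
    __call__ = predict


# Singleton instantiated at import time
chord = ChordEstimator()
```

Assuming the top-level package re-exports these singletons, usage could then be as simple as:

```python
import crema

annotation = crema.chord('my_song.ogg')  # equivalent to crema.chord.predict('my_song.ogg')
```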
## Analyzer API

For any packaged version of crema, we will have a manifest of the included models. These will all be instantiated upon import at the top-level module. This will allow us to have a top-level `analyze` function that can produce an entire JAMS object for an input track.

Metadata will be pulled from the track using `pytaglib`, and a warning will be issued if metadata cannot be found. When executed as a script and without a `-o` flag, it will serialize the JAMS to `stdout`.
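A sketch of the intended usage; the only names taken from this design are `analyze` and the `-o` flag, everything else is an assumption:

```python
import crema

# Run every packaged estimator on the track and collect the results
# into a single JAMS object.
jam = crema.analyze('my_song.mp3')
jam.save('my_song.jams')
```

The script behavior could look something like this, assuming an `argparse` front-end:

```python
import argparse
import sys

import crema


def main():
    parser = argparse.ArgumentParser(description='Analyze a track with crema')
    parser.add_argument('filename', help='path to the audio file')
    parser.add_argument('-o', '--output', default=None,
                        help='path for the output JAMS file (default: stdout)')
    args = parser.parse_args()

    jam = crema.analyze(args.filename)

    if args.output is not None:
        jam.save(args.output)
    else:
        # No -o flag: serialize the JAMS to stdout
        sys.stdout.write(jam.dumps())


if __name__ == '__main__':
    main()
```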
## Model development

For each estimator that requires a pre-trained model, the training scripts will be stored under `training/ESTIMATOR/` and have the following filename convention:
- `requirements.txt`: requirements file for training this model. This should only be used for helper modules to facilitate training (e.g., `muda` or `pescador`), and cannot be required for test-time prediction.
- `index_train.json`: a JSON file listing the (relative) paths to training data as `(audio, jams)` pairs. Must be parse-able into a `pandas` dataframe (see the sketch after this list).
- `index_test.json`: like above, but for testing data.
- `README.md`: description of the model architecture, parameters, training strategy, etc.
- `00-setup.py`: any necessary preliminary processing. This includes things like pre-computed data augmentation.
- `01-prepare.py`: pump construction, preliminary feature extraction
    - saves the `pump` as `resources/pump.pkl` (pickle)
- `02-train.py`: model construction and training
    - loads `index_train.json`, the working path, and the `pump` object
    - saves the resulting model as `resources/model.h5`
    - tracks the version number of the model
- `03-evaluate.py`: testing (see the sketch after this list)
    - loads the pre-built model and test data index
    - cannot rely upon pre-computed features: testing must be end-to-end
    - calls the appropriate `mir_eval` function on each estimate
    - stores the resulting score array as `resources/test_data.json`
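A sketch of how the index files and the evaluation step could fit together. The JSON schema, the column names, the chord namespace, and the `chord` estimator import below are illustrative assumptions; only the filenames and the use of `pandas` and `mir_eval` come from the convention above.

```python
# 03-evaluate.py (sketch): load the test index, run the estimator
# end-to-end on each track, and score the estimates with mir_eval.
import json

import jams
import mir_eval
import pandas as pd

from crema import chord  # hypothetical estimator singleton (see the Estimator API sketch)

# index_test.json (and index_train.json) list (audio, jams) pairs as
# relative paths, e.g.
#   [{"audio": "audio/track0001.ogg", "jams": "refs/track0001.jams"}, ...]
index = pd.read_json('index_test.json')

scores = []
for _, row in index.iterrows():
    # End-to-end prediction: no pre-computed features allowed here
    est_ann = chord.predict(row['audio'])
    ref_ann = jams.load(row['jams']).annotations[0]

    ref_intervals, ref_labels = ref_ann.to_interval_values()
    est_intervals, est_labels = est_ann.to_interval_values()

    # For a chord estimator, mir_eval.chord.evaluate is the appropriate scorer
    scores.append(mir_eval.chord.evaluate(ref_intervals, ref_labels,
                                          est_intervals, est_labels))

# Store the resulting score array alongside the model resources
with open('resources/test_data.json', 'w') as fdesc:
    json.dump([{key: float(value) for key, value in score.items()}
               for score in scores], fdesc, indent=2)
```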
Other conventions:
- All training scripts should seed all random number generators (see the sketch below).
- Cached features should be locatable from the training index file, either by name or row number. The learning curriculum is up to the training script, so whatever makes the most sense there is fair game.
- The above goes for data augmentation as well, using the `track_id.augment_id.ext` convention.
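For the seeding convention, a minimal sketch; the seed value is arbitrary, the point is that it is fixed and recorded:

```python
# Seed every random number generator used during training so runs are reproducible.
import random

import numpy as np

SEED = 20170101  # arbitrary, but fixed

random.seed(SEED)
np.random.seed(SEED)

# Any other stochastic components used by the script (e.g. the deep learning
# backend, or muda/pescador pipelines) should be seeded here as well.
```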