Reduce the Options Exposed to the User at the Minutes Level

chadlagore commented 6 years ago

The following configuration,

model = Minutes(model='cnn', ms_per_observation=1000)

is invalid because the underlying cnn.h5 has ms_per_observation=3000.

TODO: When a user specifies model = Minutes(model='cnn'), we need to inherit the existing model's configuration (not just the .h5 model) and reduce the options available to them at the Minutes level (i.e. ms_per_observation is no longer choosable). A couple of ideas on how to do this:

Create a folder for each model. Accompany each .h5 file with a config file that contains the parameters of the BaseModel used to generate the original model.
Pickle the BaseModel (say cnn.pkl) associated with the original model and load that in when someone executes Minutes(model='cnn').
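
A rough sketch of the first idea, to make the intended behaviour concrete. The file name config.json, the helper load_parent_config, and the model_dir argument are assumptions made for illustration; the Minutes class here is a stand-in that shows only the configuration inheritance, not the real constructor.

```python
import json
from pathlib import Path


def load_parent_config(model_dir):
    """Read the parameters stored next to the .h5 file (hypothetical config.json)."""
    with open(Path(model_dir) / "config.json") as f:
        return json.load(f)


class Minutes:
    """Stand-in for the real class; only the config inheritance is shown."""

    def __init__(self, model_dir, **overrides):
        parent = load_parent_config(model_dir)
        # Options already fixed by the parent model may not be changed here.
        clashes = sorted(
            key for key, value in overrides.items()
            if key in parent and parent[key] != value
        )
        if clashes:
            raise ValueError(f"fixed by the parent model, cannot override: {clashes}")
        # Inherit the value the parent model was trained with.
        self.ms_per_observation = parent["ms_per_observation"]
```

With a parent config of ms_per_observation=3000, passing ms_per_observation=1000 would raise a ValueError instead of silently building an invalid model.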

chadlagore commented 6 years ago

After some investigation, I'm leaning towards pickling. The workflow is much simpler and it lets us store the Speakers and things too.

chadlagore commented 6 years ago

Downside to pickling: audio data is pickled along with everything else. Pickled BaseModels might get very large and would consume valuable memory when read back in as the parent of a Minutes model.
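
To illustrate the size concern: by default pickle serializes every instance attribute, so any audio held on the object ends up in the pickle. The BaseModel below is a stand-in with a hypothetical audio attribute, not the real class.

```python
import pickle


class BaseModel:
    """Stand-in for the real class: a small config plus the audio it holds."""

    def __init__(self, ms_per_observation, audio):
        self.ms_per_observation = ms_per_observation
        self.audio = audio  # raw audio, potentially very large


# One megabyte of fake audio; real training data would be far larger.
model = BaseModel(ms_per_observation=3000, audio=bytes(1_000_000))

# Pickling the object drags the audio along with the configuration.
full = pickle.dumps(model)

# Pickling only the configuration stays tiny.
config_only = pickle.dumps({"ms_per_observation": model.ms_per_observation})

print(len(full), len(config_only))
```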

chadlagore commented 6 years ago

According to this,

It is not recommended to use pickle or cPickle to save a Keras model.

There are ways around it, but I think I'll avoid the whole mess. Tentative plan:

Establish the local models directory via environment variable (MINUTES_MODELS_DIRECTORY).

Create a file tree like this (within the MINUTES_MODELS_DIRECTORY):

.
├── base
│   └── cnn
│       ├── keras.h5
│       └── __dict__
└── transfer
    └── cnn-youtube-Gs26bZTRkdU
        ├── keras.h5
        └── __dict__

BaseModel.save() will save the keras model and the self.__dict__ to the folder MINUTES_MODELS_DIRECTORY/base/<name>/.
Minutes.save() will save the keras model and the self.__dict__ to the folder MINUTES_MODELS_DIRECTORY/transfer/<name>/.

This allows for a separation between the base and transfer models, gives the user a configurable models directory, and avoids pickling Keras models or large audio chunks to disk.
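
A minimal sketch of how the save path could look under this plan. The helper names (models_root, save_model, load_attributes) are hypothetical, and storing __dict__ as JSON is an assumption, since the on-disk format has not been decided here. The Keras step is passed in as a callable (for example a Keras model's save method) so the sketch does not depend on Keras itself.

```python
import json
import os
from pathlib import Path


def models_root():
    """Resolve the models directory from MINUTES_MODELS_DIRECTORY."""
    return Path(os.environ["MINUTES_MODELS_DIRECTORY"])


def save_model(kind, name, attributes, save_keras):
    """Write keras.h5 and __dict__ under <root>/<kind>/<name>/.

    kind       -- "base" or "transfer"
    attributes -- the instance attributes to keep, minus the Keras model and audio
    save_keras -- callable that takes a file path and writes the Keras model there
    """
    if kind not in ("base", "transfer"):
        raise ValueError(f"unknown model kind: {kind!r}")
    target = models_root() / kind / name
    target.mkdir(parents=True, exist_ok=True)
    save_keras(str(target / "keras.h5"))
    # Assumption: __dict__ is written as JSON; the format is not fixed yet.
    with open(target / "__dict__", "w") as f:
        json.dump(attributes, f)
    return target


def load_attributes(kind, name):
    """Read back the attributes written by save_model."""
    with open(models_root() / kind / name / "__dict__") as f:
        return json.load(f)
```

BaseModel.save() would then call save_model("base", ...) and Minutes.save() would call save_model("transfer", ...), each passing only the attributes that are small and JSON-serializable.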

ubclaunchpad / minutes

Reduce the Options Exposed to the User at the Minutes Level #118