Closed chadlagore closed 6 years ago
After some investigation, I'm leaning towards pickling. The workflow is much simpler and it lets us store the Speakers and things too.
Downside to pickling: audio data is pickled along with everything else. Pickled BaseModels might get very large, and would consume valuable memory when read back in as the parent of a Minutes model.
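To make the size concern concrete, here is a minimal sketch. FakeBaseModel and its audio attribute are hypothetical stand-ins, since the real BaseModel is not shown in this thread. It only illustrates that pickle serializes every attribute, so the output grows with the audio payload.

```python
import pickle


class FakeBaseModel:
    """Hypothetical stand-in; the real BaseModel is not shown here."""

    def __init__(self, audio):
        self.name = "cnn"
        self.audio = audio  # raw audio travels with the object


# Pickle the same object with and without an audio payload.
empty_size = len(pickle.dumps(FakeBaseModel(b"")))
loaded_size = len(pickle.dumps(FakeBaseModel(bytes(1_000_000))))

# pickle does not compress, so the difference is at least the payload size.
overhead = loaded_size - empty_size
```

Anything that later unpickles such an object also loads the audio back into memory, whether or not it needs it.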
According to this, "It is not recommended to use pickle or cPickle to save a Keras model."
There are ways around it, but I think I'll avoid the whole mess. Tentative plan:
- Users specify a models directory (MINUTES_MODELS_DIRECTORY).
- Models are laid out as follows (inside MINUTES_MODELS_DIRECTORY):
.
├── base
│   └── cnn
│       ├── keras.h5
│       └── __dict__
└── transfer
    └── cnn-youtube-Gs26bZTRkdU
        ├── keras.h5
        └── __dict__
- BaseModel.save() will save the keras model and the self.__dict__ to the folder MINUTES_MODELS_DIRECTORY/base/<name>/.
- Minutes.save() will save the keras model and the self.__dict__ to the folder MINUTES_MODELS_DIRECTORY/transfer/<name>/.
This allows for a separation between the base and transfer models and a user-specified Minutes directory, and avoids pickling Keras models or large audio chunks to disk.
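The save behaviour described above could be sketched roughly as follows. Several things here are assumptions, not part of the plan as written: the attribute names keras_model and audio, the fallback directory "models", the single helper save_model standing in for both save() methods, and the choice to pickle the filtered attribute dict into the __dict__ file. The only Keras call relied on is the model's own save(filepath).

```python
import os
import pickle
from pathlib import Path

# Assumption: attribute names holding the Keras model and the raw audio.
EXCLUDED_ATTRS = {"keras_model", "audio"}


def models_root():
    """Return the user-specified models directory (the default is an assumption)."""
    return Path(os.environ.get("MINUTES_MODELS_DIRECTORY", "models"))


def save_model(obj, kind, name):
    """Write keras.h5 and __dict__ under <root>/<kind>/<name>/.

    kind is "base" for a BaseModel or "transfer" for a Minutes model.
    """
    target = models_root() / kind / name
    target.mkdir(parents=True, exist_ok=True)

    # Keras writes its own HDF5 file, so the network itself is never pickled.
    obj.keras_model.save(str(target / "keras.h5"))

    # Persist the remaining attributes, minus the model and the audio.
    state = {k: v for k, v in vars(obj).items() if k not in EXCLUDED_ATTRS}
    with open(target / "__dict__", "wb") as f:
        pickle.dump(state, f)
    return target
```

Under this sketch, BaseModel.save() would call save_model(self, "base", name) and Minutes.save() would call save_model(self, "transfer", name), so the two kinds of model never share a folder.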
The following configuration,
is invalid because the underlying cnn.h5 has ms_per_observation=3000.
TODO: When a user specifies model = Minutes(model='cnn'), we need to inherit the existing model configuration (not just the .h5 model), and reduce the options available to them at the Minutes level (i.e. ms_per_observation is no longer choosable). A couple of ideas on how to do this:
- Ship the .h5 file with a config file that contains the parameters of the BaseModel used to generate the original model.
- Keep a pickled BaseModel (say cnn.pkl) associated with the original model and load that in when someone executes Minutes(model='cnn').
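The config-file idea might look something like the sketch below. The file name config.json, the helper names write_config and inherit_config, and the LOCKED_PARAMS tuple are all hypothetical; the thread only establishes that ms_per_observation should stop being choosable once a base model is picked.

```python
import json
from pathlib import Path

# Assumption: parameters fixed by the base model, not choosable at the Minutes level.
LOCKED_PARAMS = ("ms_per_observation",)


def write_config(model_dir, params):
    """Store the BaseModel parameters next to the .h5 file."""
    path = Path(model_dir) / "config.json"
    path.write_text(json.dumps(params))
    return path


def inherit_config(model_dir, **overrides):
    """Merge user options into the stored config, rejecting locked ones."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    for key, value in overrides.items():
        locked = key in LOCKED_PARAMS and key in config
        if locked and config[key] != value:
            raise ValueError(
                f"{key}={value!r} conflicts with the base model "
                f"({key}={config[key]!r})"
            )
        config[key] = value
    return config
```

With this shape, a Minutes constructor could call inherit_config on the base model's folder and fail early when a user passes a conflicting ms_per_observation, while a plain JSON file avoids reading a pickled BaseModel (and its audio) back into memory.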