ubclaunchpad / minutes

:telescope: Speaker diarization via transfer learning
https://medium.com/ubc-launch-pad-software-engineering-blog/speaker-diarisation-using-transfer-learning-47ca1a1226f4

Reduce the Options Exposed to the User at the Minutes Level #118

Closed chadlagore closed 6 years ago

chadlagore commented 6 years ago

The following configuration,

model = Minutes(model='cnn', ms_per_observation=1000)

is invalid because the underlying cnn.h5 has ms_per_observation=3000.

TODO: When a user specifies model = Minutes(model='cnn'), we need to inherit the existing model configuration (not just the .h5 model) and reduce the options available at the Minutes level (i.e. ms_per_observation should no longer be configurable). A couple of ideas on how to do this:

  1. Create a folder for each model. Accompany each .h5 file with a config file that contains the parameters of the BaseModel used to generate the original model.
  2. Pickle the BaseModel (say cnn.pkl) associated with the original model and load it when someone executes Minutes(model='cnn').
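
A rough sketch of idea 1, to make the shape concrete. Everything here is an assumption for illustration: the config.json file name, the folder layout, the models_dir argument, and the helper names are hypothetical, and the Minutes class below is a stand-in rather than the real one. The point is only that ms_per_observation is read from the stored config and is no longer a constructor argument.

```python
import json
import tempfile
from pathlib import Path


def save_base_config(models_dir, name, **params):
    """Write the BaseModel parameters next to where <name>.h5 would live.

    Hypothetical layout: <models_dir>/<name>/config.json
    """
    model_dir = Path(models_dir) / name
    model_dir.mkdir(parents=True, exist_ok=True)
    config_path = model_dir / "config.json"
    config_path.write_text(json.dumps(params, indent=2))
    return config_path


def load_base_config(models_dir, name):
    """Read back the parameters stored alongside the base model."""
    config_path = Path(models_dir) / name / "config.json"
    return json.loads(config_path.read_text())


class Minutes:
    """Stand-in for the real class; shows only the config inheritance."""

    def __init__(self, model, models_dir):
        # No ms_per_observation argument: it comes from the base model.
        self.model_name = model
        self.base_config = load_base_config(models_dir, model)

    @property
    def ms_per_observation(self):
        return self.base_config["ms_per_observation"]


models_dir = tempfile.mkdtemp()
save_base_config(models_dir, "cnn", ms_per_observation=3000)

minutes = Minutes(model="cnn", models_dir=models_dir)
print(minutes.ms_per_observation)  # 3000
```

With this shape, Minutes(model='cnn', ms_per_observation=1000) fails with a TypeError instead of silently building a model that disagrees with cnn.h5.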
chadlagore commented 6 years ago

After some investigation, I'm leaning towards pickling. The workflow is much simpler, and it lets us store the Speakers and other associated objects too.

chadlagore commented 6 years ago

Downside to pickling: the audio data is pickled along with everything else. Pickled BaseModels might get very large and would consume a lot of memory when read back in as the parent of a Minutes model.
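
To illustrate the size concern, here is a small sketch using stand-in classes (not the real BaseModel; the audio attribute and both class names are hypothetical). It compares the pickle size of an object that carries its audio with one that drops the audio via `__getstate__`, which is one standard-library way to keep bulky attributes out of a pickle.

```python
import pickle


class NaiveBaseModel:
    """Stand-in: default pickling stores every attribute, audio included."""

    def __init__(self, ms_per_observation, audio=b""):
        self.ms_per_observation = ms_per_observation
        self.audio = audio  # raw samples held in memory


class SlimBaseModel(NaiveBaseModel):
    """Same object, but the audio is dropped from the pickled state."""

    def __getstate__(self):
        # Copy the instance dict so the live object keeps its audio.
        state = self.__dict__.copy()
        state["audio"] = b""
        return state


audio = bytes(1_000_000)  # one megabyte of placeholder "audio"

naive_size = len(pickle.dumps(NaiveBaseModel(3000, audio)))
slim_size = len(pickle.dumps(SlimBaseModel(3000, audio)))
restored = pickle.loads(pickle.dumps(SlimBaseModel(3000, audio)))

print(naive_size > 1_000_000, slim_size < 1_000)
print(restored.ms_per_observation, len(restored.audio))
```

This only addresses the audio; it does nothing for the separate question of whether the model object itself should be pickled.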

chadlagore commented 6 years ago

According to the Keras FAQ,

It is not recommended to use pickle or cPickle to save a Keras model.

There are ways around it, but I think I'll avoid the whole mess. Tentative plan:

This allows for a separation between the base and transfer models and a user-specified Minutes directory, and it avoids pickling Keras models or large audio chunks to disk.