saraghsm / sound-of-failure


Saving best VAE model and reloading for optimization #6

Open wrijupan opened 3 years ago

wrijupan commented 3 years ago

The ngdlm model for the Variational Autoencoder currently uses a custom-defined loss function (see vae_loss here).

If ModelCheckpoint is enabled in the training script, it saves the best VAE model, which was compiled using the custom-defined loss function mentioned above.

Now if the function load_saved_model is called from this script and the loaded model needs to be compiled again for further optimization, it throws ValueError: 'Unknown loss function'. The workaround is to set compile=False: loaded_model = tf.keras.models.load_model(model_path, compile=False).
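Keras raises this error because the custom loss is not registered when the model is deserialized. A minimal sketch of the workaround plus a manual recompile, assuming vae_loss can be imported from the training code and model_path points to the saved checkpoint:

```python
import tensorflow as tf

# Load the architecture and weights only; skipping the compile step avoids
# the 'Unknown loss function' ValueError raised during deserialization.
loaded_model = tf.keras.models.load_model(model_path, compile=False)

# Recompile manually with the same custom loss so training can continue.
loaded_model.compile(optimizer="adam", loss=vae_loss)
```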

At present, the saved VAE model should only be used for inference, i.e. prediction (for an example of how to train a VAE, see this notebook). Loading it for further optimization currently does not work. Did you encounter this before? There is a possible workaround that is not yet implemented in this work (see this).
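One common alternative (possibly the workaround referenced above, though I'm not certain that is what the link describes) is to register the custom loss at load time via custom_objects, which lets Keras deserialize the model fully compiled. A sketch, assuming the loss was compiled under the name vae_loss:

```python
import tensorflow as tf

# Map the name stored in the checkpoint to the actual function so Keras
# can restore the compiled state, optimizer included.
loaded_model = tf.keras.models.load_model(
    model_path, custom_objects={"vae_loss": vae_loss}
)
```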

saraghsm commented 3 years ago

Hi Wriju,

I'm not familiar with this subject. If you want to further optimise a model, say by varying one of its hyperparameters, then you would need to train the model again, so I don't understand why you would want to load your already saved model. What's the benefit of that?

wrijupan commented 3 years ago

It is not always necessary. But sometimes one might want to freeze some layers and retrain the others. For us it might not be required, but the custom-defined loss function currently rules out even that possibility, which is slightly annoying.
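For illustration, a minimal sketch of that freeze-and-retrain workflow, building on the compile=False workaround above; vae_loss, model_path, x_train, and the "encoder" layer-naming convention are all assumptions, not something taken from this repo:

```python
import tensorflow as tf

# Load architecture and weights without the compiled state.
loaded_model = tf.keras.models.load_model(model_path, compile=False)

# Freeze the encoder layers; only the remaining layers will be updated.
for layer in loaded_model.layers:
    if layer.name.startswith("encoder"):  # assumed layer-naming convention
        layer.trainable = False

# Recompile with the custom loss and continue training.
loaded_model.compile(optimizer="adam", loss=vae_loss)
loaded_model.fit(x_train, x_train, epochs=10)  # x_train: placeholder for the training data
```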

NikoHobel commented 3 years ago

Now this looks like a pretty good solution to me. It's from the reference you provided. In the forum they call it ugly, but I don't know what they mean...

[screenshot: the workaround code from the linked reference]
NikoHobel commented 3 years ago

I also find it quite important that the stored models are flexible: they can be trained further (e.g. if Colab crashed mid-run), used for other purposes, etc.