tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Models using a KLDivergenceRegularizer cannot be saved #1604

Open jsilter opened 2 years ago

jsilter commented 2 years ago

I've been working through the TFP VAE tutorial, and tried to save the model after training. However, this was not possible in SavedModel format.

Command:

model.save(saved_model_path, save_format="tf", save_traces=True)

Produces the following error:

  File "/Users/xxx/venvs/ml_venv/lib/python3.9/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1251, in __call__  *
    return self._kl_divergence_fn(distribution_a)
  File "/Users/xxx/venvs/ml_venv/lib/python3.9/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1370, in _fn  **
    kl = kl_divergence_fn(distribution_a, distribution_b_)
  File "/Users/xxx/venvs/ml_venv/lib/python3.9/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1354, in kl_divergence_fn
    distribution_a.log_prob(z) - distribution_b.log_prob(z),
AttributeError: 'Tensor' object has no attribute 'log_prob'

Command:

model.save(saved_model_path, save_format="tf", save_traces=False)

Produces the following error:

<tensorflow_probability.python.layers.distribution_layer.KLDivergenceRegularizer object at 0x16b79c5b0> does not implement get_config()

Indeed it does not: KLDivergenceRegularizer has no get_config implementation.
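For readers unfamiliar with the contract behind that message, here is a minimal plain-Python sketch of the get_config / from_config round trip that a config-based save relies on. Everything in it is a stand-in: the Regularizer base class only mimics the Keras convention, and OpaquePenalty, ScaledPenalty, and round_trip are hypothetical names, not TFP or Keras code.

```python
import json


class Regularizer:
    """Stand-in base class; mimics the contract, not the real Keras class."""

    def get_config(self):
        # Subclasses that do not override this cannot be serialized.
        raise NotImplementedError(f"{self} does not implement get_config()")

    @classmethod
    def from_config(cls, config):
        # Rebuild the object from the dict that get_config() returned.
        return cls(**config)


class OpaquePenalty(Regularizer):
    """Hypothetical regularizer without get_config (the failing case)."""

    def __init__(self, weight):
        self.weight = weight


class ScaledPenalty(Regularizer):
    """Hypothetical regularizer that can be saved and restored."""

    def __init__(self, weight):
        self.weight = weight

    def get_config(self):
        # Return only JSON-friendly constructor arguments.
        return {"weight": self.weight}


def round_trip(regularizer):
    """Serialize to JSON and rebuild, as a config-based save would."""
    payload = json.dumps(regularizer.get_config())
    return type(regularizer).from_config(json.loads(payload))


restored = round_trip(ScaledPenalty(weight=0.5))
print(restored.weight)

try:
    round_trip(OpaquePenalty(weight=0.5))
except NotImplementedError as err:
    print("cannot save:", err)
```

In the models in this thread the regularizer is constructed from a distribution object (the prior), which is not a JSON-friendly value, so a real implementation would presumably also need some way to describe that distribution in the config. That part is not sketched here.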

Sidenote: I cannot load from h5 because MultivariateNormalTriL requires a positional argument: TypeError: __init__() missing 1 required positional argument: 'event_size'

Versions: Python 3.9.13, tensorflow-probability==0.17.0, tensorflow==2.9.0

This is also discussed in #742 but I'm starting a new issue to make it more prominent.

johnsoltis commented 2 years ago

I am having the same issues, after (loosely) following the same tutorial.

Versions: Python 3.10.4, TensorFlow 2.9.1, TensorFlow Probability 0.17.0

The command it is failing on is: vae.save(file_name)

The error:

Traceback (most recent call last):
  File "/home/jsoltis2/VAE_LATENT3.py", line 279, in <module>
    vae.save(file_name)
  File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_filete4qkxqr.py", line 13, in tf____call__
    retval_ = ag__.converted_call(ag__.ld(self)._kl_divergence_fn, (ag__.ld(distribution_a),), None, fscope)
  File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1370, in _fn
    kl = kl_divergence_fn(distribution_a, distribution_b_)
  File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1354, in kl_divergence_fn
    distribution_a.log_prob(z) - distribution_b.log_prob(z),
AttributeError: in user code:

    File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1251, in __call__  *
        return self._kl_divergence_fn(distribution_a)
    File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1370, in _fn  **
        kl = kl_divergence_fn(distribution_a, distribution_b_)
    File "/home/jsoltis2/.conda/envs/tf_tfp/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py", line 1354, in kl_divergence_fn
        distribution_a.log_prob(z) - distribution_b.log_prob(z),

    AttributeError: 'Tensor' object has no attribute 'log_prob'

My full model and compilation:

prior = tfd.Independent(tfd.Normal(loc=tf.zeros(latent_dim), scale=1), reinterpreted_batch_ndims=1)

encoder = tfk.Sequential([
    tfkl.InputLayer(input_shape=(140, 140, 1)),
    Conv2D(16, kernel_size=(3, 3), activation=None),
    layers.LeakyReLU(),
    Conv2D(32, kernel_size=(3, 3), activation=None),
    layers.LeakyReLU(),
    Conv2D(64, kernel_size=(3, 3), activation=None),
    layers.LeakyReLU(),
    layers.Flatten(),
    tfkl.Dense(tfpl.MultivariateNormalTriL.params_size(latent_dim), activation=None),
    tfpl.MultivariateNormalTriL(latent_dim, activity_regularizer=tfpl.KLDivergenceRegularizer(prior)),
], name='encoder')

encoder.summary()

decoder = tfk.Sequential([
    tfkl.InputLayer(input_shape=(latent_dim,)),
    Dense(140*140*128, activation=None),
    layers.LeakyReLU(),
    layers.Reshape(target_shape=(140, 140, 128)),
    Conv2DTranspose(64, kernel_size=(3, 3), activation=None, padding='same'),
    layers.LeakyReLU(),
    Conv2DTranspose(32, kernel_size=(3, 3), activation=None, padding='same'),
    layers.LeakyReLU(),
    Conv2DTranspose(16, kernel_size=(3, 3), activation=None, padding='same'),
    layers.LeakyReLU(),
    Conv2DTranspose(1, kernel_size=(3, 3), activation=None, padding='same'),
], name='decoder')
decoder.summary()

vae = tfk.Model(inputs=encoder.inputs, outputs=decoder(encoder.outputs[0]))

vae.compile(optimizer=tf.optimizers.Adam(learning_rate=1e-2), loss='mae', metrics=['mse','mae'])

jsilter commented 2 years ago

It looks like the model can be saved with tf.compat.v1.keras.experimental.export_saved_model and loaded with tf.compat.v1.keras.experimental.load_from_saved_model. I'm assuming this skips the regularizer rather than saving it, which is fine for my purposes. These methods are deprecated; hopefully they won't be removed soon.

i418c commented 10 months ago

I'm running into this on the latest releases as well. It's very confusing to follow a tutorial and have something as basic as model saving not work out of the box.