tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization
Apache License 2.0

Cannot restore a checkpoint on a pruned model without 'Unresolved object in checkpoint: (root).optimizer.*' #603

Open andrewstanfordjason opened 3 years ago

andrewstanfordjason commented 3 years ago

Describe the bug

Cannot load checkpoints into a saved model and restore the optimiser. The model weights are restored, but restoring the optimiser fails with these warnings:

    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
    WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
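For triage, it can help to collapse the repeated warnings into a list of distinct unresolved paths. The following is a minimal sketch in plain Python that only parses the warning text quoted in this report; the helper name `summarize_unresolved` is hypothetical and not part of TensorFlow or TF-MOT.

```python
import re
from collections import Counter

# The warning text quoted in this report, one message per line.
LOG = """\
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
"""

# Capture the object path that follows the fixed warning prefix.
UNRESOLVED = re.compile(r"Unresolved object in checkpoint: (\(root\)\S*)")


def summarize_unresolved(log_text):
    """Count how often each unresolved checkpoint path appears (hypothetical helper)."""
    return Counter(UNRESOLVED.findall(log_text))


summary = summarize_unresolved(LOG)
for path, count in summary.items():
    print(f"{path}: {count}")
```

Every path sits under `(root).optimizer`, which matches the observation that the model weights restore while the optimiser state does not.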

Here is a Colab that reproduces the problem. It trains a model on MNIST, prunes it while saving checkpoints, and then, after training, tries to restore each checkpoint into the saved h5 model.

https://colab.research.google.com/drive/1Oljrqs0IHwDfEBfz8i-2aIdsfvH5Nkfd?usp=sharing

System information

TensorFlow version (installed from source or binary): 2.3 binary presumably

TensorFlow Model Optimization version (installed from source or binary): 0.5 binary presumably

Python version:

Describe the expected behavior

The optimiser is restored without warnings.

Describe the current behavior

Restoring a checkpoint emits the "Unresolved object in checkpoint" warnings for the optimiser.

Code to reproduce the issue

Provide reproducible code that is the bare minimum necessary to generate the problem.

https://colab.research.google.com/drive/1Oljrqs0IHwDfEBfz8i-2aIdsfvH5Nkfd?usp=sharing

andrewstanfordjason commented 3 years ago

Found a workaround at https://gist.github.com/yoshihikoueno/4ff0694339f88d579bb3d9b07e609122 (modified):

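    import tensorflow as tf

    # Workaround: create the optimiser hyperparameters as tf.Variable objects up front.
    adam = tf.keras.optimizers.Adam(
        learning_rate=tf.Variable(0.001),
        beta_1=tf.Variable(0.9),
        beta_2=tf.Variable(0.999),
        epsilon=tf.Variable(1e-7),
        decay=tf.Variable(0.0),
    )
    # Accessing `iterations` forces the step counter variable to be created.
    adam.iterations
    # `loaded_model` is the model loaded from the saved h5 file.
    loaded_model.compile(optimizer=adam)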

teijeong commented 3 years ago

Hi @andrewstanfordjason, sorry for the late response.

Before we start working on this, can you confirm whether this problem still happens with the latest versions of TF and TF-MOT?