tensorflow / models

Models and examples built with TensorFlow

AssertionError: Some objects had attributes which were not restored #8041

Closed pcraman closed 4 years ago

pcraman commented 4 years ago

Describe the problem

The current repository doesn't provide a pretrained BERT Multilingual model for download, so I obtained the model from BERT-Base, Multilingual Cased (New, recommended). I converted it to a TF2-compatible checkpoint using the script tf1_to_keras_checkpoint_converter.py available in the released version. Running run_classifier with the converted checkpoint fails with "AssertionError: Some objects had attributes which were not restored", similar to the issue https://github.com/tensorflow/models/issues/7412

Source code / logs

Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
W0113 16:05:10.320911 4773436864 deprecation.py:323] From /opt/anaconda3/envs/ner-bert-tf-2.0/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/util.py:1249: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.
Traceback (most recent call last):
  File "/Users/poornima/python_projects/ner-bert-tf-2.0/models-2.0/official/nlp/bert/run_classifier.py", line 321, in <module>
    app.run(main)
  File "/opt/anaconda3/envs/ner-bert-tf-2.0/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/opt/anaconda3/envs/ner-bert-tf-2.0/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/Users/poornima/python_projects/ner-bert-tf-2.0/models-2.0/official/nlp/bert/run_classifier.py", line 314, in main
    run_bert(strategy, input_meta_data)
  File "/Users/poornima/python_projects/ner-bert-tf-2.0/models-2.0/official/nlp/bert/run_classifier.py", line 287, in run_bert
    run_eagerly=FLAGS.run_eagerly)
  File "/Users/poornima/python_projects/ner-bert-tf-2.0/models-2.0/official/nlp/bert/run_classifier.py", line 164, in run_bert_classifier
    custom_callbacks=None)
  File "/Users/poornima/python_projects/ner-bert-tf-2.0/models-2.0/official/nlp/bert/run_classifier.py", line 208, in run_keras_compile_fit
    checkpoint.restore(init_checkpoint).assert_existing_objects_matched()
  File "/opt/anaconda3/envs/ner-bert-tf-2.0/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/util.py", line 958, in assert_existing_objects_matched
    return self.assert_consumed()
  File "/opt/anaconda3/envs/ner-bert-tf-2.0/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/util.py", line 943, in assert_consumed
    "".join(unused_attribute_strings)))
AssertionError: Some objects had attributes which were not restored:
    MirroredVariable:{
  0 /job:localhost/replica:0/task:0/device:CPU:0: <tf.Variable 'save_counter:0' shape=() dtype=int64, numpy=0>
}: ['save_counter']

@saberkun Please advise.
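
The assertion message in the traceback lists each unrestored object followed by the attribute names that were not matched. As a minimal sketch, assuming that `}: [...]` layout holds (the helper name `unrestored_attributes` is hypothetical and not part of TensorFlow), the names can be pulled out of the message like this:

```python
import ast
import re
from typing import List

# Hypothetical helper, not a TensorFlow API. It assumes each unrestored
# object block in the message ends with a line of the form "}: ['attr', ...]".
_ATTR_LIST = re.compile(r"^\}:\s*(\[.*\])\s*$")


def unrestored_attributes(message: str) -> List[str]:
    """Collect the attribute names listed in the assertion message."""
    names: List[str] = []
    for line in message.splitlines():
        match = _ATTR_LIST.match(line.strip())
        if match:
            # The list is printed as a Python literal, so parse it safely.
            names.extend(ast.literal_eval(match.group(1)))
    return names


message = """AssertionError: Some objects had attributes which were not restored:
    MirroredVariable:{
  0 /job:localhost/replica:0/task:0/device:CPU:0: <tf.Variable 'save_counter:0' shape=() dtype=int64, numpy=0>
}: ['save_counter']"""

print(unrestored_attributes(message))  # ['save_counter']
```

In this traceback the only unmatched attribute is `save_counter`, the counter that `tf.train.Checkpoint` maintains for numbering saves, rather than a model weight.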

saberkun commented 4 years ago

Hi, you need to run tf2_encoder_checkpoint_converter.py as the second step to transform the name-based checkpoint into an object-based checkpoint.

pcraman commented 4 years ago

Thanks @saberkun for the response. As the second step, I converted the checkpoint using the script tf2_checkpoint_converter.py available in the release https://github.com/tensorflow/models/releases/tag/v2.0. I observed some warnings like this:

W0115 10:20:46.411640 4665613760 ag_logging.py:145] Entity <bound method Dense3D.call of <official.nlp.bert_modeling.Dense3D object at 0x6c813aa90>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense3D.call of <official.nlp.bert_modeling.Dense3D object at 0x6c813aa90>>: AssertionError: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:Entity <bound method Dense3D.call of <official.nlp.bert_modeling.Dense3D object at 0x6c6f01250>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense3D.call of <official.nlp.bert_modeling.Dense3D object at 0x6c6f01250>>: AssertionError: Bad argument number for Name: 3, expecting 4

But I am able to load the final checkpoints successfully and perform fine tuning. Does this mean I can ignore these warnings?

saberkun commented 4 years ago

Yes, I think so. This is an AutoGraph-related error. It may affect performance, but I personally don't think it will be an issue. Note that the checkpoint converter script has been updated on master to fit our new BERT implementation.

ibrahimishag commented 4 years ago

I am getting the following warning or error when executing tf2_encoder_checkpoint_converter.py:

---
W0421 11:28:46.766433 139936457463616 deprecation.py:323] From /home/user/anaconda3/envs/TF1/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/util.py:1249: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:
Restoring a name-based tf.train.Saver checkpoint using the object-based restore API. This mode uses global names to match variables, and so is somewhat fragile. It also adds new restore ops to the graph each time it is called when graph building. Prefer re-encoding training checkpoints in the object-based format: run save() on the object-based saver (the same one this message is coming from) and use that checkpoint in the future.

Note: I am using earlier versions of tf1_to_keras_checkpoint_converter.py and tf2_encoder_checkpoint_converter.py. Could you please explain whether the conversion is complete or not?
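
One way to tell a warning from an error in output like the above is the first character of each absl log line, which encodes the severity (I, W, E or F). The following is a minimal sketch, assuming the standard absl/glog line prefix seen in these logs; the helper name `severity_counts` is hypothetical. A count with only W entries means nothing was logged at error level. It does not by itself prove that the converted checkpoint is correct.

```python
import re
from collections import Counter

# absl/glog prefix: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line]
_PREFIX = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+(\d+) ([^\s:]+):(\d+)\] "
)


def severity_counts(log_text: str) -> Counter:
    """Count log lines per severity letter (hypothetical helper)."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = _PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample copied from the converter output quoted in this thread.
log_text = """W0421 11:28:46.766433 139936457463616 deprecation.py:323] From /home/user/anaconda3/envs/TF1/lib/python3.7/site-packages/tensorflow_core/python/training/tracking/util.py:1249: NameBasedSaverStatus.__init__ (from tensorflow.python.training.tracking.util) is deprecated and will be removed in a future version.
Instructions for updating:"""

counts = severity_counts(log_text)
print(dict(counts))             # {'W': 1}
print(counts["E"] + counts["F"])  # 0
```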