kamalkraj / ALBERT-TF2.0

ALBERT model Pretraining and Fine Tuning using TF2.0
Apache License 2.0

AttributeError: 'AdamWeightDecay' object has no attribute '_decayed_lr_t' #9

Closed loginaway closed 4 years ago

loginaway commented 4 years ago

Hi,

Thanks for the code! I'm trying to run it on the SQuAD 2.0 dataset, using the version 2 base model (not the xxlarge one). When I ran

python3 run_squad.py \
  --mode=train_and_predict \
  --input_meta_data_path=${OUTPUTDIR}/squad${SQUAD_VERSION}_meta_data \
  --train_data_path=${OUTPUTDIR}/squad${SQUAD_VERSION}_train.tf_record \
  --predict_file=${SQUAD_DIR}/dev-${SQUAD_VERSION}.json \
  --albert_config_file=${ALBERT_DIR}/config.json \
  --init_checkpoint=${ALBERT_DIR}/tf2_model.h5 \
  --spm_model_file=${ALBERT_DIR}/vocab/30k-clean.model \
  --train_batch_size=24 \
  --predict_batch_size=24 \
  --learning_rate=1.5e-5 \
  --num_train_epochs=3 \
  --model_dir=${OUTPUT_DIR} \
  --strategy_type=mirror \
  --version_2_with_negative \
  --max_seq_length=384

an exception occurred:

AttributeError: 'AdamWeightDecay' object has no attribute '_decayed_lr_t'

I ran the code on Ubuntu 18.04, and my TensorFlow-related package versions are as follows:

tb-nightly 1.14.0a20190603
tensorboard 1.14.0
tensorflow 2.0.0b1
tensorflow-estimator 1.14.0
tensorflow-gpu 2.0.0b1
tf-estimator-nightly 1.14.0.dev2019060501

Is there something wrong with the versions? The detailed error output is as follows.

W1116 11:33:43.881644 140642252482304 optimizer_v2.py:979] Gradients does not exist for variables ['albert_model/pooler_transform/kernel:0', 'albert_model/pooler_transform/bias:0'] when minimizing the loss.
I1116 11:33:43.977660 140648276031296 coordinator.py:219] Error reported to Coordinator: 'AdamWeightDecay' object has no attribute '_decayed_lr_t'
Traceback (most recent call last):
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
    yield
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 189, in _call_for_each_replica
    **merge_kwargs)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 476, in _distributed_apply
    var, apply_grad_to_update_var, args=(grad,), group=False))
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1458, in update
    return self._update(var, fn, args, kwargs, group)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 766, in _update
    **values.select_device_mirrored(d, kwargs)))
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 460, in apply_grad_to_update_var
    grad.values, var, grad.indices)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 850, in _resource_apply_sparse_duplicate_indices
    return self._resource_apply_sparse(summed_grad, handle, unique_indices)
  File "/home/cjy/Albert/ALBERT/optimization.py", line 168, in _resource_apply_sparse
    var.device, var.dtype.base_dtype, apply_state)
  File "/home/cjy/Albert/ALBERT/optimization.py", line 148, in _get_lr
    return self._decayed_lr_t[var_dtype], {}
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 542, in __getattribute__
    raise e
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 532, in __getattribute__
    return super(OptimizerV2, self).__getattribute__(name)
AttributeError: 'AdamWeightDecay' object has no attribute '_decayed_lr_t'
Traceback (most recent call last):
  File "run_squad.py", line 845, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "run_squad.py", line 837, in main
    train_squad(strategy, input_meta_data)
  File "run_squad.py", line 742, in train_squad
    custom_callbacks=custom_callbacks)
  File "/home/cjy/Albert/ALBERT/model_training_utils.py", line 328, in run_customized_training_loop
    tf.convert_to_tensor(steps, dtype=tf.int32))
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 416, in __call__
    self._initialize(args, kwds, add_initializers_to=initializer_map)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 359, in _initialize
    *args, **kwds))
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1360, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1648, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1541, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 716, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 309, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 706, in wrapper
    raise e.ag_error_metadata.to_exception(type(e))
AttributeError: in converted code:

/home/cjy/Albert/ALBERT/model_training_utils.py:239 train_steps  *
    strategy.experimental_run_v2(_replicated_step, args=(next(iterator),))
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:708 experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:1710 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py:708 _call_for_each_replica
    fn, args, kwargs)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py:195 _call_for_each_replica
    coord.join(threads)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py:389 join
    six.reraise(*self._exc_info_to_raise)
/usr/lib/python3/dist-packages/six.py:693 reraise
    raise value
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/training/coordinator.py:297 stop_on_exception
    yield
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py:189 _call_for_each_replica
    **merge_kwargs)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:476 _distributed_apply
    var, apply_grad_to_update_var, args=(grad,), group=False))
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:1458 update
    return self._update(var, fn, args, kwargs, group)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/distribute/mirrored_strategy.py:766 _update
    **values.select_device_mirrored(d, kwargs)))
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:460 apply_grad_to_update_var
    grad.values, var, grad.indices)
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:850 _resource_apply_sparse_duplicate_indices
    return self._resource_apply_sparse(summed_grad, handle, unique_indices)
/home/cjy/Albert/ALBERT/optimization.py:168 _resource_apply_sparse
    var.device, var.dtype.base_dtype, apply_state)
/home/cjy/Albert/ALBERT/optimization.py:148 _get_lr
    return self._decayed_lr_t[var_dtype], {}
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:542 __getattribute__
    raise e
/home/cjy/.local/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:532 __getattribute__
    return super(OptimizerV2, self).__getattribute__(name)

AttributeError: 'AdamWeightDecay' object has no attribute '_decayed_lr_t'
loginaway commented 4 years ago

Problem solved. It turns out that my TensorFlow version was not compatible with this code.

I tried tensorflow (and tensorflow-gpu) 2.0.0a0, 2.0.0b0, and 2.0.0b2, and all of them failed. At first I could not install the 2.0.0 release of tensorflow (and tensorflow-gpu) because my pip version was too old.

So I upgraded my pip

pip3 install --upgrade pip

and reinstalled tensorflow 2.0.0 as well as tensorflow-gpu 2.0.0:

pip3 install tensorflow==2.0.0
pip3 install tensorflow-gpu==2.0.0

Then it worked!
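
For anyone who hits the same error, a small pre-flight check along these lines might save a long traceback. It is only a sketch: the helper names (parse_version, check_tensorflow_version) and the KNOWN_GOOD constant are made up for illustration, and the only facts it encodes are the ones from this thread, namely that the 2.0.0 release worked for me and the 2.0.0 alpha/beta builds did not. Other versions are simply reported as untested.

```python
import re

# Assumption: 2.0.0 is the only version this thread confirms as working.
KNOWN_GOOD = "2.0.0"

# Matches "2.0.0", "2.0.0b1", "2.0.0-rc1", "1.14.0.dev2019060501", ...
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:[-.]?(a|b|rc|dev)\.?(\d+))?")


def parse_version(text):
    """Split '2.0.0b1' into ((2, 0, 0), 'b1'); the tag is '' for a final release."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, tag, number = match.groups()
    release = (int(major), int(minor), int(patch))
    return release, (tag + number) if tag else ""


def check_tensorflow_version(installed, known_good=KNOWN_GOOD):
    """Return (ok, message) for an installed TensorFlow version string."""
    release, pre = parse_version(installed)
    good_release, _ = parse_version(known_good)
    if pre:
        return False, f"{installed} is a pre-release build; install {known_good} instead"
    if release != good_release:
        return False, f"{installed} is untested here; {known_good} is known to work"
    return True, f"{installed} matches the known-good release"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed")
    else:
        ok, message = check_tensorflow_version(tf.__version__)
        print("OK:" if ok else "WARNING:", message)
```

Running the script before run_squad.py prints a warning for a build such as 2.0.0b1 and an OK line for 2.0.0. The check is deliberately strict: it does not claim that later releases fail, only that they were not tried in this thread.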