mlcommons / training

Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0

Image classification-AttributeError: 'str' object has no attribute '_hypers_created' #555

Closed Yffffan closed 2 months ago

Yffffan commented 2 years ago

I am trying to reproduce the ResNet-50 / ImageNet benchmark by following the instructions in README.md (https://github.com/mlcommons/training/blob/master/image_classification/README.md), but I get the following error:

AttributeError: in user code:

/home/huangyangfan/MLcommons/training/image_classification/tensorflow2/tf2_common/training/utils.py:92 loop_fn  *
    step_fn(iterator)
/home/huangyangfan/MLcommons/training/image_classification/tensorflow2/resnet_runnable.py:338 _apply_grads_and_clear_for_each_replica  *
    self.optimizer.apply_gradients(
/home/huangyangfan/anaconda3/envs/envmlperf/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:625 apply_gradients  **
    apply_state = self._prepare(var_list)
/home/huangyangfan/anaconda3/envs/envmlperf/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:881 _prepare
    self._prepare_local(var_device, var_dtype, apply_state)
/home/huangyangfan/MLcommons/training/image_classification/tensorflow2/lars_optimizer.py:115 _prepare_local
    lr_t = self._get_hyper("learning_rate", var_dtype)         #
/home/huangyangfan/anaconda3/envs/envmlperf/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:753 _get_hyper
    if not self._hypers_created:

AttributeError: 'str' object has no attribute '_hypers_created'
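For readers following the traceback: the last frame shows that, inside `_get_hyper`, the object in the `self` position was a string, not an optimizer. The sketch below reproduces that class of error in plain Python without TensorFlow. `ToyOptimizer` is a hypothetical stand-in, not the Keras `OptimizerV2` class, and the snippet only illustrates what the message means; it does not identify why `self` ended up as a string in this run.

```python
# Hypothetical stand-in for an optimizer; NOT the real Keras OptimizerV2.
class ToyOptimizer:
    def __init__(self):
        self._hypers_created = False

    def _get_hyper(self, name, dtype=None):
        # Mirrors the failing line in the traceback: an attribute lookup on self.
        if not self._hypers_created:
            self._hypers_created = True
        return name


opt = ToyOptimizer()
opt._get_hyper("learning_rate", "float32")  # fine: self is the optimizer instance

try:
    # Calling through the class without an instance shifts the arguments:
    # self = "learning_rate", name = "float32".
    ToyOptimizer._get_hyper("learning_rate", "float32")
except AttributeError as err:
    message = str(err)
    print(message)
```

One way to narrow this down, as an assumption rather than a confirmed fix, is to check whether the installed TensorFlow version matches the one the benchmark's README expects, since the custom `lars_optimizer.py` relies on private optimizer methods that may behave differently across versions.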

If I comment out --optimizer=LARS, the error no longer occurs, but training performs very poorly: accuracy keeps decreasing.

How can I solve this problem? Thanks all!

johntran-nv commented 1 year ago

@sgpyc could you advise?

itayhubara commented 1 year ago

I could not reproduce it. If you still have this issue, please share your full environment settings: TF version, machine configuration (TPU or GPU, and which type), Python version, etc.
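A minimal sketch of a script that collects the details requested above, assuming a standard Python install. The function name `collect_environment` is made up for this example. It falls back gracefully when TensorFlow is not importable, and TPUs may not be listed unless the runtime has already connected to them.

```python
import importlib
import platform
import sys


def collect_environment():
    """Return a dict describing the Python / TensorFlow environment."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        info["tensorflow"] = "not installed"
        return info

    info["tensorflow"] = tf.__version__
    # tf.config.list_physical_devices is available in TF 2.1+;
    # TF 2.0 exposes it only under tf.config.experimental.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    for kind in ("GPU", "TPU"):
        info[kind.lower()] = [device.name for device in list_devices(kind)]
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting the output of this script, together with the exact training command, into the issue should cover most of what is asked for here.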

hiwotadese commented 2 months ago

Closing this because the benchmark is retired.