AttributeError: 'Live' object has no attribute 'set_step'

Divergense commented 1 year ago

Thanks for reporting the unexpected results and we appreciate it a lot.

Checklist

[x] I have searched related issues but cannot get the expected help.
[x] I have read the FAQ documentation but cannot get the expected help.
[x] The unexpected results still exist in the latest version.

Describe the Issue

The MMCV hook DvcliveLoggerHook has a bug. The class, located at mmcv/runner/hooks/logger/dvclive.py, uses an outdated API of the DVCLive library (the current version is 1.3.2). Specifically, the method log(self, runner) calls self.dvclive.set_step and self.dvclive.log. These methods do not exist in the current DVCLive version.

Reproduction

What command, code, or script did you run?
```
train_detector(model, datasets, cfg, distributed=False, validate=True)
```
Note: the code fully corresponds to the mmdetection/demo/MMDet_Tutorial.ipynb guide (v2.27.0).
Did you make any modifications on the code? Did you understand what you have modified?

I tried to integrate MMDetection with DVC (to track experiment metrics) and added DvcliveLoggerHook like this:
```
cfg.log_config.hooks = [
    dict(type='TextLoggerHook'),
    dict(type='TensorboardLoggerHook'),
    dict(type='DvcliveLoggerHook', report="auto"),
]
```

Environment

Please run python -c "from mmcv.utils import collect_env; print(collect_env())" to collect necessary environment information and paste it here:

 'CUDA available': False,
 'GCC': 'x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0',
 'MMCV': '1.7.0',
 'MMCV CUDA Compiler': 'not available',
 'MMCV Compiler': 'GCC 7.5',
 'OpenCV': '4.6.0',
 'PyTorch': '1.13.0+cu116',
 'PyTorch compiling details': 'PyTorch built with:\n'
                              '  - GCC 9.3\n'
                              '  - C++ Version: 201402\n'
                              '  - Intel(R) Math Kernel Library Version '
                              '2020.0.0 Product Build 20191122 for Intel(R) 64 '
                              'architecture applications\n'
                              '  - Intel(R) MKL-DNN v2.6.0 (Git Hash '
                              '52b5f107dd9cf10910aaa19cb47f3abf9b349815)\n'
                              '  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
                              '  - LAPACK is enabled (usually provided by '
                              'MKL)\n'
                              '  - NNPACK is enabled\n'
                              '  - CPU capability usage: AVX2\n'
                              '  - Build settings: BLAS_INFO=mkl, '
                              'BUILD_TYPE=Release, CUDA_VERSION=11.6, '
                              'CUDNN_VERSION=8.3.2, '
                              'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
                              'CXX_FLAGS= -fabi-version=11 -Wno-deprecated '
                              '-fvisibility-inlines-hidden -DUSE_PTHREADPOOL '
                              '-fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM '
                              '-DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK '
                              '-DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE '
                              '-DEDGE_PROFILER_USE_KINETO -O2 -fPIC '
                              '-Wno-narrowing -Wall -Wextra '
                              '-Werror=return-type -Werror=non-virtual-dtor '
                              '-Wno-missing-field-initializers '
                              '-Wno-type-limits -Wno-array-bounds '
                              '-Wno-unknown-pragmas -Wunused-local-typedefs '
                              '-Wno-unused-parameter -Wno-unused-function '
                              '-Wno-unused-result -Wno-strict-overflow '
                              '-Wno-strict-aliasing '
                              '-Wno-error=deprecated-declarations '
                              '-Wno-stringop-overflow -Wno-psabi '
                              '-Wno-error=pedantic -Wno-error=redundant-decls '
                              '-Wno-error=old-style-cast '
                              '-fdiagnostics-color=always -faligned-new '
                              '-Wno-unused-but-set-variable '
                              '-Wno-maybe-uninitialized -fno-math-errno '
                              '-fno-trapping-math -Werror=format '
                              '-Werror=cast-function-type '
                              '-Wno-stringop-overflow, LAPACK_INFO=mkl, '
                              'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
                              'PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, '
                              'USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, '
                              'USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, '
                              'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, '
                              'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n',
 'Python': '3.8.16 (default, Dec  7 2022, 01:12:13) [GCC 7.5.0]',
 'TorchVision': '0.14.0+cu116',
 'sys.platform': 'linux'

Error traceback

Traceback (most recent call last):
  File "src/train.py", line 35, in <module>
    main(args)
  File "src/train.py", line 29, in main
    train_detector(model, datasets, cfg, distributed=False, validate=True)
  File "/content/dvc-mmdetection-example/mmdetection/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_iter')
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/base_runner.py", line 317, in call_hook
    getattr(hook, fn_name)(self)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/hooks/logger/base.py", line 158, in after_train_iter
    self.log(runner)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/dist_utils.py", line 144, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/mmcv/runner/hooks/logger/dvclive.py", line 61, in log
    self.dvclive.set_step(self.get_iter(runner))
AttributeError: 'Live' object has no attribute 'set_step'

Bug fix

Since I don't know what behavior is actually expected (the DVCLive methods all produce slightly different results, and I was too lazy to find a working version of DVCLive), I can only make assumptions. The minimal changes that work for me are the following:

Replace self.dvclive.set_step(self.get_iter(runner)) with self.dvclive.step = self.get_iter(runner) in the log(self, runner) method.
Replace the line self.dvclive.log(k, v) with self.dvclive.log_metric(k, v) in the log(self, runner) method.

Later I needed more information and added the following lines to the method:

self.dvclive.make_summary()
self.dvclive.make_report()

Final changes look like this:

@master_only
def log(self, runner) -> None:
    tags = self.get_loggable_tags(runner)
    if tags:
        self.dvclive.step = self.get_iter(runner)
        for k, v in tags.items():
            self.dvclive.log_metric(k, v)
        self.dvclive.make_summary()
        self.dvclive.make_report()
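
If the hook has to keep working with both the old and the new interface, one option is to check which methods the Live object actually exposes instead of hard-coding either set of calls. The sketch below is only an illustration of that idea: OldLive and NewLive are hypothetical stand-ins that mimic the two interfaces mentioned in this issue (set_step/log versus step/log_metric), not the real DVCLive classes, and log_tags is a made-up helper name.

```python
# Sketch only: OldLive and NewLive are hypothetical stand-ins for the two
# DVCLive interfaces discussed in this issue, not the real library classes.


class OldLive:
    """Mimics the old interface: set_step() and log()."""

    def __init__(self):
        self.step = 0
        self.records = []

    def set_step(self, step):
        self.step = step

    def log(self, name, value):
        self.records.append((self.step, name, value))


class NewLive:
    """Mimics the new interface: a step attribute and log_metric()."""

    def __init__(self):
        self.step = 0
        self.records = []

    def log_metric(self, name, value):
        self.records.append((self.step, name, value))


def log_tags(live, step, tags):
    """Log all tags through whichever interface the live object exposes."""
    # Old interface sets the step through a method, new one through an attribute.
    if hasattr(live, "set_step"):
        live.set_step(step)
    else:
        live.step = step

    # Prefer log_metric() when it exists, otherwise fall back to log().
    log_fn = getattr(live, "log_metric", None) or live.log
    for name, value in tags.items():
        log_fn(name, value)


old = OldLive()
log_tags(old, 3, {"loss": 0.5})

new = NewLive()
log_tags(new, 7, {"loss": 0.25})
```

Feature detection like this avoids having to know the exact release in which the interface changed, at the cost of being less explicit than a version comparison.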

zhouzaida commented 1 year ago

Hi @Divergense, thank you for your feedback, and sorry for the late reply. We were on holiday for Chinese New Year last week. It seems that dvclive caused a bc issue. We can call different interfaces depending on the dvclive version. Are you interested in making a PR to fix this issue?

Divergense commented 1 year ago

Hi! Yes, I'm interested in making a PR, but I have never done it before, so I need to read the contributing guidelines first.

I also have a couple of questions (sorry for the stupid questions):

What does "a bc issue" mean?
Can you please explain what you mean by "We can call different interfaces according to the version of dvclive"?

zhouzaida commented 1 year ago

A bc (backward compatibility) issue means that the interfaces of dvclive were changed, which caused the downstream repos to fail.
We can check in which version the interface was modified and handle it differently depending on the installed version.
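
A version-based switch could look roughly like the sketch below. It is only an illustration: parse_version and uses_new_api are hypothetical helper names, and NEW_API_SINCE is a placeholder assumption rather than a verified DVCLive release number, so the real cutoff would have to be taken from the DVCLive changelog.

```python
# Sketch only. NEW_API_SINCE is a placeholder assumption, not a verified
# DVCLive release number; check the DVCLive changelog for the real cutoff.
NEW_API_SINCE = (1, 0, 0)


def parse_version(version: str) -> tuple:
    """Turn '1.3.2' into (1, 3, 2), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '1.0' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def uses_new_api(version: str) -> bool:
    """Return True when the given version should get the new-style calls."""
    return parse_version(version) >= NEW_API_SINCE
```

Inside the hook, the installed version string could be read with importlib.metadata.version("dvclive") from the standard library (Python 3.8+), and the result of uses_new_api would then decide between the set_step/log calls and the step/log_metric calls.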

open-mmlab / mmcv

AttributeError: 'Live' object has no attribute 'set_step' #2562