Open ZwwWayne opened 2 years ago
It seems this is caused by log_metric_by_epoch not being set to False.
Do we have any docs that discuss switching between epoch-based and iter-based running?
Which part of the config should also be modified? default_hooks.logger?
Yes. We currently do not have a doc that explains how to switch between epoch-based and iter-based training 😢; it should be added during the documentation refactoring.
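For anyone landing here before the docs exist, here is a minimal sketch of the config fields that usually change together when moving to iter-based training. Only default_hooks.logger.log_metric_by_epoch is confirmed in this thread. The log_processor and checkpoint entries are my assumption based on common MMEngine-style configs, and the numeric values are placeholders, not recommendations.

```python
# Sketch of an iter-based MMEngine-style config fragment (plain Python dicts).
# Placeholder values; adjust max_iters / val_interval / interval to your schedule.
train_cfg = dict(
    type='IterBasedTrainLoop',
    max_iters=20000,      # placeholder
    val_interval=2000,    # placeholder: evaluate every 2000 iterations
)

default_hooks = dict(
    # Confirmed in this thread: without log_metric_by_epoch=False the
    # evaluation metrics are logged against the epoch number, not the iteration.
    logger=dict(type='LoggerHook', interval=50, log_metric_by_epoch=False),
    # Assumption: checkpoints are normally switched to iteration-based saving too.
    checkpoint=dict(type='CheckpointHook', by_epoch=False, interval=2000),
)

# Assumption: the log processor is normally switched to iteration-based too.
log_processor = dict(by_epoch=False)
```

If the metric curve still shows every evaluation at step 0 after this change, the logger hook entry is the first thing to re-check, since a base config may override it.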
Thanks for your error report and we appreciate it a lot.
Checklist
Describe the bug
In IterBasedTrainLoop, when a model is trained and evaluated at regular intervals, the curve of a metric value (e.g., mIoU) against the iteration number cannot be displayed, because the step number of each evaluation is not saved correctly. In the saved vis_data/scalars.json, the step value of every evaluation record is 0, whereas it should increase as training progresses. A wrong example looks like the following:
Reproduction
Simply training an iter-based model in MMSegmentation should reproduce the error; the issue is visible in TensorBoard.
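The symptom can also be checked without TensorBoard by scanning vis_data/scalars.json directly. The sketch below assumes the file holds one JSON object per line with a "step" key, and that evaluation records can be recognised by a metric key such as mIoU. The helper name and the sample records are hypothetical and only illustrate the pattern described above.

```python
import json


def find_zero_step_records(lines, metric_key="mIoU"):
    """Return the records that contain `metric_key` and were logged with step 0."""
    suspicious = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if metric_key in record and record.get("step") == 0:
            suspicious.append(record)
    return suspicious


# Made-up sample lines for illustration only, not taken from a real run.
sample = [
    '{"loss": 1.2, "step": 50}',
    '{"mIoU": 40.1, "step": 0}',
    '{"loss": 0.9, "step": 100}',
    '{"mIoU": 45.3, "step": 0}',
]

bad = find_zero_step_records(sample)
print(f"{len(bad)} evaluation record(s) logged at step 0")
```

For a real run, pass an open file handle for that run's vis_data/scalars.json instead of the sample list. More than one evaluation record at step 0 suggests the run is affected.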
A typical config looks like
Environment
Run python mmdet/utils/collect_env.py to collect the necessary environment information and paste it here, along with other environment variables that may be related ($PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.).
Error traceback
If applicable, paste the error traceback here.
Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here; that would be much appreciated!