open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

'COCOeval' object has no attribute 'score_key' #1155

Closed YuktiADY closed 2 years ago

YuktiADY commented 2 years ago

Hello,

While training, I am getting AttributeError: 'COCOeval' object has no attribute 'score_key'.

Traceback (most recent call last):
  File "./mmpose/tools/train.py", line 170, in <module>
    main()
  File "./mmpose/tools/train.py", line 166, in main
    meta=meta)
  File "/home/yukti/mmpose/mmpose/mmpose/apis/train.py", line 192, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/yukti/mmpose/mmcv/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 505, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/home/yukti/mmpose/mmpose/mmpose/core/evaluation/eval_hooks.py", line 139, in evaluate
    **self.eval_kwargs)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 317, in evaluate
    info_str = self._do_python_keypoint_eval(res_file)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 371, in _do_python_keypoint_eval
    coco_eval.evaluate()
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 257, in evaluate
    for imgId in p.imgIds
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 258, in <dictcomp>
    for catId in catIds}
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in computeOks
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in <listcomp>
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
AttributeError: 'COCOeval' object has no attribute 'score_key'
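
The last frame of the traceback reads `self.score_key`, so one way to narrow the problem down is to look for where the installed module assigns that attribute and under which condition. The following is a minimal sketch, not the maintainers' diagnosis: `find_attribute_assignments` and `SAMPLE_SOURCE` are hypothetical names, and the sample class is only a stand-in, not the real xtcocotools code.

```python
import re


def find_attribute_assignments(source, attr):
    """Return (line_number, text) pairs where `self.<attr>` is assigned."""
    # `=(?!=)` matches an assignment but not an `==` comparison.
    pattern = re.compile(r"\bself\." + re.escape(attr) + r"\s*=(?!=)")
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical stand-in for a module's source. With the real package one could
# pass inspect.getsource(xtcocotools.cocoeval) instead (assumes it is installed).
SAMPLE_SOURCE = '''
class Evaluator:
    def prepare(self, iou_type):
        if iou_type == "keypoints":
            self.score_key = "score"

    def compute(self, dts):
        return sorted(dts, key=lambda d: -d[self.score_key])
'''

assignments = find_attribute_assignments(SAMPLE_SOURCE, "score_key")
for number, text in assignments:
    print(f"line {number}: {text}")
```

If the only assignment sits inside a conditional branch, the attribute stays unset whenever that branch is skipped, which would produce exactly this kind of AttributeError on first read.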

ly015 commented 2 years ago

@jin-s13 Could you please take a look at this issue? Is this something related to xtcocotools?

YuktiADY commented 2 years ago

> @jin-s13 Could you please take a look at this issue? Is this something related to xtcocotools?

Please let me know how to proceed; I am stuck here.

jin-s13 commented 2 years ago

Have you tried training with the official config?

YuktiADY commented 2 years ago

> Have you tried training with the official config?

Yes, I took the original config and am just training on my own dataset (changing the bbox file path and the train and val paths in the config file).

Please find the config below:

_base_ = ['/home/yukti/mmpose/mmpose/configs/_base_/datasets/theodore.py']
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=1, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 30
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        list(range(17)),
    ],
    inference_channel=list(range(17)))

# model settings
model = dict(
    type='TopDown',
    pretrained='torchvision://resnet50',
    backbone=dict(type='ResNet', depth=50),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=2048,
        out_channels=channel_cfg['num_output_channels'],
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[192, 256],
    heatmap_size=[48, 64],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    bbox_file='/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2/coco_annotations/'
    'person_bbox_valid.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

# dataset settings
dataset_type = 'TheodorePlusV2Dataset'

data_root = '/mnt/dst_datasets/own_omni_dataset/theodore_plus_v2'
data = dict(
    samples_per_gpu=64,  # batch size
    workers_per_gpu=16,  # workers to pre-fetch data for each single GPU
    val_dataloader=dict(samples_per_gpu=32),
    test_dataloader=dict(samples_per_gpu=32),
    train=dict(
        type=dataset_type,
        ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        img_prefix=f'{data_root}/valid/img_png/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type=dataset_type,
        ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        img_prefix=f'{data_root}/valid/img_png/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type=dataset_type,
        ann_file=f'{data_root}/coco_annotations/person_keypoints_valid.json',
        img_prefix=f'{data_root}/valid/img_png/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)
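
One thing a config like the one above can be checked for is whether the step schedule fits inside the run: lr_config uses step=[170, 200] while total_epochs = 30. A minimal sketch, assuming the step policy only lowers the learning rate once training reaches the listed epoch; `steps_beyond_schedule` is a hypothetical helper, not part of mmpose or mmcv, and this check is unrelated to the AttributeError itself.

```python
def steps_beyond_schedule(lr_steps, total_epochs):
    """Return the LR decay epochs that the run never reaches."""
    return [step for step in lr_steps if step >= total_epochs]


# Values copied from the config in this comment.
unused = steps_beyond_schedule([170, 200], total_epochs=30)
print(f"decay steps never reached: {unused}")
```

With these values both decay steps fall outside the 30-epoch run, so under the stated assumption the learning rate would stay at its post-warmup value for the whole training.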

jin-s13 commented 2 years ago

It seems that the problem occurs in TheodorePlusV2Dataset. Can you post the codes of Dataset?

YuktiADY commented 2 years ago

# warnings and Config are used below; these two imports are not shown in the
# original post and are assumed to be present in the real file.
import warnings

from mmcv import Config

from ...builder import DATASETS
from .topdown_coco_dataset import TopDownCocoDataset


@DATASETS.register_module(name='TheodorePlusV2Dataset')
class TheodorePlusV2Dataset(TopDownCocoDataset):

    def __init__(self,
                 ann_file,
                 img_prefix,
                 data_cfg,
                 pipeline,
                 dataset_info=None,
                 test_mode=False):

        # Fall back to the dataset description in the base config.
        if dataset_info is None:
            warnings.warn(
                'dataset_info is missing. '
                'Check https://github.com/open-mmlab/mmpose/pull/663 '
                'for details.', DeprecationWarning)
            cfg = Config.fromfile('configs/_base_/datasets/theodore.py')
            dataset_info = cfg._cfg_dict['dataset_info']

        super().__init__(
            ann_file,
            img_prefix,
            data_cfg,
            pipeline,
            dataset_info=dataset_info,
            test_mode=test_mode)

        # Bounding-box and NMS settings taken from data_cfg.
        self.use_gt_bbox = data_cfg['use_gt_bbox']
        self.bbox_file = data_cfg['bbox_file']
        self.det_bbox_thr = data_cfg.get('det_bbox_thr', 0.0)
        self.use_nms = data_cfg.get('use_nms', True)
        self.soft_nms = data_cfg['soft_nms']
        self.nms_thr = data_cfg['nms_thr']
        self.oks_thr = data_cfg['oks_thr']
        self.vis_thr = data_cfg['vis_thr']

        self.db = self._get_db()

        print(f'=> num_images: {self.num_images}')
        print(f'=> load {len(self.db)} samples')

jin-s13 commented 2 years ago

TheodorePlusV2Dataset looks good.

  1. Did you modify the codes of TopDownCocoDataset?
  2. Another possible problem is the format of your dataset gt / bbox json.

Please first set use_gt_bbox=True to check if the problem is from the bbox file.

YuktiADY commented 2 years ago

  1. Did you modify the codes of TopDownCocoDataset?
     [Ans] That is the existing file; I didn't modify anything in topdown_coco_dataset.py.
  2. Another possible problem is the format of your dataset gt / bbox json.
     [Ans] Do you mean the bbox_file path is not correct? Should I first set use_gt_bbox to True and train again?

jin-s13 commented 2 years ago

The path may be correct but I am not sure whether the content is good.

You do not need to re-train the model, just evaluating the model is fine.
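
Checking the content of a bbox file, as opposed to its path, can be done with a few lines of Python. This is a minimal sketch under the assumption that the file follows the COCO detection-results layout: a JSON list of entries with `image_id`, `category_id`, `bbox` as [x, y, w, h], and `score`. `check_detections` and the sample entries are hypothetical; a real run would load the list with `json.load` and take the known image ids from the annotation file.

```python
REQUIRED_DETECTION_KEYS = ("image_id", "category_id", "bbox", "score")


def check_detections(detections, known_image_ids):
    """Return a list of human-readable problems found in detection entries."""
    problems = []
    for index, det in enumerate(detections):
        missing = [key for key in REQUIRED_DETECTION_KEYS if key not in det]
        if missing:
            problems.append(f"entry {index}: missing keys {missing}")
            continue
        if len(det["bbox"]) != 4:
            problems.append(f"entry {index}: bbox should have 4 values")
        if det["image_id"] not in known_image_ids:
            problems.append(f"entry {index}: unknown image_id {det['image_id']}")
    return problems


# Hypothetical entries: one valid, one with two defects, one without a score.
sample = [
    {"image_id": 115, "category_id": 1, "bbox": [10.0, 20.0, 50.0, 80.0], "score": 0.97},
    {"image_id": 999, "category_id": 1, "bbox": [0.0, 0.0, 5.0], "score": 0.5},
    {"image_id": 139, "category_id": 1, "bbox": [1.0, 2.0, 3.0, 4.0]},
]
problems = check_detections(sample, known_image_ids={115, 139})
for problem in problems:
    print(problem)
```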

YuktiADY commented 2 years ago

> The path may be correct but I am not sure whether the content is good.
>
> You do not need to re-train the model, just evaluating the model is fine.

Yes, I am only evaluating the model, because training takes a lot of time to run. I made the changes; let's see what the output is.

YuktiADY commented 2 years ago

> The path may be correct but I am not sure whether the content is good.
>
> You do not need to re-train the model, just evaluating the model is fine.

I set use_gt_bbox=True.

The evaluation reproduces the same error:

2022-01-26 12:58:49,219 - mmpose - INFO - Epoch [1][50/155] lr: 4.945e-05, eta: 11:43:11, time: 9.172, data_time: 8.771, memory: 6691, heatmap_loss: 0.0026, acc_pose: 0.0510, loss: 0.0026
2022-01-26 13:03:01,436 - mmpose - INFO - Epoch [1][100/155] lr: 9.940e-05, eta: 8:59:02, time: 5.044, data_time: 4.531, memory: 6691, heatmap_loss: 0.0023, acc_pose: 0.1685, loss: 0.0023
2022-01-26 13:06:44,415 - mmpose - INFO - Epoch [1][150/155] lr: 1.494e-04, eta: 7:46:54, time: 4.460, data_time: 3.951, memory: 6691, heatmap_loss: 0.0021, acc_pose: 0.2835, loss: 0.0021
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 19828/19827, 30.6 task/s, elapsed: 649s, ETA: 0s

Loading and preparing results...
DONE (t=0.62s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
Traceback (most recent call last):
  File "./mmpose/tools/train.py", line 170, in <module>
    main()
  File "./mmpose/tools/train.py", line 166, in main
    meta=meta)
  File "/home/yukti/mmpose/mmpose/mmpose/apis/train.py", line 192, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/yukti/mmpose/mmcv/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 505, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/home/yukti/mmpose/mmpose/mmpose/core/evaluation/eval_hooks.py", line 139, in evaluate
    **self.eval_kwargs)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 317, in evaluate
    info_str = self._do_python_keypoint_eval(res_file)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 371, in _do_python_keypoint_eval
    coco_eval.evaluate()
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 257, in evaluate
    for imgId in p.imgIds
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 258, in <dictcomp>
    for catId in catIds}
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in computeOks
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in <listcomp>
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
AttributeError: 'COCOeval' object has no attribute 'score_key'

I guess the problem is not from the bbox file. Then what is the issue?

jin-s13 commented 2 years ago

Do you have coco dataset? Maybe you can evaluate a pretrained model using the latest mmpose to make sure that your environment is good.

YuktiADY commented 2 years ago

> Do you have coco dataset? Maybe you can evaluate a pretrained model using the latest mmpose to make sure that your environment is good.

I am already evaluating with a pretrained model. You can see it below:

# model settings
model = dict(
    type='TopDown',
    pretrained='torchvision://resnet50',
    backbone=dict(type='ResNet', depth=50),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=2048,
        out_channels=channel_cfg['num_output_channels'],
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=True,
        modulate_kernel=11))

jin-s13 commented 2 years ago

pretrained='torchvision://resnet50' is an ImageNet pre-trained model.

I mean: directly evaluate the model from the model zoo on the COCO val set, and see whether the accuracy can be obtained or the error still occurs.

YuktiADY commented 2 years ago

> pretrained='torchvision://resnet50' is an ImageNet pre-trained model.
>
> I mean: directly evaluate the model from the model zoo on the COCO val set, and see whether the accuracy can be obtained or the error still occurs.

How long does that usually take? The config currently has 210 epochs, so it would take some days to complete.

jin-s13 commented 2 years ago

Please download our trained model, instead of retraining yourself.

jin-s13 commented 2 years ago

Please check https://mmpose.readthedocs.io/en/latest/topics/body%282d%2Ckpt%2Csview%2Cimg%29.html#topdown-heatmap-resnet-on-coco

Arch | Input Size | AP | AP50 | AP75 | AR | AR50 | ckpt | log
pose_resnet_50 | 256x192 | 0.718 | 0.898 | 0.795 | 0.773 | 0.937 | ckpt | log

Click "ckpt" to download the model.

YuktiADY commented 2 years ago

> Please check https://mmpose.readthedocs.io/en/latest/topics/body%282d%2Ckpt%2Csview%2Cimg%29.html#topdown-heatmap-resnet-on-coco
>
> pose_resnet_50 | 256x192 | 0.718 | 0.898 | 0.795 | 0.773 | 0.937 | ckpt | log
>
> Click "ckpt" to download the model.

Can we not set pretrained='res50_coco_256x192-ec54d7f3_20200709.pth' in the cfg file? When evaluating this pretrained model, I get the error FileNotFoundError: file "/home/yukti/mmpose/coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth" does not exist.

To run the script I used the command:

./mmpose/tools/dist_train.sh ./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth 2 --work-dir ./theodore_2022-01-27/

Please tell me whether this is correct or whether the file format is wrong.

jin-s13 commented 2 years ago

The path looks strange. ./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth

The name of the directory ends with .py?
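
The oddity in that path can be spotted mechanically: a directory component that carries a file suffix usually means a config path and a checkpoint path were joined by accident. A minimal sketch using only the standard library; `suspicious_directories` is a hypothetical helper, and the set of suffixes it looks for is an arbitrary choice for illustration.

```python
from pathlib import PurePosixPath

FILE_LIKE_SUFFIXES = {".py", ".pth", ".json"}


def suspicious_directories(path):
    """Return directory components that carry a file-like suffix such as .py."""
    directories = PurePosixPath(path).parts[:-1]  # drop the final file name
    return [
        part for part in directories
        if PurePosixPath(part).suffix in FILE_LIKE_SUFFIXES
    ]


checkpoint = "./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth"
print(suspicious_directories(checkpoint))
```

For the path quoted in this thread, the helper flags `res50_coco_256x192.py`, the config file name being used as if it were a directory.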

YuktiADY commented 2 years ago

> The path looks strange. ./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth
>
> The name of the directory ends with .py?

As you suggested, I clicked on 'ckpt' and the file was downloaded with the name 'res50_coco_256x192-ec54d7f3_20200709.pth'. In the model settings of the cfg, I have this:

# model settings
model = dict(
    type='TopDown',
    # pretrained='torchvision://resnet50',
    pretrained='res50_coco_256x192-ec54d7f3_20200709.pth',
    backbone=dict(type='ResNet', depth=50),

jin-s13 commented 2 years ago

./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth

I am actually talking about "res50_coco_256x192.py". Is it a typo?

jin-s13 commented 2 years ago

pretrained='torchvision://resnet50',

This will automatically download the required pre-trained model. Or you may set it as None.

YuktiADY commented 2 years ago

> ./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth
>
> I am actually talking about "res50_coco_256x192.py". Is it a typo?

No, this is what the error says, because I set pretrained to the downloaded model.

YuktiADY commented 2 years ago

> pretrained='torchvision://resnet50',
>
> This will automatically download the required pre-trained model. Or you may set it as None.

So I should not pass the downloaded model res50_coco_256x192-ec54d7f3_20200709.pth as pretrained in the cfg?

Then I will leave it as pretrained='torchvision://resnet50'.

To run the script I used the command:

./mmpose/tools/dist_train.sh ./coco/res50_coco_256x192.py/res50_coco_256x192-ec54d7f3_20200709.pth 2 --work-dir ./theodore_2022-01-27/

Is this command correct? I passed the checkpoint file here.

jin-s13 commented 2 years ago

You should not pass the downloaded model res50_coco_256x192-ec54d7f3_20200709.pth in cfg.

There are actually two different models. To avoid confusion, I will name them (1) the ImageNet pre-trained model 'torchvision://resnet50' and (2) our COCO-trained model 'res50_coco_256x192-ec54d7f3_20200709.pth'.

The config file is exactly the same for training and testing. In the config, just set pretrained='torchvision://resnet50'.

The following commands are used for testing.

./tools/dist_test.sh configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/res50_coco_256x192.py \
    checkpoints/SOME_CHECKPOINT.pth 1 \
    --eval mAP

Please try this first.

jin-s13 commented 2 years ago

In your case, this might be

./tools/dist_test.sh configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/res50_coco_256x192.py ./res50_coco_256x192-ec54d7f3_20200709.pth 1 --eval mAP
jin-s13 commented 2 years ago

https://mmpose.readthedocs.io/en/latest/getting_started.html#inference-with-pre-trained-models

YuktiADY commented 2 years ago

> https://mmpose.readthedocs.io/en/latest/getting_started.html#inference-with-pre-trained-models

Yes, I saw this and then gave the path.

YuktiADY commented 2 years ago

> In your case, this might be
>
> ./tools/dist_test.sh configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/res50_coco_256x192.py ./res50_coco_256x192-ec54d7f3_20200709.pth 1 --eval mAP

Yes, but with a slightly different path, because I didn't place the file in the configs directory:

./mmpose/tools/dist_test.sh ./coco/res50_coco_256x192.py ./res50_coco_256x192-ec54d7f3_20200709.pth 2 --eval mAP

YuktiADY commented 2 years ago

Is it necessary to keep the downloaded pretrained file in the same place as the dataset? I am getting the error that ./res50_coco_256x192-ec54d7f3_20200709.pth cannot be found.

jin-s13 commented 2 years ago

How about using the absolute path?
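
A relative path such as ./res50_coco_256x192-ec54d7f3_20200709.pth is resolved against the directory the command is launched from, not against the dataset or config location. A minimal sketch of how one might print the absolute path that would actually be opened; `describe_checkpoint` is a hypothetical helper built on the standard library only.

```python
import os
from pathlib import Path


def describe_checkpoint(path):
    """Resolve a checkpoint path against the current working directory."""
    resolved = Path(path).expanduser().resolve()
    return resolved, resolved.is_file()


resolved, exists = describe_checkpoint("./res50_coco_256x192-ec54d7f3_20200709.pth")
print(f"cwd:      {os.getcwd()}")
print(f"resolved: {resolved}")
print(f"exists:   {exists}")
```

If `exists` is False, passing the printed absolute location of the real file to the test script avoids any dependence on the launch directory.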

YuktiADY commented 2 years ago

Yes, I gave that:

./mmpose/tools/dist_test.sh ./coco/res50_coco_256x192.py https://download.openmmlab.com/mmpose/top_down/resnet/res50_coco_256x192-ec54d7f3_20200709.pth 2 --eval mAP

YuktiADY commented 2 years ago

Is the validation finishing too quickly? It takes only a couple of minutes to evaluate the 5000 images in the COCO val set. After running the script I got this:

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 104126/104125, 418.3 task/s, elapsed: 249s, ETA: 0s
Loading and preparing results...
DONE (t=3.32s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=9.53s).
Accumulating evaluation results...
DONE (t=0.25s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.718
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.898
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.795
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.646
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.745
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.773
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.937
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.841
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.729
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.837
AP: 0.7176335311367283
AP (L): 0.744909473374687
AP (M): 0.6462446384085971
AP .5: 0.8984441197238628
AP .75: 0.7948567126745394
AR: 0.7730321158690178
AR (L): 0.8366406540319584
AR (M): 0.7292543021032504
AR .5: 0.9370277078085643
AR .75: 0.8408375314861462
INFO:torch.distributed.elastic.agent.server.api:[default] worker group successfully finished. Waiting 300 seconds for other agents to finish.
INFO:torch.distributed.elastic.agent.server.api:Local worker group finished (SUCCEEDED). Waiting 300 seconds for other agents to finish
/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/utils/store.py:71: FutureWarning: This is an experimental API and will be changed in future.
  "This is an experimental API and will be changed in future.", FutureWarning
INFO:torch.distributed.elastic.agent.server.api:Done waiting for other agents. Elapsed: 0.0006189346313476562 seconds
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 0, "group_rank": 0, "worker_id": "2632779", "role": "default", "hostname": "dst-toaster.etit.tu-chemnitz.de", "state": "SUCCEEDED", "total_run_time": 310, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [0], \"role_rank\": [0], \"role_world_size\": [2]}", "agent_restarts": 0}}
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "WORKER", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": 1, "group_rank": 0, "worker_id": "2632780", "role": "default", "hostname": "dst-toaster.etit.tu-chemnitz.de", "state": "SUCCEEDED", "total_run_time": 310, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\", \"local_rank\": [1], \"role_rank\": [1], \"role_world_size\": [2]}", "agent_restarts": 0}}
{"name": "torchelastic.worker.status.SUCCEEDED", "source": "AGENT", "timestamp": 0, "metadata": {"run_id": "none", "global_rank": null, "group_rank": 0, "worker_id": null, "role": "default", "hostname": "dst-toaster.etit.tu-chemnitz.de", "state": "SUCCEEDED", "total_run_time": 310, "rdzv_backend": "static", "raw_error": null, "metadata": "{\"group_world_size\": 1, \"entry_point\": \"python\"}", "agent_restarts": 0}}

jin-s13 commented 2 years ago

The results look good.

YuktiADY commented 2 years ago

> The results look good.

But isn't the evaluation done too fast, with AP around 89%? In the config I made the changes below and set the number of epochs to just 20. I hope that is fine and doesn't affect the results.

    train=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/images/val2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/images/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/images/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

YuktiADY commented 2 years ago

So the evaluation of the pretrained model on the COCO val set now looks fine, and there are no errors. Does that mean it is not an environment issue?

So the error I was getting while evaluating on my custom dataset, 'COCOeval' object has no attribute 'score_key', has to do with something else?

YuktiADY commented 2 years ago

I am getting this error while evaluating on custom dataset.

Loading and preparing results...
DONE (t=0.64s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
Traceback (most recent call last):
  File "./mmpose/tools/train.py", line 170, in <module>
    main()
  File "./mmpose/tools/train.py", line 166, in main
    meta=meta)
  File "/home/yukti/mmpose/mmpose/mmpose/apis/train.py", line 192, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/yukti/mmpose/mmcv/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 267, in after_train_epoch
    self._do_evaluate(runner)
  File "/home/yukti/mmpose/mmcv/mmcv/runner/hooks/evaluation.py", line 505, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/home/yukti/mmpose/mmpose/mmpose/core/evaluation/eval_hooks.py", line 139, in evaluate
    **self.eval_kwargs)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 317, in evaluate
    info_str = self._do_python_keypoint_eval(res_file)
  File "/home/yukti/mmpose/mmpose/mmpose/datasets/datasets/top_down/topdown_coco_dataset.py", line 371, in _do_python_keypoint_eval
    coco_eval.evaluate()
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 257, in evaluate
    for imgId in p.imgIds
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 258, in <dictcomp>
    for catId in catIds}
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in computeOks
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
  File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/xtcocotools/cocoeval.py", line 311, in <listcomp>
    inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')
AttributeError: 'COCOeval' object has no attribute 'score_key'

I wanted to ask about evaluation.py: the variable key_score is used to store the result of self.evaluate.

    def _do_evaluate(self, runner):
        """perform evaluation and save ckpt."""
        results = self.test_fn(runner.model, self.dataloader)
        runner.log_buffer.output['eval_iter_num'] = len(self.dataloader)
        key_score = self.evaluate(runner, results)
        # the key_score may be None so it needs to skip the action to save
        # the best checkpoint
        if self.save_best and key_score:
            self._save_ckpt(runner, key_score)

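
The guard in that snippet relies on Python truthiness, and per the traceback the exception is raised earlier, inside `self.evaluate`, so the `if` is never reached in the failing run. A minimal sketch of the condition in isolation; `should_save_best` is a hypothetical stand-in for the quoted line, not an mmcv function.

```python
def should_save_best(save_best, key_score):
    """Mirror the `if self.save_best and key_score` condition quoted above."""
    return bool(save_best and key_score)


print(should_save_best("AP", 0.718))  # a normal score passes the guard
print(should_save_best("AP", None))   # no score returned: saving is skipped
print(should_save_best("AP", 0.0))    # a score of exactly zero is also falsy
```

The name `key_score` in the hook is therefore unrelated to the `score_key` attribute named in the error message.
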
YuktiADY commented 2 years ago

I am stuck with this error. Is it related to a version issue? Please tell me how to fix it, because I can't proceed with the training.

YuktiADY commented 2 years ago

I have noticed that https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/cocoeval.py uses the score key, not score_key.

Can we change score_key to score in our cocoeval file?

jin-s13 commented 2 years ago

> So the evaluation of the pretrained model on the COCO val set now looks fine, and there are no errors. Does that mean it is not an environment issue?

Yes. The evaluation on the COCO val set looks fine, and the result (0.718 mAP) is correct.

> So the error I was getting while evaluating on my custom dataset, 'COCOeval' object has no attribute 'score_key', has to do with something else?

You have already run the COCO evaluation successfully. So what is the difference between your custom dataset and the COCO dataset? I suggest carefully checking your data annotations: compare coco_annotations/person_keypoints_valid.json with the COCO annotation file and see if there are any differences.

BTW, we are using xtcocotools instead of pycocotools. So please refer to https://github.com/jin-s13/xtcocoapi/blob/master/xtcocotools/cocoeval.py
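
Comparing two annotation files by eye is error-prone, so the comparison suggested above can be sketched in code. This is a minimal sketch under assumptions: both files are COCO-style dicts, `annotation_field_diff` is a hypothetical helper, and the two tiny dicts are made-up stand-ins for files that would really be loaded with `json.load`.

```python
def annotation_field_diff(reference, candidate):
    """Compare top-level sections and per-annotation fields of two COCO-style dicts."""

    def annotation_fields(dataset):
        fields = set()
        for ann in dataset.get("annotations", []):
            fields.update(ann.keys())
        return fields

    return {
        "sections_missing": sorted(set(reference) - set(candidate)),
        "annotation_fields_missing": sorted(
            annotation_fields(reference) - annotation_fields(candidate)
        ),
    }


# Hypothetical stand-ins for the reference (COCO) and custom annotation files.
coco_like = {
    "images": [], "categories": [],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [0, 0, 1, 1],
                     "area": 1.0, "iscrowd": 0, "num_keypoints": 0,
                     "keypoints": [0] * 51, "segmentation": []}],
}
custom = {
    "images": [], "categories": [],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [0, 0, 1, 1],
                     "keypoints": [0] * 51, "num_keypoints": 0}],
}
diff = annotation_field_diff(coco_like, custom)
print(diff)
```

Any field listed under `annotation_fields_missing` is a candidate for closer inspection; whether its absence matters depends on what the evaluation code actually reads.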

YuktiADY commented 2 years ago

COCO annotation:
{"info": {"description": "COCO 2017 Dataset","url": "http://cocodataset.org","version": "1.0","year": 2017,"contributor": "COCO Consortium","date_created": "2017/09/01"},"licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/","id": 1,"name": "Attribution-NonCommercial-ShareAlike License"},{"url": "http://creativecommons.org/licenses/by-nc/2.0/","id": 2,"name": "Attribution-NonCommercial License"},{"url": "http://creativecommons.org/licenses/by-nc-nd/2.0/","id": 3,"name": "Attribution-NonCommercial-NoDerivs License"},{"url": "http://creativecommons.org/licenses/by/2.0/","id": 4,"name": "Attribution License"},{"url": "http://creativecommons.org/licenses/by-sa/2.0/","id": 5,"name": "Attribution-ShareAlike License"},{"url": "http://creativecommons.org/licenses/by-nd/2.0/","id": 6,"name": "Attribution-NoDerivs License"},{"url": "http://flickr.com/commons/usage/","id": 7,"name": "No known copyright restrictions"},{"url": "http://www.usa.gov/copyright.shtml","id": 8,"name": "United States Government Work"}],"images": [{"license": 4,"file_name": "000000397133.jpg","coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg","height": 427,"width": 640,"date_captured": "2013-11-14 17:02:52","flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg","id": 397133},{"license": 1,"file_name": "000000037777.jpg","coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg","height": 230,"width": 352,"date_captured": "2013-11-14 20:55:31","flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg","id": 37777},{"license": 4,"file_name": "000000252219.jpg","coco_url": "http://images.cocodataset.org/val2017/000000252219.jpg","height": 428,"width": 640,"date_captured": "2013-11-14 22:32:02","flickr_url": "http://farm4.staticflickr.com/3446/3232237447_13d84bd0a1_z.jpg","id": 252219},{"license": 1,"file_name": "000000087038.jpg","coco_url": "http://images.cocodataset.org/val2017/000000087038.jpg","height": 
480,"width": 640,"date_captured": "2013-11-14 23:11:37","flickr_url": "http://farm8.staticflickr.com/7355/8825114508_b0fa4d7168_z.jpg","id": 87038},{"license": 6,"file_name": "000000174482.jpg","coco_url": "http://images.cocodataset.org/val2017/000000174482.jpg","height": 388,"width": 640,"date_captured": "2013-11-14 23:16:55","flickr_url": "http://farm8.staticflickr.com/7020/6478877255_242f741dd1_z.jpg","id": 174482}

My annotation . {"info": {"description": "Theodore+", "url": "https://www.tu-chemnitz.de/etit/dst/forschung/comp_vision/theodore/index.php.en", "version": "1.0", "year": "2021", "contributor": "Chair of Digital und Circuit Design", "date_created": "08/20/21"}, "licenses": [{"url": "https://creativecommons.org/licenses/by/4.0/", "id": 1, "name": "Attribution 4.0 International (CC BY 4.0)"}], "categories": [{"supercategory": "person", "id": 1, "name": "person", "keypoints": ["nose", "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder", "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist", "left_hip", "right_hip", "left_knee", "right_knee", "left_ankle", "right_ankle"], "skeleton": [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13], [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]}], "images": [{"file_name": "0000115_img.png", "id": 115, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000139_img.png", "id": 139, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000174_img.png", "id": 174, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000189_img.png", "id": 189, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000197_img.png", "id": 197, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000248_img.png", "id": 248, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000262_img.png", "id": 262, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000288_img.png", "id": 288, "width": 2048, "height": 2048, 
"coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000296_img.png", "id": 296, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000325_img.png", "id": 325, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""}, {"file_name": "0000334_img.png", "id": 334, "width": 2048, "height": 2048, "coco_url": "", "flickr_url": "", "license": 1, "date_captured": ""},

Does num_keypoints make any difference? Is it necessary to give a coco_url path? We can't make changes in cocoeval.py.

The difference between our custom dataset and the COCO dataset is that our images are taken from above with an omnidirectional camera, whereas the COCO images are taken from the front.

jin-s13 commented 2 years ago
  1. num_keypoints is necessary. It is the number of valid keypoints.
  2. We do not need the URL path.

See https://github.com/open-mmlab/mmpose/blob/master/docs/en/tutorials/2_new_dataset.md

Please check both "images" and "annotations" in the json.
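A quick way to run that check is a small script over the JSON. This is only a sketch: the list of required fields and the function name `check_keypoint_annotations` are assumptions made for illustration, not something mmpose or xtcocotools provides.

```python
def check_keypoint_annotations(coco, num_joints=17):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    # Assumed field list for a keypoint annotation; adjust to your dataset.
    required = ("image_id", "category_id", "keypoints", "num_keypoints", "bbox")

    for ann in coco.get("annotations", []):
        ann_id = ann.get("id")
        for field in required:
            if field not in ann:
                problems.append(f"annotation {ann_id}: missing '{field}'")
        # Every annotation should point at an image listed under "images".
        if "image_id" in ann and ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id {ann['image_id']!r}")
        # COCO stores category_id as an integer, not a string such as "1".
        if "category_id" in ann and not isinstance(ann["category_id"], int):
            problems.append(f"annotation {ann_id}: category_id is not an int")
        # Keypoints are a flat [x, y, v, x, y, v, ...] list.
        if "keypoints" in ann and len(ann["keypoints"]) != 3 * num_joints:
            problems.append(f"annotation {ann_id}: expected {3 * num_joints} keypoint values")
    return problems


if __name__ == "__main__":
    import json
    import sys

    with open(sys.argv[1]) as f:
        for problem in check_keypoint_annotations(json.load(f)):
            print(problem)
```

Run it with the annotation file path as the only argument; an empty output means none of these particular checks failed.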

YuktiADY commented 2 years ago
Yes, I have compared the JSON files for both.

In my dataset's JSON file, "images" has height = 2048 and width = 2048 for every image, whereas in COCO every image has a different height and width. Does this have an impact on evaluation? I have also observed some differences in "annotations":

  1. My annotations have no segmentation field, whereas the COCO annotations do. Is it necessary to provide segmentation?
  2. num_keypoints in COCO is 10; in my annotations it is 13. Does it need to match the number given in the config below? channel_cfg = dict( num_output_channels=17, dataset_joints=17, dataset_channel=[ list(range(17)), ], inference_channel=list(range(17)))
  3. category_id in my annotations is given as "1", whereas in COCO it is given as 1. I don't think that makes any difference?
  4. id in my case starts at 0, 1, 2, 3, ..., whereas in COCO it is 183126, 183302, ... I assume these are arbitrary values assigned to the "id" key, or is there some logic behind them? This doesn't make any difference?

Could you please comment on this?

jin-s13 commented 2 years ago

Happy Chinese New Year! Sorry for the late reply, I was on holiday recently.

0. Regarding "The "images" for my dataset in json file has height = 2048 and width = 2048 for all the images": that is fine.

  1. It is not necessary to give segmentation. But if you do not have area and segm, please use https://github.com/open-mmlab/mmpose/blob/dca589a0388530d4e387d1200744ad35dd30768d/mmpose/datasets/datasets/top_down/topdown_aic_dataset.py#L95

  2. num_keypoints is a variable, not a fixed number. Different persons can have different num_keypoints: it is the number of valid keypoints for that person. If a person is occluded, num_keypoints will be smaller than 17. num_output_channels is the maximum number of keypoints a person can have, 17 in COCO.

  3. Not sure about it. Please use category_id = 1.

  4. Yes. The order of the id is not important.
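To make the num_keypoints point concrete: in the COCO keypoint format each keypoint is stored as (x, y, v), where v = 0 means not labeled, and num_keypoints counts the keypoints with v > 0. A minimal sketch of deriving it from the flat keypoints list follows; the helper name `count_labeled_keypoints` is made up for this example.

```python
def count_labeled_keypoints(keypoints):
    """Count keypoints whose visibility flag is > 0.

    `keypoints` is the flat COCO list [x1, y1, v1, x2, y2, v2, ...].
    """
    visibility_flags = keypoints[2::3]  # every third value is a visibility flag
    return sum(1 for v in visibility_flags if v > 0)


# Three keypoints, the second one unlabeled (v = 0).
example = [10, 20, 2, 0, 0, 0, 5, 6, 1]
print(count_labeled_keypoints(example))
```

For a 17-keypoint person the list has 51 values, and the count drops below 17 whenever some joints are not annotated.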

YuktiADY commented 2 years ago

Happy new year! It's OK, no worries. I am actually still stuck on this error, so I had a few doubts regarding the annotations.

  1. Area is there but not segmentation, so I thought of adding segmentation masks to the annotations. Will that be fine? Since I only have area, where do I need to use the function _do_python_keypoint_eval, and why is it required?

  2. I also traced the code and found that the error comes from line 311, in the function computeOks.

    def computeOks(self, imgId, catId):
        p = self.params
        # dimention here should be Nxm
        gts = self._gts[imgId, catId]
        dts = self._dts[imgId, catId]
        inds = np.argsort([-d[self.score_key] for d in dts], kind='mergesort')  # <-- error on this line

But at line 223 we have:

    self._dts[dt['image_id'], dt['category_id']].append(dt)

In our annotations, image_id is an integer but category_id is a string ("1"), whereas in the COCO annotations category_id is an integer (1). So the line above builds its key from two different data types, a string and an integer, which may not work as expected. These category_id and image_id values are then passed to computeOks(self, imgId, catId). This could be one cause of the error, so I think we should pass an integer value for category_id.
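The effect of the type mismatch on such a lookup can be shown in isolation. The sketch below does not use xtcocotools and does not reproduce the AttributeError; it only imitates the (image_id, category_id) grouping from the line quoted above and then normalizes the ids. Both function names are hypothetical.

```python
from collections import defaultdict


def group_by_image_and_category(items):
    """Group dicts under the key (image_id, category_id), like the quoted line."""
    groups = defaultdict(list)
    for item in items:
        groups[item["image_id"], item["category_id"]].append(item)
    return groups


def normalize_category_ids(coco):
    """Convert category ids to int in place in a COCO-style dict."""
    for category in coco.get("categories", []):
        category["id"] = int(category["id"])
    for ann in coco.get("annotations", []):
        ann["category_id"] = int(ann["category_id"])
    return coco


# Ground truth with a string id is stored under (115, "1") ...
gts = group_by_image_and_category([{"image_id": 115, "category_id": "1"}])
# ... so a lookup with the integer id finds nothing.
print(gts.get((115, 1)))    # None
print(gts.get((115, "1")))  # the stored annotation
```

Because "1" and 1 are different dictionary keys in Python, entries stored under one are never found when looking up the other, which is why converting category_id to an integer in the JSON is worth trying first.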

YuktiADY commented 2 years ago

I ran evaluation on the val set of our dataset (with evaluation interval = 1); the results are below. The only thing I changed in the JSON file is category_id, from "1" to 1; I did not add segmentation.

I am getting a traceback too, but I think that's not a problem; it's not an error, right?

    Loading and preparing results...
    DONE (t=0.82s)
    creating index...
    index created!
    Running per image evaluation...
    Evaluate annotation type keypoints
    DONE (t=3.96s).
    Accumulating evaluation results...
    DONE (t=0.17s).
    Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.225
    Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.798
    Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.007
    Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
    Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.229
    Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.303
    Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.870
    Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.064
    Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.032
    Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.309
    2022-02-03 15:19:51,690 - mmpose - INFO - Now best checkpoint is saved as best_AP_epoch_1.pth.
    2022-02-03 15:19:51,690 - mmpose - INFO - Best AP is 0.2247 at 1 epoch.
    2022-02-03 15:19:51,692 - mmpose - INFO - Epoch(val) [1][312] AP: 0.2247, AP .5: 0.7982, AP .75: 0.0066, AP (M): 0.0005, AP (L): 0.2286, AR: 0.3032, AR .5: 0.8701, AR .75: 0.0639, AR (M): 0.0319, AR (L): 0.3087
    2022-02-03 15:27:26,809 - mmpose - INFO - Epoch [2][50/155] lr: 2.043e-04, eta: 8:21:42, time: 9.102, data_time: 8.750, memory: 6691, heatmap_loss: 0.0019, acc_pose: 0.3676, loss: 0.0019
    2022-02-03 15:31:37,542 - mmpose - INFO - Epoch [2][100/155] lr: 2.542e-04, eta: 7:50:48, time: 5.015, data_time: 4.663, memory: 6691, heatmap_loss: 0.0018, acc_pose: 0.4039, loss: 0.0018
    ^CTraceback (most recent call last):
      File "./mmpose/tools/train.py", line 170, in <module>
        main()
      File "./mmpose/tools/train.py", line 166, in main
        meta=meta)
      File "/home/yukti/mmpose/mmpose/mmpose/apis/train.py", line 192, in train_model
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/yukti/mmpose/mmcv/mmcv/runner/epoch_based_runner.py", line 47, in train
        for i, data_batch in enumerate(self.data_loader):
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
        data = self._next_data()
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
        idx, data = self._get_data()
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1142, in _get_data
        success, data = self._try_get_data()
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
        data = self._data_queue.get(timeout=timeout)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/queue.py", line 179, in get
        self.not_empty.wait(remaining)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/threading.py", line 300, in wait
        gotit = waiter.acquire(True, timeout)
    KeyboardInterrupt
    Traceback (most recent call last):
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 173, in <module>
        main()
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launch.py", line 169, in main
        run(args)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/run.py", line 624, in run
        )(*cmd_args)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
        return f(*args, **kwargs)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 238, in launch_agent
        result = agent.run()
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/metrics/api.py", line 125, in wrapper
        result = f(*args, **kwargs)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/agent/server/api.py", line 700, in run
        result = self._invoke_run(role)
      File "/home/yukti/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/distributed/elastic/agent/server/api.py", line 828, in _invoke_run
        time.sleep(monitor_interval)

  1. Now that I am training, I wanted to ask about the evaluation setting in the config. With interval = 1, the results (AP etc.) are evaluated after every epoch, starting from epoch 1. In your existing config interval = 10 is set; does this mean evaluation is done after epoch 10? If so, does that mean that, depending on this interval value, we have to run the training script again and again, e.g. 3 times for 30 epochs?

Please correct me if I am thinking in the wrong direction.

jin-s13 commented 2 years ago
  1. Hi, the evaluation looks good. It seems that changing category_id from "1" to 1 was the key. segmentation is not necessary, but area is used. You can see "area=medium" and "area= large" in the log, right? We use area to assign instances to different size groups.

  2. "But in your existing config the interval = 10 is set, this means after epoch [10] the evaluation will be done ??" Yes. The evaluation will be done every 10 epochs.

  3. "then that means depending on this interval value we have again and again run that script of training and depending on no of epochs for eg 30 so need to run training 3 times ??" No. If we set the total number of epochs to 30 and interval = 10, evaluation is performed after the 10th, 20th, and 30th epochs. The training process is not interrupted.
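As a rough illustration of that schedule, the snippet below lists the epochs after which evaluation would run for a given interval. The config lines mirror the `total_epochs` and `evaluation` settings discussed above, with any other evaluation keys omitted; the helper `evaluated_epochs` is made up for this example and is not part of mmpose or mmcv.

```python
# Config fragment as discussed above (other evaluation keys omitted).
total_epochs = 30
evaluation = dict(interval=10)


def evaluated_epochs(total_epochs, interval):
    """Return the 1-based epochs after which evaluation runs."""
    return [epoch for epoch in range(1, total_epochs + 1) if epoch % interval == 0]


print(evaluated_epochs(total_epochs, evaluation["interval"]))
```

A single training run covers all of these evaluation points, so the script only needs to be launched once.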