Closed by ReshitkoM 3 years ago
This was a bug, and it has been fixed in the latest version of the code. The assertion is now placed under an if condition: https://github.com/openvinotoolkit/mmdetection/blob/ote/mmdet/datasets/pipelines/transforms.py#L1369
Please pull the latest changes and don't forget to update the submodules. If the problem persists, please let us know.
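To illustrate the kind of change described above, here is a minimal sketch of an assertion guarded by an if condition. It is not the code from the linked file: the function name `check_texts_consistency` and the exact check are hypothetical, and only the 'texts' key (from the reported KeyError) and the 'bboxes' key (from the keymap in the logged config) come from this thread.

```python
def check_texts_consistency(results):
    """Hypothetical guard: validate the optional 'texts' field only if present.

    Assumption: 'texts' is an optional per-box annotation that plain
    object-detection datasets do not provide.
    """
    # Without the guard, results['texts'] raises KeyError for detection-only data.
    if 'texts' in results:
        assert len(results['texts']) == len(results['bboxes']), \
            "'texts' and 'bboxes' must have the same length"
    return results


# A detection-only sample has no 'texts' key and passes through unchanged.
sample = {'bboxes': [[0, 0, 10, 10]], 'gt_labels': [1]}
checked = check_texts_consistency(sample)
```

With a guard of this shape, samples that carry no 'texts' entry skip the check entirely, while samples that do carry it are still validated.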
@Ilya-Krylov problem solved, thank you!
Hi. I am trying to add some albumentations augmentations to my object detection training pipeline. I followed these steps closely: https://github.com/openvinotoolkit/training_extensions/tree/develop/models/object_detection/model_templates/custom-object-detection, except that I modified model.py to add albumentations to the training pipeline. I get the error KeyError: 'texts'. My model.py file looks like this:
long traceback
> python train.py --load-weights ${WORK_DIR}/snapshot.pth --train-ann-files ${TRAIN_ANN_FILE} --train-data-roots ${TRAIN_IMG_ROOT} --val-ann-files ${VAL_ANN_FILE} --val-data-roots ${VAL_IMG_ROOT} --save-checkpoints-to ${WORK_DIR}/outputs
WARNING:root:Set of classes that will be used in current training does not equal to classes stored in snapshot: ['vehicle', 'person', 'non-vehicle'] vs []
INFO:root:Commandline: train.py --load-weights /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/tmp/lnpr2/snapshot.pth --train-ann-files /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/annotation_example_train.json --train-data-roots /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/train --val-ann-files /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/annotation_example_val.json --val-data-roots /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/val --save-checkpoints-to /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/tmp/lnpr2/outputs
INFO:root:Training started ...
INFO:root:Training on GPUs started ...
WARNING:root:available_gpu_num < args.gpu_num: 1 < 3
WARNING:root:decreased number of gpu to: 1
/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/venv/lib/python3.8/site-packages/mmcv/utils/registry.py:63: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
warnings.warn( 2021-04-16 10:27:09,448 - mmdet - INFO - Environment info: ------------------------------------------------------------ > sys.platform: linux Python: 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0] CUDA available: True CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.2, V10.2.89 GPU 0: GeForce GTX 1070 GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 PyTorch: 1.5.1 PyTorch compiling details: PyTorch built with: - GCC 7.3 - C++ Version: 201402 - Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc) - OpenMP 201511 (a.k.a. OpenMP 4.5) - NNPACK is enabled - CPU capability usage: AVX2 - CUDA Runtime 10.2 - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37 - CuDNN 7.6.5 - Magma 2.5.2 - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, 
USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, >TorchVision: 0.6.1 OpenCV: 4.5.2-openvino MMCV: 0.6.2 MMDetection: 2.1.0+7fc2e1a MMDetection Compiler: GCC 9.3 MMDetection CUDA Compiler: 10.2 NNCF: 1.6.0 ONNX: 1.8.1 ONNXRuntime: 1.7.0 OpenVINO MO: 2021.3.0-2787-60059f2c755-releases/2021/3 OpenVINO IE: 2.1.2021.3.0-2787-60059f2c755-releases/2021/3 ------------------------------------------------------------ >2021-04-16 10:27:09,448 - mmdet - INFO - Distributed training: True loading annotations into memory... Done (t=0.00s) creating index... index created! 2021-04-16 10:27:10,926 - mmdet - INFO - Config: input_size = 256 image_width = 256 image_height = 256 width_mult = 1.0 model = dict( type='SingleStageDetector', backbone=dict( type='mobilenetv2_w1', out_indices=(4, 5), frozen_stages=-1, norm_eval=False, pretrained=True), neck=None, bbox_head=dict( type='SSDHead', num_classes=3, in_channels=(96, 320), anchor_generator=dict( type='SSDAnchorGeneratorClustered', strides=(16, 32), widths=[[ 11.777124212603184, 27.156337561336, 78.40999192363739, 42.895380750113695 ], [ 63.14842447887146, 115.46481026459409, 213.49145695359056, 138.2245536906473, 234.80364875556538 ]], heights=[[ 14.767053135155848, 45.49947844712648, 45.981733925746965, 98.66743124119586 ], [ 177.24583777391308, 110.80317279721478, 95.85334315816411, 206.86475765838003, 220.30258590019886 ]]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=(0.0, 0.0, 0.0, 0.0), target_stds=(0.1, 0.1, 0.2, 0.2)), depthwise_heads=True, depthwise_heads_activations='relu', loss_balancing=True)) cudnn_benchmark = True train_cfg = dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.4, neg_iou_thr=0.4, min_pos_iou=0.0, ignore_iof_thr=-1, gt_max_assign_all=False), smoothl1_beta=1.0, use_giou=False, use_focal=False, allowed_border=-1, pos_weight=-1, neg_pos_ratio=3, debug=False) test_cfg = dict( nms=dict(type='nms', iou_thr=0.45), 
min_bbox_size=0, score_thr=0.02, max_per_img=200) dataset_type = 'CocoDataset' **albu_train_transforms = [ dict( type='ShiftScaleRotate', shift_limit=0.0625, scale_limit=0.1, rotate_limit=3, interpolation=1, p=0.3), dict( type='RandomBrightnessContrast', brightness_limit=[0.1, 0.3], contrast_limit=[0.1, 0.3], p=0.2), dict( type='OneOf', transforms=[ dict( type='RGBShift', r_shift_limit=10, g_shift_limit=10, b_shift_limit=10, p=1.0), dict( type='HueSaturationValue', hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=1.0) ], p=0.1), dict(type='ImageCompression', quality_lower=75, quality_upper=95, p=0.2), dict(type='ChannelShuffle', p=0.1), dict( type='OneOf', transforms=[ dict(type='Blur', blur_limit=3, p=1.0), dict(type='MedianBlur', blur_limit=3, p=1.0) ], p=0.1)** ] img_norm_cfg = dict(mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True) train_pipeline = [ dict(type='LoadImageFromFile', to_float32=True), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(256, 256), keep_ratio=False), dict( **type='Albu', transforms=[ dict( type='ShiftScaleRotate', shift_limit=0.0625, scale_limit=0.1, rotate_limit=3, interpolation=1, p=0.3), dict( type='RandomBrightnessContrast', brightness_limit=[0.1, 0.3], contrast_limit=[0.1, 0.3], p=0.2), dict( type='OneOf', transforms=[ dict( type='RGBShift', r_shift_limit=10, g_shift_limit=10, b_shift_limit=10, p=1.0), dict( type='HueSaturationValue', hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=1.0) ], p=0.1), dict( type='ImageCompression', quality_lower=75, quality_upper=95, p=0.2), dict(type='ChannelShuffle', p=0.1), dict( type='OneOf', transforms=[ dict(type='Blur', blur_limit=3, p=1.0), dict(type='MedianBlur', blur_limit=3, p=1.0) ], p=0.1)** **], bbox_params=dict( type='BboxParams', format='pascal_voc', label_fields=['gt_labels'], min_visibility=0.0, filter_lost_elements=True), keymap=dict(img='image', gt_bboxes='bboxes'), update_pad_shape=False, skip_img_without_anno=True),** 
dict(type='Normalize', mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True), dict(type='RandomFlip', flip_ratio=0.5), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(256, 256), flip=False, transforms=[ dict(type='Resize', keep_ratio=False), dict( type='Normalize', mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ] data = dict( samples_per_gpu=64, workers_per_gpu=4, train=dict( type='RepeatDataset', times=5, dataset=dict( type='CocoDataset', ann_file= '/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/annotation_example_train.json', img_prefix= '/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/train', pipeline=[ dict(type='LoadImageFromFile', to_float32=True), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(256, 256), keep_ratio=False), **dict( type='Albu', transforms=[ dict( type='ShiftScaleRotate', shift_limit=0.0625, scale_limit=0.1, rotate_limit=3, interpolation=1, p=0.3),** dict( type='RandomBrightnessContrast', brightness_limit=[0.1, 0.3], contrast_limit=[0.1, 0.3], p=0.2), dict( type='OneOf', transforms=[ dict( type='RGBShift', r_shift_limit=10, g_shift_limit=10, b_shift_limit=10, p=1.0), dict( type='HueSaturationValue', hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=1.0) ], p=0.1), dict( type='ImageCompression', quality_lower=75, quality_upper=95, p=0.2), dict(type='ChannelShuffle', p=0.1), dict( type='OneOf', transforms=[ dict(type='Blur', blur_limit=3, p=1.0), dict(type='MedianBlur', blur_limit=3, p=1.0) ], p=0.1) ], bbox_params=dict( type='BboxParams', format='pascal_voc', label_fields=['gt_labels'], min_visibility=0.0, filter_lost_elements=True), keymap=dict(img='image', 
gt_bboxes='bboxes'), update_pad_shape=False, skip_img_without_anno=True), dict( type='Normalize', mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True), dict(type='RandomFlip', flip_ratio=0.5), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ], classes=['vehicle', 'person', 'non-vehicle'])), val=dict( type='CocoDataset', ann_file= '/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/annotation_example_val.json', img_prefix= '/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/../../data/airport/val', test_mode=True, pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(256, 256), flip=False, transforms=[ dict(type='Resize', keep_ratio=False), dict( type='Normalize', mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ], classes=['vehicle', 'person', 'non-vehicle']), test=dict( type='CocoDataset', ann_file='data/airport/annotation_example_val.json', img_prefix='data/airport/val', test_mode=True, pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(256, 256), flip=False, transforms=[ dict(type='Resize', keep_ratio=False), dict( type='Normalize', mean=[0, 0, 0], std=[255, 255, 255], to_rgb=True), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ], classes=['vehicle', 'person', 'non-vehicle'])) optimizer = dict(type='SGD', lr=0.05, momentum=0.9, weight_decay=0.0005) optimizer_config = dict() lr_config = dict( policy='step', warmup='linear', warmup_iters=1200, warmup_ratio=0.3333333333333333, step=[8, 11, 13]) checkpoint_config = dict(interval=1) log_config = dict( interval=10, hooks=[dict(type='TextLoggerHook'), dict(type='TensorboardLoggerHook')]) total_epochs = 15 dist_params = dict(backend='nccl') log_level = 'INFO' work_dir = 
'/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/tmp/lnpr2/outputs' load_from = '/home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/tmp/lnpr2/snapshot.pth' resume_from = '' workflow = [('train', 1)] gpu_ids = range(0, 1) >2021-04-16 10:27:10,994 - mmdet - WARNING - Decreased samples_per_gpu to: 50 because of dataset length: 50 and gpus number: 1 The model and loaded state dict do not match exactly >size mismatch for bbox_head.cls_convs.0.3.weight: copying a param with shape torch.Size([324, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([16, 96, 1, 1]). >size mismatch for bbox_head.cls_convs.0.3.bias: copying a param with shape torch.Size([324]) from checkpoint, the shape in current model is torch.Size([16]). >size mismatch for bbox_head.cls_convs.1.3.weight: copying a param with shape torch.Size([405, 320, 1, 1]) from checkpoint, the shape in current model is torch.Size([20, 320, 1, 1]). >size mismatch for bbox_head.cls_convs.1.3.bias: copying a param with shape torch.Size([405]) from checkpoint, the shape in >current model is torch.Size([20]). >loading annotations into memory... Done (t=0.00s) creating index... index created! 2021-04-16 10:27:13,589 - mmdet - INFO - Start running, host: user@user-MS-7B22, work_dir: /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/tmp/lnpr2/outputs 2021-04-16 10:27:13,589 - mmdet - INFO - workflow: [('train', 1)], max: 15 epochs /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/venv/lib/python3.8/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:572: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations (np.object, string), /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/venv/lib/python3.8/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:573: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations (np.bool, bool), /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/venv/lib/python3.8/site-packages/tensorboard/util/tensor_util.py:113: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations np.object: SlowAppendObjectArrayToTensorProto, /home/user/Parking_project/0_openvino_train/training_extensions/models/object_detection/venv/lib/python3.8/site-packages/tensorboard/util/tensor_util.py:114: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  np.bool: SlowAppendBoolArrayToTensorProto,
Traceback (most recent call last):
  File "/home/user/Parking_project/0_openvino_train/training_extensions/external/mmdetection/tools/train.py", line 307, in

The error is deep in the albumentations code, so I don't know whether the issue is in my code or not.
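For readability, the Albu step from the logged config above can be laid out as follows. All values are copied from that config dump. The variable name `albu_step` is hypothetical and used only for illustration, and this is not a reproduction of the poster's model.py, which is not shown in the thread.

```python
# Augmentations as they appear under albu_train_transforms in the logged config.
albu_train_transforms = [
    dict(type='ShiftScaleRotate', shift_limit=0.0625, scale_limit=0.1,
         rotate_limit=3, interpolation=1, p=0.3),
    dict(type='RandomBrightnessContrast', brightness_limit=[0.1, 0.3],
         contrast_limit=[0.1, 0.3], p=0.2),
    dict(type='OneOf',
         transforms=[
             dict(type='RGBShift', r_shift_limit=10, g_shift_limit=10,
                  b_shift_limit=10, p=1.0),
             dict(type='HueSaturationValue', hue_shift_limit=20,
                  sat_shift_limit=30, val_shift_limit=20, p=1.0),
         ],
         p=0.1),
    dict(type='ImageCompression', quality_lower=75, quality_upper=95, p=0.2),
    dict(type='ChannelShuffle', p=0.1),
    dict(type='OneOf',
         transforms=[
             dict(type='Blur', blur_limit=3, p=1.0),
             dict(type='MedianBlur', blur_limit=3, p=1.0),
         ],
         p=0.1),
]

# The pipeline entry that wraps them (hypothetical variable name).
# In the logged train_pipeline it sits between 'Resize' and 'Normalize'.
albu_step = dict(
    type='Albu',
    transforms=albu_train_transforms,
    bbox_params=dict(
        type='BboxParams',
        format='pascal_voc',
        label_fields=['gt_labels'],
        min_visibility=0.0,
        filter_lost_elements=True),
    keymap=dict(img='image', gt_bboxes='bboxes'),
    update_pad_shape=False,
    skip_img_without_anno=True,
)
```

Note that the keymap only maps img and gt_bboxes, and label_fields only lists gt_labels, so nothing in this configuration refers to a 'texts' field.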