open-mmlab / mmyolo

OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.
https://mmyolo.readthedocs.io/zh_CN/dev/
GNU General Public License v3.0

Evaluation: DefaultCPUAllocator, Cannot allocate memory #498

Closed by yzbx 1 year ago

yzbx commented 1 year ago

Prerequisite

🐞 Describe the bug

02/01 00:08:50 - mmengine - INFO - Epoch(val)  [15][ 9900/10000]    eta: 0:00:03  time: 0.0330  data_time: 0.0096  memory: 295                                                                               
02/01 00:08:51 - mmengine - INFO - Epoch(val)  [15][ 9950/10000]    eta: 0:00:01  time: 0.0338  data_time: 0.0002  memory: 2347                                                                              
02/01 00:08:54 - mmengine - INFO - Epoch(val)  [15][10000/10000]    eta: 0:00:00  time: 0.0470  data_time: 0.0101  memory: 2346                                                                              
02/01 00:16:47 - mmengine - INFO - Evaluating bbox...                                                                                                                                                        
Loading and preparing results...                                                                                                                                                                             
DONE (t=98.38s)                                                                                                                                                                                              
creating index...                                                                                                                                                                                            
index created!                                                                                                                                                                                               
Running per image evaluation...                                                                                                                                                                              
Evaluate annotation type *bbox*                                                                                                                                                                              
[E ProcessGroupNCCL.cpp:719] [Rank 4] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808698 milliseconds before timing out.         
[E ProcessGroupNCCL.cpp:719] [Rank 1] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808687 milliseconds before timing out.         
[E ProcessGroupNCCL.cpp:719] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808685 milliseconds before timing out.         
[E ProcessGroupNCCL.cpp:719] [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808732 milliseconds before timing out.         
[E ProcessGroupNCCL.cpp:719] [Rank 5] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808753 milliseconds before timing out.         
[E ProcessGroupNCCL.cpp:719] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1808873 milliseconds before timing out.         
Traceback (most recent call last):
  File "tools/train.py", line 116, in <module>
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/loops.py", line 350, in run
    _results = metric.evaluate(size)
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    _results = metric.evaluate(size)
  File "/opt/conda/lib/python3.8/site-packages/mmengine/evaluator/metric.py", line 121, in evaluate
    torch_dist.broadcast_object_list(data, src, group)
[enforce fail at alloc_cpu.cpp:73] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 4253849738992253942 bytes. Error code 12 (Cannot allocate memory)
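
To put the numbers in the log above into readable units, here is a minimal sketch in plain Python (standard library only). It parses one watchdog line and the allocator line, both copied from the log, and converts the timeout to minutes and the requested allocation to EiB. The helper names `parse_watchdog` and `parse_alloc_eib` are hypothetical and are not part of MMEngine or PyTorch.

```python
import re

# Lines copied verbatim from the log above.
WATCHDOG_LINE = (
    "[E ProcessGroupNCCL.cpp:719] [Rank 4] Watchdog caught collective operation "
    "timeout: WorkNCCL(SeqNum=15926398, OpType=BROADCAST, Timeout(ms)=1800000) "
    "ran for 1808698 milliseconds before timing out."
)
ALLOC_LINE = (
    "[enforce fail at alloc_cpu.cpp:73] . DefaultCPUAllocator: can't allocate "
    "memory: you tried to allocate 4253849738992253942 bytes. "
    "Error code 12 (Cannot allocate memory)"
)

WATCHDOG_RE = re.compile(
    r"\[Rank (?P<rank>\d+)\].*OpType=(?P<op>\w+), "
    r"Timeout\(ms\)=(?P<timeout_ms>\d+)\) ran for (?P<elapsed_ms>\d+) milliseconds"
)
ALLOC_RE = re.compile(r"tried to allocate (?P<nbytes>\d+) bytes")


def parse_watchdog(line):
    """Return rank, op type, and timeout/elapsed time in minutes, or None."""
    m = WATCHDOG_RE.search(line)
    if m is None:
        return None
    return {
        "rank": int(m.group("rank")),
        "op": m.group("op"),
        "timeout_min": int(m.group("timeout_ms")) / 60_000,
        "elapsed_min": int(m.group("elapsed_ms")) / 60_000,
    }


def parse_alloc_eib(line):
    """Return the requested allocation size in EiB (2**60 bytes), or None."""
    m = ALLOC_RE.search(line)
    if m is None:
        return None
    return int(m.group("nbytes")) / 2**60


if __name__ == "__main__":
    print(parse_watchdog(WATCHDOG_LINE))
    print(f"requested: {parse_alloc_eib(ALLOC_LINE):.2f} EiB")
```

Read this way, the BROADCAST hit a 30-minute timeout on the listed ranks, and the failed allocation asked for several exbibytes, far more than any machine's RAM. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the size is not a real payload size but a value read after the collective had already gone wrong.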

Environment

sys.platform: linux
Python: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce GTX 1080 Ti
CUDA_HOME: None
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.11.0
PyTorch compiling details: PyTorch built with:

TorchVision: 0.12.0
OpenCV: 4.7.0
MMEngine: 0.5.0
MMCV: 2.0.0rc3
MMDetection: 3.0.0rc5
MMYOLO: 0.4.0+

Additional information

default_scope = 'mmyolo'                                                                                                                                                                                     
default_hooks = dict(                                                                                                                                                                                        
    timer=dict(type='IterTimerHook'),                                                                                                                                                                        
    logger=dict(type='LoggerHook', interval=50),                                                                                                                                                             
    param_scheduler=dict(                                                                                                                                                                                    
        type='YOLOv5ParamSchedulerHook',                                                                                                                                                                     
        scheduler_type='linear',                                                                                                                                                                             
        lr_factor=0.01,                                                                                                                                                                                      
        max_epochs=100),                                                                                                                                                                                     
    checkpoint=dict(                                                                                                                                                                                         
        type='CheckpointHook', interval=5, save_best='auto', max_keep_ckpts=3),                                                                                                                              
    sampler_seed=dict(type='DistSamplerSeedHook'),                                                                                                                                                           
    visualization=dict(type='mmdet.DetVisualizationHook'))                                                                                                                                                   
env_cfg = dict(                                                                                                                                                                                              
    cudnn_benchmark=True,                                                                                                                                                                                    
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),                                                                                                                                               
    dist_cfg=dict(backend='nccl'))                                                                                                                                                                           
vis_backends = [dict(type='LocalVisBackend')]                                                                                                                                                                
visualizer = dict(                                                                                                                                                                                           
    type='mmdet.DetLocalVisualizer',                                                                                                                                                                         
    vis_backends=[dict(type='LocalVisBackend')],                                                                                                                                                             
    name='visualizer')                                                                                                                                                                                       
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)                                                                                                                                     
log_level = 'INFO'                                                                                                                                                                                           
load_from = None                                                                                                                                                                                             
resume = True                                                                                                                                                                                                
file_client_args = dict(backend='disk')                                                                                                                                                                      
data_root = '/data/wangjiaxin/object365/'                                                                                                                                                                    
dataset_type = 'YOLOv5CocoDataset'                                                                                                                                                                           
num_classes = 365                                                                                                                                                                                            
img_scale = (640, 640)                                                                                                                                                                                       
deepen_factor = 0.33                                                                                                                                                                                         
widen_factor = 0.5                                                                                                                                                                                           
max_epochs = 100                                                                                                                                                                                             
save_epoch_intervals = 5                                                                                                                                                                                     
train_batch_size_per_gpu = 8
train_num_workers = 8                                                                                                                                                                                        
val_batch_size_per_gpu = 1                                                                                                                                                                                   
val_num_workers = 2                                                                                                                                                                                          
persistent_workers = True                                                                                                                                                                                    
base_lr = 0.05                                                                                                                                                                                               
metainfo = dict(classes=[                                                                                                                                                                                    
    'Person', 'Sneakers', 'Chair', 'Other Shoes', 'Hat', 'Car', 'Lamp',                                                                                                                                      
    'Glasses', 'Bottle', 'Desk', 'Cup', 'Street Lights', 'Cabinet/shelf',                                                                                                                                    
    'Handbag/Satchel', 'Bracelet', 'Plate', 'Picture/Frame', 'Helmet', 'Book', ... ])
batch_shapes_cfg = dict(                                                                                                                                                                                     
    type='BatchShapePolicy',                                                                                                                                                                                 
    batch_size=1,                                                                                                                                                                                            
    img_size=640,                                                                                                                                                                                            
    size_divisor=32,                                                                                                                                                                                         
    extra_pad_ratio=0.5)                                                                                                                                                                                     
anchors = [[(10, 13), (16, 30), (33, 23)], [(30, 61), (62, 45), (59, 119)],                                                                                                                                  
           [(116, 90), (156, 198), (373, 326)]]                                                                                                                                                              
strides = [8, 16, 32]                                                                                                                                                                                        
num_det_layers = 3                                                                                                                                                                                           
model = dict(                                                                                                                                                                                                
    type='YOLODetector',                                                                                                                                                                                     
    data_preprocessor=dict(                                                                                                                                                                                  
        type='mmdet.DetDataPreprocessor',                                                                                                                                                                    
        mean=[0.0, 0.0, 0.0],                                                                                                                                                                                
        std=[255.0, 255.0, 255.0],                                                                                                                                                                           
        bgr_to_rgb=True),                                                                                                                                                                                    
    backbone=dict(                                                                                                                                                                                           
        type='YOLOv5CSPDarknet',                                                                                                                                                                             
        deepen_factor=0.33,                                                                                                                                                                                  
        widen_factor=0.5,                                                                                                                                                                                    
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),                                                                                                                                                  
        act_cfg=dict(type='SiLU', inplace=True)),                                                                                                                                                            
    neck=dict(                                                                                                                                                                                               
        type='YOLOv5PAFPN',                                                                                                                                                                                  
        deepen_factor=0.33,                                                                                                                                                                                  
        widen_factor=0.5,                                                                                                                                                                                    
        in_channels=[256, 512, 1024],                                                                                                                                                                        
        out_channels=[256, 512, 1024],                                                                                                                                                                       
        num_csp_blocks=3,                                                                                                                                                                                    
        norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),                                                                                                                                                  
        act_cfg=dict(type='SiLU', inplace=True)),        
    bbox_head=dict(                                                                                                                                                                                          
        type='YOLOv5Head',                                                                                                                                                                                   
        head_module=dict(                                                                                                                                                                                    
            type='YOLOv5HeadModule',                                                                                                                                                                         
            num_classes=365,                                                                                                                                                                                 
            in_channels=[256, 512, 1024],                                                                                                                                                                    
            widen_factor=0.5,                                                                                                                                                                                
            featmap_strides=[8, 16, 32],                                                                                                                                                                     
            num_base_priors=3),                                                                                                                                                                              
        prior_generator=dict(                                                                                                                                                                                
            type='mmdet.YOLOAnchorGenerator',                                                                                                                                                                
            base_sizes=[[(10, 13), (16, 30), (33, 23)],                                                                                                                                                      
                        [(30, 61), (62, 45), (59, 119)],                                                                                                                                                     
                        [(116, 90), (156, 198), (373, 326)]],                                                                                                                                                
            strides=[8, 16, 32]),                                                                                                                                                                            
        loss_cls=dict(                                                                                                                                                                                       
            type='mmdet.CrossEntropyLoss',                                                                                                                                                                   
            use_sigmoid=True,                                                                                                                                                                                
            reduction='mean',                                                                                                                                                                                
            loss_weight=2.28125),                                                                                                                                                                            
        loss_bbox=dict(                                                                                                                                                                                      
            type='IoULoss',                                                                                                                                                                                  
            iou_mode='ciou',                                                                                                                                                                                 
            bbox_format='xywh',                                                                                                                                                                              
            eps=1e-07,                                                                                                                                                                                       
            reduction='mean',                                                                                                                                                                                
            loss_weight=0.05,                                                                                                                                                                                
            return_iou=True),                                                                                                                                                                                
        loss_obj=dict(                                                                                                                                                                                       
            type='mmdet.CrossEntropyLoss',                                                                                                                                                                   
            use_sigmoid=True,                                                                                                                                                                                
            reduction='mean',                                                                                                                                                                                
            loss_weight=1.0),                                                                                                                                                                                
        prior_match_thr=4.0,                                                                                                                                                                                 
        obj_level_weights=[4.0, 1.0, 0.4]),                                                                                                                                                                  
    test_cfg=dict(                                                                                                                                                                                           
        multi_label=True,                                                                                                                                                                                    
        nms_pre=30000,                                                                                                                                                                                       
        score_thr=0.001,                                                                                                                                                                                     
        nms=dict(type='nms', iou_threshold=0.65),                                                                                                                                                            
        max_per_img=300))                                                                                                                                                                                    
albu_train_transforms = [                                                                                                                                                                                    
    dict(type='Blur', p=0.01),                                                                                                                                                                               
    dict(type='MedianBlur', p=0.01),                                                                                                                                                                         
    dict(type='ToGray', p=0.01),                                                                                                                                                                             
    dict(type='CLAHE', p=0.01)                                                                                                                                                                               
]                                                                                                                                                                                                            
pre_transform = [                                                                                                                                                                                            
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),                                                                                                                                   
    dict(type='LoadAnnotations', with_bbox=True)                                                                                                                                                             
]                                                        
train_pipeline = [                                                                                                                                                                                           
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),                                                                                                                                   
    dict(type='LoadAnnotations', with_bbox=True),                                                                                                                                                            
    dict(                                                                                                                                                                                                    
        type='Mosaic',                                                                                                                                                                                       
        img_scale=(640, 640),                                                                                                                                                                                
        pad_val=114.0,                                                                                                                                                                                       
        pre_transform=[                                                                                                                                                                                      
            dict(                                                                                                                                                                                            
                type='LoadImageFromFile',                                                                                                                                                                    
                file_client_args=dict(backend='disk')),                                                                                                                                                      
            dict(type='LoadAnnotations', with_bbox=True)                                                                                                                                                     
        ]),                                                                                                                                                                                                  
    dict(                                                                                                                                                                                                    
        type='YOLOv5RandomAffine',                                                                                                                                                                           
        max_rotate_degree=0.0,                                                                                                                                                                               
        max_shear_degree=0.0,                                                                                                                                                                                
        scaling_ratio_range=(0.5, 1.5),                                                                                                                                                                      
        border=(-320, -320),                                                                                                                                                                                 
        border_val=(114, 114, 114)),                                                                                                                                                                         
    dict(                                                                                                                                                                                                    
        type='mmdet.Albu',                                                                                                                                                                                   
        transforms=[                                                                                                                                                                                         
            dict(type='Blur', p=0.01),                                                                                                                                                                       
            dict(type='MedianBlur', p=0.01),                                                                                                                                                                 
            dict(type='ToGray', p=0.01),                                                                                                                                                                     
            dict(type='CLAHE', p=0.01)                                                                                                                                                                       
        ],                                                                                                                                                                                                   
        bbox_params=dict(                                                                                                                                                                                    
            type='BboxParams',                                                                                                                                                                               
            format='pascal_voc',                                                                                                                                                                             
            label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),                                                                                                                                           
        keymap=dict(img='image', gt_bboxes='bboxes')),                                                                                                                                                       
    dict(type='YOLOv5HSVRandomAug'),                                                                                                                                                                         
    dict(type='mmdet.RandomFlip', prob=0.5),                                                                                                                                                                 
    dict(                                                                                                                                                                                                    
        type='mmdet.PackDetInputs',                                                                                                                                                                          
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',                                                                                                                                   
                   'flip_direction'))                                                                                                                                                                        
]                                                                                                                                                                                                            
train_dataloader = dict(                                                                                                                                                                                     
    batch_size=8,                                                                                                                                                                                            
    num_workers=8,                                                                                                                                                                                           
    persistent_workers=True,                                                                                                                                                                                 
    pin_memory=True,                                                                                                                                                                                         
    sampler=dict(type='DefaultSampler', shuffle=True),                                                                                                                                                       
    dataset=dict(                                                                                                                                                                                            
        type='YOLOv5CocoDataset',                                                                                                                                                                            
        data_root='/data/wangjiaxin/object365/',                                                                                                                                                             
        ann_file='yolov5_objv2_train.json',                                                                                                                                                                  
        data_prefix=dict(img='images/train/'),       
        metainfo=dict(classes=[                                                                                                                                                                              
            'Person', 'Sneakers', 'Chair', 'Other Shoes', 'Hat', 'Car', 'Lamp',                                                                                                                              
            'Glasses', 'Bottle', 'Desk', 'Cup', 'Street Lights',                                                                                                                                             
            'Cabinet/shelf', 'Handbag/Satchel', 'Bracelet', 'Plate', ... ]),
        filter_cfg=dict(filter_empty_gt=False, min_size=32),                                                                                                                                                 
        pipeline=[                                                                                                                                                                                           
            dict(                                                                                                                                                                                            
                type='LoadImageFromFile',                                                                                                                                                                    
                file_client_args=dict(backend='disk')),                                                                                                                                                      
            dict(type='LoadAnnotations', with_bbox=True),                                                                                                                                                    
            dict(                                                                                                                                                                                            
                type='Mosaic',                                                                                                                                                                               
                img_scale=(640, 640),                                                                                                                                                                        
                pad_val=114.0,                                                                                                                                                                               
                pre_transform=[                                                                                                                                                                              
                    dict(                                                                                                                                                                                    
                        type='LoadImageFromFile',                                                                                                                                                            
                        file_client_args=dict(backend='disk')),                                                                                                                                              
                    dict(type='LoadAnnotations', with_bbox=True)                                                                                                                                             
                ]),                                                                                                                                                                                          
            dict(                                                                                                                                                                                            
                type='YOLOv5RandomAffine',                                                                                                                                                                   
                max_rotate_degree=0.0,                                                                                                                                                                       
                max_shear_degree=0.0,                                                                                                                                                                        
                scaling_ratio_range=(0.5, 1.5),                                                                                                                                                              
                border=(-320, -320),                                                                                                                                                                         
                border_val=(114, 114, 114)),                                                                                                                                                                 
            dict(                                                                                                                                                                                            
                type='mmdet.Albu',                                                                                                                                                                           
                transforms=[                                                                                                                                                                                 
                    dict(type='Blur', p=0.01),                                                                                                                                                               
                    dict(type='MedianBlur', p=0.01),    
                    dict(type='ToGray', p=0.01),                                                                                                                                                             
                    dict(type='CLAHE', p=0.01)                                                                                                                                                               
                ],                                                                                                                                                                                           
                bbox_params=dict(                                                                                                                                                                            
                    type='BboxParams',                                                                                                                                                                       
                    format='pascal_voc',                                                                                                                                                                     
                    label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),                                                                                                                                   
                keymap=dict(img='image', gt_bboxes='bboxes')),                                                                                                                                               
            dict(type='YOLOv5HSVRandomAug'),                                                                                                                                                                 
            dict(type='mmdet.RandomFlip', prob=0.5),                                                                                                                                                         
            dict(                                                                                                                                                                                            
                type='mmdet.PackDetInputs',                                                                                                                                                                  
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',                                                                                                                                   
                           'flip', 'flip_direction'))                                                                                                                                                        
        ]))                                                                                                                                                                                                  
test_pipeline = [                                                                                                                                                                                            
    dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),                                                                                                                                   
    dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),                                                                                                                                                    
    dict(                                                                                                                                                                                                    
        type='LetterResize',                                                                                                                                                                                 
        scale=(640, 640),                                                                                                                                                                                    
        allow_scale_up=False,                                                                                                                                                                                
        pad_val=dict(img=114)),                                                                                                                                                                              
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),                                                                                                                                           
    dict(                                                                                                                                                                                                    
        type='mmdet.PackDetInputs',                                                                                                                                                                          
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',                                                                                                                                           
                   'scale_factor', 'pad_param'))                                                                                                                                                             
]                                                                                                                                                                                                            
val_dataloader = dict(                                                                                                                                                                                       
    batch_size=1,                                                                                                                                                                                            
    num_workers=2,                                                                                                                                                                                           
    persistent_workers=True,                                                                                                                                                                                 
    pin_memory=True,                                                                                                                                                                                         
    drop_last=False,                                                                                                                                                                                         
    sampler=dict(type='DefaultSampler', shuffle=False),                                                                                                                                                      
    dataset=dict(                                                                                                                                                                                            
        type='YOLOv5CocoDataset',                                                                                                                                                                            
        data_root='/data/wangjiaxin/object365/',                                                                                                                                                             
        test_mode=True,                                                                                                                                                                                      
        data_prefix=dict(img='images/val/'),                                                                                                                                                                 
        metainfo=dict(classes=[                                                                                                                                                                              
            'Person', 'Sneakers', 'Chair', 'Other Shoes', 'Hat', 'Car', 'Lamp',                                                                                                                              
            'Glasses', 'Bottle', 'Desk', 'Cup', 'Street Lights',                                                                                                                                             
            'Cabinet/shelf', 'Handbag/Satchel', 'Bracelet', 'Plate', ... ]),
        ann_file='yolov5_objv2_val.json',                                                                                                                                                                    
        pipeline=[                                                                                                                                                                                           
            dict(                                                                                                                                                                                            
                type='LoadImageFromFile',                                                                                                                                                                    
                file_client_args=dict(backend='disk')),                                                                                                                                                      
            dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),                                                                                                                                            
            dict(                                                                                                                                                                                            
                type='LetterResize',                                                                                                                                                                         
                scale=(640, 640),                                                                                                                                                                            
                allow_scale_up=False,                                                                                                                                                                        
                pad_val=dict(img=114)),                                                                                                                                                                      
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),                                                                                                                                   
            dict(                                                                                                                                                                                            
                type='mmdet.PackDetInputs',                                                                                                                                                                  
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',                                                                                                                                   
                           'scale_factor', 'pad_param'))                                                                                                                                                     
        ],                                                                                                                                                                                                   
        batch_shapes_cfg=dict(                                                                                                                                                                               
            type='BatchShapePolicy',                                                                                                                                                                         
            batch_size=1,                                                                                                                                                                                    
            img_size=640,                                                                                                                                                                                    
            size_divisor=32,                                                                                                                                                                                 
            extra_pad_ratio=0.5)))                                                                                                                                                                           
test_dataloader = dict(                                                                                                                                                                                      
    batch_size=1,                                                                                                                                                                                            
    num_workers=2,                                                                                                                                                                                           
    persistent_workers=True,                                                                                                                                                                                 
    pin_memory=True,                                                                                                                                                                                         
    drop_last=False,                                                                                                                                                                                         
    sampler=dict(type='DefaultSampler', shuffle=False),                                                                                                                                                      
    dataset=dict(                                                                                                                                                                                            
        type='YOLOv5CocoDataset',                                                                                                                                                                            
        data_root='/data/wangjiaxin/object365/',                                                                                                                                                             
        test_mode=True,                                                                                                                                                                                      
        data_prefix=dict(img='images/val/'),                                                                                                                                                                 
        metainfo=dict(classes=[                                                                                                                                                                              
            'Person', 'Sneakers', 'Chair', 'Other Shoes', 'Hat', 'Car', 'Lamp',                                                                                                                              
            'Glasses', 'Bottle', 'Desk', 'Cup', 'Street Lights', ... ]),
        ann_file='yolov5_objv2_val.json',                                                                                                                                                                    
        pipeline=[                                                                                                                                                                                           
            dict(                                                                                                                                                                                            
                type='LoadImageFromFile',                                                                                                                                                                    
                file_client_args=dict(backend='disk')),                                                                                                                                                      
            dict(type='YOLOv5KeepRatioResize', scale=(640, 640)),                                                                                                                                            
            dict(                                                                                                                                                                                            
                type='LetterResize',                                                                                                                                                                         
                scale=(640, 640),                                                                                                                                                                            
                allow_scale_up=False,                                                                                                                                                                        
                pad_val=dict(img=114)),                                                                                                                                                                      
            dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),                                                                                                                                   
            dict(                                                                                                                                                                                            
                type='mmdet.PackDetInputs',                                                                                                                                                                  
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',                                                                                                                                   
                           'scale_factor', 'pad_param'))                                                                                                                                                     
        ],                                                                                                                                                                                                   
        batch_shapes_cfg=dict(                                                                                                                                                                               
            type='BatchShapePolicy',                                                                                                                                                                         
            batch_size=1,                                                                                                                                                                                    
            img_size=640,                                                                                                                                                                                    
            size_divisor=32,                                                                                                                                                                                 
            extra_pad_ratio=0.5)))                                                                                                                                                                           
param_scheduler = None                                                                                                                                                                                       
optim_wrapper = dict(                                                                                                                                                                                        
    type='OptimWrapper',                                                                                                                                                                                     
    optimizer=dict(                                                                                                                                                                                          
        type='SGD',                                                                                                                                                                                          
        lr=0.05,                                                                                                                                                                                             
        momentum=0.937,                                                                                                                                                                                      
        weight_decay=0.0005,       
        nesterov=True,                                                                                                                                                                                       
        batch_size_per_gpu=8),                                                                                                                                                                               
    constructor='YOLOv5OptimizerConstructor')                                                                                                                                                                
custom_hooks = [                                                                                                                                                                                             
    dict(                                                                                                                                                                                                    
        type='EMAHook',                                                                                                                                                                                      
        ema_type='ExpMomentumEMA',                                                                                                                                                                           
        momentum=0.0001,                                                                                                                                                                                     
        update_buffers=True,                                                                                                                                                                                 
        strict_load=False,                                                                                                                                                                                   
        priority=49)                                                                                                                                                                                         
]                                                                                                                                                                                                            
val_evaluator = dict(                                                                                                                                                                                        
    type='mmdet.CocoMetric',                                                                                                                                                                                 
    proposal_nums=(100, 1, 10),                                                                                                                                                                              
    ann_file='/data/wangjiaxin/object365/yolov5_objv2_val.json',                                                                                                                                             
    metric='bbox')                                                                                                                                                                                           
test_evaluator = dict(                                                                                                                                                                                       
    type='mmdet.CocoMetric',                                                                                                                                                                                 
    proposal_nums=(100, 1, 10),                                                                                                                                                                              
    ann_file='/data/wangjiaxin/object365/yolov5_objv2_val.json',                                                                                                                                             
    metric='bbox')                                                                                                                                                                                           
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=100, val_interval=5)                                                                                                                                 
val_cfg = dict(type='ValLoop')                                                                                                                                                                               
test_cfg = dict(type='TestLoop')                                                                                                                                                                             
launcher = 'pytorch'                                                                                                                                                                                         
work_dir = './work_dirs/yolov5_s-v61_syncbn_8xb16-100e_object365v2'

hhaAndroid commented 1 year ago

@yzbx Thank you very much for your feedback. We will look into it, and I'll let you know if there's any progress.

hhaAndroid commented 1 year ago

@yzbx This happens because, in multi-GPU evaluation, the intermediate predictions stored in self.results on each GPU are gathered onto the CPU. Your evaluation dataset is relatively large, so the gathered results exceed the available CPU memory. A simple workaround is to keep fewer predictions per image by raising score_thr and lowering max_per_img in the model's test_cfg, for example:

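model = dict(
    # This is the post-processing test_cfg inside the model,
    # not the top-level test_cfg = dict(type='TestLoop').
    test_cfg=dict(
        multi_label=True,
        nms_pre=30000,
        score_thr=0.1,  # raised from 0.001
        nms=dict(type='nms', iou_threshold=0.65),
        max_per_img=100))  # lowered from 300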
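
To get a rough feel for why these two parameters matter, here is a minimal sketch in plain Python. It is not mmyolo's actual post-processing: `upper_bound_detections` and `keep_detections` are hypothetical helpers written only for illustration, and the image count of 80,000 is an assumed value that you should replace with the size of your own validation set.

```python
def upper_bound_detections(num_images: int, max_per_img: int) -> int:
    """Worst-case number of boxes that can reach the evaluator."""
    return num_images * max_per_img


def keep_detections(scores, score_thr, max_per_img):
    """Simplified stand-in for the filtering step (hypothetical helper).

    Drops scores that are not above score_thr, then keeps at most
    max_per_img of the highest remaining scores.
    """
    kept = sorted((s for s in scores if s > score_thr), reverse=True)
    return kept[:max_per_img]


# Assumed validation set size; substitute your own image count.
num_images = 80_000

before = upper_bound_detections(num_images, max_per_img=300)
after = upper_bound_detections(num_images, max_per_img=100)
print(f"worst case before: {before:,} boxes, after: {after:,} boxes")

# A higher score_thr also removes low-confidence boxes per image.
example_scores = [0.9, 0.05, 0.3, 0.002]
print(keep_detections(example_scores, score_thr=0.001, max_per_img=300))
print(keep_detections(example_scores, score_thr=0.1, max_per_img=100))
```

Lowering max_per_img caps the worst case, while raising score_thr usually reduces the typical case, since most boxes that survive a 0.001 threshold have very low confidence. Note that a higher score_thr can lower the reported mAP slightly, so treat the result as an approximation.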
yzbx commented 1 year ago

@hhaAndroid thanks.

hhaAndroid commented 1 year ago

@yzbx Another option is discussed in https://github.com/ultralytics/yolov3/issues/796