nobody-cheng opened this issue 1 year ago
Similar problem here. Have you resolved it? @nobody-cheng
I ran into the same problem.
I ran into the same problem as well; it shows up in both mmdetection and mmyolo.
I have a similar problem when I use mmcv and mmdetection in another project. It might be caused by mmdetection.
Has anyone solved this?
Same problem here, except my GPU is not being used and host memory keeps growing. I compared my model definition against YOLOv5's and it looks fine, so I suspect the leak is in some other module.
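To move past guessing which module leaks, one option (not from this thread, just a suggestion) is to diff heap snapshots with the standard library's `tracemalloc` between training iterations; the source lines whose allocations keep growing point at the responsible module. A minimal sketch:

```python
# Hypothetical diagnostic, not part of the original report: use tracemalloc
# to find which source lines keep allocating more memory over time.
import tracemalloc

tracemalloc.start()

def snapshot():
    """Capture a snapshot of the Python heap allocations traced so far."""
    return tracemalloc.take_snapshot()

def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size changed the most."""
    stats = after.compare_to(before, 'lineno')
    return stats[:limit]

# Usage sketch: in a real run you would snapshot every N training iterations.
# Here a deliberately growing list stands in for the leaking component.
before = snapshot()
leak = []
for _ in range(1000):
    leak.append(bytearray(1024))
after = snapshot()
for stat in top_growth(before, after):
    print(stat)
```

A line that shows a large positive `size_diff` on every successive comparison is accumulating memory; one that spikes once and then stays flat is just a cache being filled.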
Prerequisite
🐞 Describe the bug
During training, both host memory and GPU memory keep growing until the process crashes with an out-of-memory error.
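To quantify the growth on the host side, one can log the process's resident set size once per epoch or every few hundred iterations. The sketch below uses only the standard library's Unix-only `resource` module (an illustration I am adding, not code from the report; the GPU side would be tracked separately, e.g. with `torch.cuda.memory_allocated()`):

```python
# Hypothetical helper, not part of the original report: report the process's
# peak resident set size so its growth rate across epochs can be measured.
import resource
import sys

def rss_mb():
    """Peak resident set size of this process, in MiB.

    ru_maxrss is a high-water mark: if it keeps rising epoch after epoch,
    memory is still being accumulated somewhere.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == 'darwin' else 1024
    return usage / divisor

# Usage sketch: call this at the end of every epoch and plot the values;
# a straight upward line is the signature of a per-iteration leak.
print(f'peak RSS: {rss_mb():.1f} MiB')
```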
Environment
```
System environment:
    sys.platform: linux
    Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 1556182265
    GPU 0,1,2,3: NVIDIA GeForce RTX 4090
    CUDA_HOME: /usr/local/cuda-12.1
    NVCC: Cuda compilation tools, release 12.1, V12.1.105
    GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
    PyTorch: 1.13.1+cu116
    PyTorch compiling details: PyTorch built with:
        Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
    TorchVision: 0.14.1+cu116
    OpenCV: 4.8.0
    MMEngine: 0.8.4

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 1556182265
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
```
```python
_base_ = ['../_base_/default_runtime.py', '../_base_/det_p5_tta.py']
data_root = './data/CrowdHuman2coco/'
dataset_type = 'YOLOv5CocoDataset'

class_name = ('head', 'person', )
num_classes = len(class_name)
metainfo = dict(classes=class_name, palette=[(220, 20, 60), (220, 100, 128)])

img_scale = (640, 640)

deepen_factor = 0.33
widen_factor = 0.5

num_last_epochs = 5
img_scales = [(640, 640), (320, 320)]
max_epochs = 80
save_epoch_intervals = 5
train_batch_size_per_gpu = 12
train_num_workers = 2
val_batch_size_per_gpu = 1
val_num_workers = 2

load_from = 'https://download.openmmlab.com/mmyolo/v0/ppyoloe/ppyoloe_pretrain/ppyoloe_plus_s_obj365_pretrained-bcfe8478.pth'

persistent_workers = True
base_lr = 0.001

strides = [8, 16, 32]

model = dict(
    type='YOLODetector',
    data_preprocessor=dict(
        # ... (rest of the model config truncated in the original report)

# use this to support multi_scale training
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PPYOLOERandomDistort'),
    dict(type='mmdet.Expand', mean=(103.53, 116.28, 123.675)),
    dict(type='PPYOLOERandomCrop'),
    dict(type='mmdet.RandomFlip', prob=0.5),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
                   'flip_direction'))
]

train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    persistent_workers=persistent_workers,
    pin_memory=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    collate_fn=dict(type='yolov5_collate', use_ms_training=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/train.json',
        data_prefix=dict(img='train/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=0),
        pipeline=train_pipeline))

test_pipeline = [
    dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
    dict(
        type='mmdet.FixShapeResize',
        width=img_scale[0],
        height=img_scale[1],
        keep_ratio=False,
        interpolation='bicubic'),
    dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]

val_dataloader = dict(
    batch_size=val_batch_size_per_gpu,
    num_workers=val_num_workers,
    persistent_workers=persistent_workers,
    pin_memory=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        metainfo=metainfo,
        test_mode=True,
        data_prefix=dict(img='val/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=0),
        ann_file='annotations/val.json',
        pipeline=test_pipeline))
test_dataloader = val_dataloader

param_scheduler = None
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(
        type='SGD', lr=base_lr, momentum=0.9, weight_decay=5e-4,
        nesterov=False),
    paramwise_cfg=dict(norm_decay_mult=0.))

default_hooks = dict(
    param_scheduler=dict(
        type='PPYOLOEParamSchedulerHook',
        warmup_min_iter=1000,
        start_factor=0.,
        warmup_epochs=5,
        min_lr_ratio=0.0,
        total_epochs=int(max_epochs * 1.2)),
    checkpoint=dict(
        type='CheckpointHook',
        interval=save_epoch_intervals,
        save_best='auto',
        max_keep_ckpts=3))

custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0002,
        update_buffers=True,
        strict_load=False,
        priority=49)
]

val_evaluator = dict(
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file=data_root + 'annotations/val.json',
    metric='bbox')
test_evaluator = val_evaluator

train_cfg = dict(
    type='EpochBasedTrainLoop',
    max_epochs=max_epochs,
    val_interval=save_epoch_intervals)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
```
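One commonly reported cause of this symptom, which may or may not apply here (an assumption on my part, not a confirmed diagnosis): the runtime environment uses `mp_start_method: 'fork'` with `num_workers > 0`, and with fork-started workers a dataset that keeps its annotations in a large list of Python objects appears to leak, because reading any element updates its refcount and copies that memory page into every worker (copy-on-write). Packing records into one flat buffer avoids the per-object refcounting. A standard-library sketch of the pattern, using a hypothetical `PackedPaths` container:

```python
# Sketch of the copy-on-write mitigation: store many strings in one bytes
# blob plus an offset table, instead of a Python list of str objects whose
# refcounts get written (and whose pages get copied) in every forked worker.
from array import array

class PackedPaths:
    """Read-only string container backed by a single bytes buffer."""

    def __init__(self, paths):
        encoded = [p.encode('utf-8') for p in paths]
        self._offsets = array('Q', [0])  # cumulative byte offsets
        total = 0
        for item in encoded:
            total += len(item)
            self._offsets.append(total)
        self._blob = b''.join(encoded)

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, i):
        start, end = self._offsets[i], self._offsets[i + 1]
        return self._blob[start:end].decode('utf-8')

paths = PackedPaths([f'train/img_{i}.jpg' for i in range(3)])
print(paths[1])  # -> train/img_1.jpg
```

Other mitigations people commonly try for this class of problem (again, suggestions rather than known fixes for this issue) are switching `mp_cfg` to `mp_start_method='spawn'` or setting `persistent_workers=False` to see whether the growth stops, which at least narrows the leak down to the dataloader workers.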