Open YAwei666 opened 3 months ago
It seems that only 2 CPU cores are doing any work. What is going on?
This is strange. Your configuration shows that 12 dataloader workers (num_workers=12) were started successfully, yet only two CPU threads are busy. Is virtualization enabled on your server? That could limit you to two CPU threads.
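One way to check the suspicion above is to compare the number of logical CPUs on the machine with the number the training process is actually allowed to run on. The sketch below uses only the Python standard library; the helper names `usable_cpu_count` and `report_cpu_limits` are made up for illustration. Note that `os.sched_getaffinity` exists only on some platforms (Linux among them), and that container CPU quotas may not show up in either number.

```python
import os


def usable_cpu_count():
    """Return how many logical CPUs this process may be scheduled on."""
    # sched_getaffinity is not available on every platform, so fall back
    # to the machine-wide count when it is missing.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def report_cpu_limits():
    """Collect the machine-wide and per-process CPU counts in one dict."""
    return {
        "logical_cpus": os.cpu_count() or 1,
        "usable_by_this_process": usable_cpu_count(),
    }


if __name__ == "__main__":
    print(report_cpu_limits())
```

If `usable_by_this_process` comes back as 2 on a machine with many more logical CPUs, the 12 workers are likely sharing those two threads, which would match the symptom reported here.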
Branch
main branch (mmpretrain version)
Describe the bug
python tools/train.py configs/resnet/resnet50_8xb32_in1k_2.py

_base_ = [
    '../_base_/models/resnet50.py',
    '../_base_/datasets/imagenet_bs32.py',
    '../_base_/schedules/imagenet_bs256_coslr.py',
    '../_base_/default_runtime.py',
]

model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_8xb32_in1k_20210831-ea4938fc.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=5),
)
# >>>>>>>>>>>>>>> Override the data settings here >>>>>>>>>>>>>>>>>>>
data_root = '/mnt/data//dataset'
train_dataloader = dict(
    batch_size=192,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='meta/train.txt',
        # We assume the subfolder format is used, so the annotation file needs to be left empty
        data_prefix='',
    ))
val_dataloader = dict(
    batch_size=192,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='meta/test.txt',
        # We assume the subfolder format is used, so the annotation file needs to be left empty
        data_prefix='',
    ))
test_dataloader = val_dataloader

optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
# Learning rate policy
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)
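For reference, a step schedule like the one above multiplies the learning rate by `gamma` each time a milestone epoch is reached. The sketch below is a plain-Python illustration of that rule using the values from this config (`lr=0.01`, `milestones=[15]`, `gamma=0.1`); the function name `multistep_lr` is hypothetical and this is not the MMEngine implementation.

```python
def multistep_lr(base_lr, epoch, milestones, gamma):
    """Learning rate at a given epoch under a multi-step decay rule."""
    # Count how many milestones have already been reached.
    passed = sum(1 for milestone in milestones if epoch >= milestone)
    return base_lr * gamma ** passed


if __name__ == "__main__":
    for epoch in (0, 14, 15, 29):
        print(epoch, multistep_lr(0.01, epoch, [15], 0.1))
```

Under this rule the rate stays at 0.01 for epochs 0 to 14 and drops to roughly 0.001 from epoch 15 onward.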
# train, val, test setting
train_cfg = dict(by_epoch=True, max_epochs=30, val_interval=1)
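Because this config inherits from base files, the dataloader that actually runs is a merge of the base settings and the overrides above. The following is a simplified sketch of that kind of recursive dict merge, assuming plain "override wins, nested dicts merge key by key" semantics. It is not MMEngine's own merging code, which has extra rules not modeled here; `merge_config` is a hypothetical helper, and the two dicts are trimmed copies of the dataloader settings in this report.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the override replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed copy of the base dataloader (imagenet_bs32.py in this report).
base_loader = dict(
    batch_size=128,
    num_workers=12,
    dataset=dict(type='ImageNet', data_root='data/imagenet'),
)

# Trimmed copy of the override from the user config.
override_loader = dict(
    batch_size=192,
    dataset=dict(
        type='CustomDataset',
        data_root='/mnt/data//dataset',
        ann_file='meta/train.txt',
        data_prefix='',
    ),
)

effective = merge_config(base_loader, override_loader)

if __name__ == "__main__":
    print(effective)
```

Under these assumptions the effective dataloader keeps `num_workers=12` from the base file while `batch_size` becomes 192, which is consistent with 12 workers having been started.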
'../_base_/models/resnet50.py'

# model settings
model = dict(
    type='ImageClassifier',
    backbone=dict(
        type='ResNeSt',
        depth=50,
        num_stages=4,
        out_indices=(3, ),
        style='pytorch'),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=2048,
        loss=dict(
            type='LabelSmoothLoss',
            label_smooth_val=0.1,
            num_classes=1000,
            reduction='mean',
            loss_weight=1.0),
        topk=(1, 5),
        cal_acc=False),
    train_cfg=dict(augments=dict(type='Mixup', alpha=0.2)),
)
'../_base_/datasets/imagenet_bs32.py'

# dataset settings
dataset_type = 'ImageNet'
data_preprocessor = dict(
    num_classes=1000,
    # RGB format normalization parameters
)
train_pipeline = [
    dict(type='LoadImageFromFile', imdecode_backend='pillow'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]

test_pipeline = [
    dict(type='LoadImageFromFile', imdecode_backend='pillow'),
    dict(type='ResizeEdge', scale=256, edge='short'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackInputs'),
]

train_dataloader = dict(
    batch_size=128,
    num_workers=12,
    dataset=dict(
        type=dataset_type,
        data_root='data/imagenet',
        pipeline=train_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=True),
)

val_dataloader = dict(
    batch_size=128,
    num_workers=12,
    dataset=dict(
        type=dataset_type,
        data_root='data/imagenet',
        pipeline=test_pipeline),
    sampler=dict(type='DefaultSampler', shuffle=False),
)
val_evaluator = dict(type='Accuracy', topk=(1))
# If you want standard test, please manually configure the test dataset
test_dataloader = val_dataloader
test_evaluator = val_evaluator
'../_base_/schedules/imagenet_bs256_coslr.py'

# optimizer
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.8, momentum=0.9, weight_decay=5e-5))

# learning policy
param_scheduler = [
    dict(type='LinearLR', start_factor=0.1, by_epoch=True, begin=0, end=5),
    dict(type='CosineAnnealingLR', T_max=95, by_epoch=True, begin=5, end=100)
]

# train, val, test setting
train_cfg = dict(by_epoch=True, max_epochs=100, val_interval=1)
val_cfg = dict()
test_cfg = dict()
# NOTE: `auto_scale_lr` is for automatically scaling LR,
# based on the actual training batch size.
auto_scale_lr = dict(base_batch_size=1024)
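As a rough illustration of what scaling the LR by batch size means, the sketch below applies the common linear scaling rule: multiply the configured rate by the ratio of the real total batch size to `base_batch_size`. Treat it as an assumption about the rule rather than a copy of the library code. The function name `scaled_lr` is hypothetical, and the single GPU is inferred from the environment section, which lists only `GPU 0`.

```python
def scaled_lr(lr, batch_size_per_gpu, num_gpus, base_batch_size):
    """Linearly rescale a learning rate to the real total batch size."""
    real_batch_size = batch_size_per_gpu * num_gpus
    return lr * real_batch_size / base_batch_size


if __name__ == "__main__":
    # Values from this report: lr=0.01, batch_size=192, base_batch_size=1024.
    # One GPU is an assumption based on the environment listing.
    print(scaled_lr(0.01, 192, 1, 1024))  # approximately 0.001875
```

The training log in this report shows `lr: 1.0000e-02`, the unscaled value, which suggests automatic scaling was not active in this run.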
lr: 1.0000e-02 eta: 17:43:57 time: 3.5791 data_time: 3.3401 memory: 4676 loss: 0.3175
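The log line above already hints at where the time goes: `time` is the per-iteration time and `data_time` is the part spent waiting for the dataloader. A small sketch that pulls both fields out of the line and computes the share spent on data loading (the helper name `data_time_fraction` is made up for illustration):

```python
import re

LOG_LINE = ("lr: 1.0000e-02 eta: 17:43:57 time: 3.5791 "
            "data_time: 3.3401 memory: 4676 loss: 0.3175")


def data_time_fraction(line):
    """Return data_time / time for one training log line."""
    # Collect "key: value" pairs; values are kept as strings at first
    # because some of them (such as eta) are not plain numbers.
    fields = dict(re.findall(r"(\w+):\s+(\S+)", line))
    return float(fields["data_time"]) / float(fields["time"])


if __name__ == "__main__":
    print(f"{data_time_fraction(LOG_LINE):.3f}")  # 0.933
```

About 93% of each iteration is spent waiting for data, so the GPU is mostly idle and the bottleneck is on the data loading side, consistent with only two CPU threads doing the decoding and augmentation work.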
Environment
{'sys.platform': 'linux', 'Python': '3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]', 'CUDA available': True, 'MUSA available': False, 'numpy_random_seed': 2147483648, 'GPU 0': 'NVIDIA GeForce RTX 3090', 'CUDA_HOME': ':/usr/local/cuda', 'GCC': 'gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609', 'PyTorch': '1.10.1', 'TorchVision': '0.11.2', 'OpenCV': '4.10.0', 'MMEngine': '0.10.4', 'MMCV': '2.2.0', 'MMPreTrain': '1.2.0+'}
Other information
No response