Closed Lxp2014 closed 2 years ago
Hello @lixiaopeng123456, thanks for your report. This is a known issue and will be fixed in the next release.
@Louis-J, could you give a workaround for the current NNI version?
Thanks for your reply, looking forward to the next version.
@J-shang, is this targeted at 2.9.1 or later?
We hope to solve this in 2.9.1.
Does the problem only occur on GPU? I can't reproduce it on CPU with ssdlite_mobilenetv2_scratch_600e_coco.
Yes, the problem occurs on gpu. I will test it on cpu and get back to you. Thanks!
The problem still occurs on cpu.
Test command: python demo/test.py tests/data/10.jpg configs/ssd/ssdlite_mobilenetv2_scratch_600e_hand.py work_dirs/ssdlite_mobilenetv2_scratch_600e_hand/epoch_120.pth --device cpu --score-thr 0.5

My test.py:

import asyncio
from argparse import ArgumentParser
from functools import partial

from mmdet.apis import (async_inference_detector, inference_detector,
                        init_detector, show_result_pyplot)
import torch
import pdb


def parse_args():
    parser = ArgumentParser()
    parser.add_argument('img', help='Image file')
    parser.add_argument('config', help='Config file')
    parser.add_argument('checkpoint', help='Checkpoint file')
    parser.add_argument('--out-file', default=None, help='Path to output file')
    parser.add_argument(
        '--device', default='cuda:0', help='Device used for inference')
    parser.add_argument(
        '--palette',
        default='coco',
        choices=['coco', 'voc', 'citys', 'random'],
        help='Color palette used for visualization')
    parser.add_argument(
        '--score-thr', type=float, default=0.3, help='bbox score threshold')
    parser.add_argument(
        '--async-test',
        action='store_true',
        help='whether to set async options for async inference.')
    args = parser.parse_args()
    return args


def main(args):
    model = init_detector(args.config, args.checkpoint, device=args.device)
    # test a single image
    # result = inference_detector(model, args.img)
    config_list = [{
        'sparsity_per_layer': 0.5,
        'op_types': ['Conv2d']
    }, {
        'exclude': True,
        'op_names': ['Linear', 'bn']
    }]
    from nni.compression.pytorch.pruning import L1NormPruner
    pruner = L1NormPruner(model, config_list)
    # show the wrapped model structure; `PrunerModuleWrapper` has wrapped the layers configured in config_list.
    # print(model)
    # %%
    # compress the model and generate the masks
    _, masks = pruner.compress()
    # show the masks sparsity
    for name, mask in masks.items():
        print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))
    pruner._unwrap_model()
    # speed up the model; for more information about speedup, please refer to :doc:`pruning_speedup`.
    from nni.compression.pytorch.speedup import ModelSpeedup
    print(model)
    ModelSpeedup(model, torch.rand(1, 3, 256, 256), masks).speedup_model()


if __name__ == '__main__':
    args = parse_args()
    if args.async_test:
        # note: async_main is not defined in this excerpt
        asyncio.run(async_main(args))
    else:
        main(args)
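A side note on the value printed in the loop above: `mask['weight'].sum() / mask['weight'].numel()` is the fraction of mask entries equal to 1. Assuming the usual convention that 1 means "kept" and 0 means "pruned", that is the kept ratio, and it coincides with the sparsity only at 0.5. A minimal, torch-free sketch of the distinction, where the helper names and the example mask are made up for illustration:

```python
def mask_density(mask_values):
    """Fraction of entries that are kept (non-zero) in a flat 0/1 mask."""
    values = list(mask_values)
    if not values:
        raise ValueError("mask is empty")
    return sum(1 for v in values if v != 0) / len(values)


def mask_sparsity(mask_values):
    """Fraction of entries that are pruned (zero) in a flat 0/1 mask."""
    return 1.0 - mask_density(mask_values)


# Hypothetical flattened weight mask: 4 of 8 entries kept.
example_mask = [1, 1, 0, 0, 1, 0, 1, 0]
print("density :", mask_density(example_mask))
print("sparsity:", mask_sparsity(example_mask))
```

With `sparsity_per_layer` set to 0.5, as in the config_list above, both numbers would be expected to agree, so the label in the original print is harmless here.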
Thanks. Could you please provide ssdlite_mobilenetv2_scratch_600e_hand.py?
I couldn't reproduce it with ssdlite_mobilenetv2_scratch_600e_coco, so I think it can only be reproduced with ssdlite_mobilenetv2_scratch_600e_hand.
ssdlite_mobilenetv2_scratch_600e_hand.py is as follows:
_base_ = [
    '../_base_/datasets/coco_detection.py', '../_base_/default_runtime.py'
]

model = dict(
    type='SingleStageDetector',
    backbone=dict(
        type='MobileNetV2',
        out_indices=(4, 7),
        norm_cfg=dict(type='BN', eps=0.001, momentum=0.03),
        init_cfg=dict(type='TruncNormal', layer='Conv2d', std=0.03)),
    neck=dict(
        type='SSDNeck',
        in_channels=(96, 1280),
        out_channels=(96, 1280, 512, 256, 256, 128),
        level_strides=(2, 2, 2, 2),
        level_paddings=(1, 1, 1, 1),
        l2_norm_scale=None,
        use_depthwise=True,
        norm_cfg=dict(type='BN', eps=0.001, momentum=0.03),
        act_cfg=dict(type='ReLU6'),
        init_cfg=dict(type='TruncNormal', layer='Conv2d', std=0.03)),
    bbox_head=dict(
        type='SSDHead',
        in_channels=(96, 1280, 512, 256, 256, 128),
        num_classes=1,
        use_depthwise=True,
        norm_cfg=dict(type='BN', eps=0.001, momentum=0.03),
        act_cfg=dict(type='ReLU6'),
        init_cfg=dict(type='Normal', layer='Conv2d', std=0.001),
        anchor_generator=dict(
            type='SSDAnchorGenerator',
            scale_major=False,
            strides=[16, 32, 64, 107, 160, 320],
            ratios=[[2, 3], [2, 3], [2, 3], [2, 3], [2, 3], [2, 3]],
            min_sizes=[48, 100, 150, 202, 253, 304],
            max_sizes=[100, 150, 202, 253, 304, 320]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[0.1, 0.1, 0.2, 0.2])),
    train_cfg=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.,
            ignore_iof_thr=-1,
            gt_max_assign_all=False),
        smoothl1_beta=1.,
        allowed_border=-1,
        pos_weight=-1,
        neg_pos_ratio=3,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        nms=dict(type='nms', iou_threshold=0.45),
        min_bbox_size=0,
        score_thr=0.02,
        max_per_img=200))
cudnn_benchmark = True

dataset_type = 'CocoDataset'
data_root = 'data/onehand10k/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Expand',
        mean=img_norm_cfg['mean'],
        to_rgb=img_norm_cfg['to_rgb'],
        ratio_range=(1, 4)),
    dict(
        type='MinIoURandomCrop',
        min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
        min_crop_size=0.3),
    dict(type='Resize', img_scale=(320, 320), keep_ratio=False),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='PhotoMetricDistortion',
        brightness_delta=32,
        contrast_range=(0.5, 1.5),
        saturation_range=(0.5, 1.5),
        hue_delta=18),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=320),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(320, 320),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=False),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=320),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=72,
    workers_per_gpu=4,
    train=dict(
        _delete_=True,
        type='RepeatDataset',  # use RepeatDataset to speed up training
        times=5,
        dataset=dict(
            type=dataset_type,
            ann_file=data_root + 'annotations/all_train.json',
            img_prefix=data_root,
            pipeline=train_pipeline)),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))

optimizer = dict(type='SGD', lr=0.015, momentum=0.9, weight_decay=4.0e-5)
optimizer_config = dict(grad_clip=None)

lr_config = dict(
    policy='CosineAnnealing',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    min_lr=0)
runner = dict(type='EpochBasedRunner', max_epochs=120)

evaluation = dict(interval=5, metric='bbox')
checkpoint_config = dict(interval=5)
custom_hooks = [
    dict(type='NumClassCheckHook'),
    dict(type='CheckInvalidLossHook', interval=50, priority='VERY_LOW')
]

auto_scale_lr = dict(base_batch_size=192)
thanks. i'll try it
Thanks, I will also test it on ssdlite_mobilenetv2_scratch_600e_coco. If it works there, I will check the difference.
The problem occurs on ssdlite_mobilenetv2_scratch_600e_coco. Can you provide your code and environment? I wonder if there is something wrong with my test code. Thanks!
Sorry, I can't reproduce the issue on either CPU or GPU, with either ssdlite_mobilenetv2_scratch_600e_coco or ssdlite_mobilenetv2_scratch_600e_hand.
I think the difference comes from elsewhere. Could you please add the code below and show me the result?
code:
print('type(model.backbone.conv1):', type(model.backbone.conv1))
print('model.backbone.conv1:', model.backbone.conv1)
conv1_in_dummy = torch.randn(8,3,256,256)
conv1_out_dummy = model.backbone.conv1(conv1_in_dummy)
print('conv1_out_dummy.shape:', conv1_out_dummy.shape)
traced_conv1 = torch.jit.trace(model.backbone.conv1, conv1_in_dummy)
print('traced_conv1.graph:', traced_conv1.graph)
torch._C._jit_pass_inline(traced_conv1.graph)
print('traced_conv1.graph after inline:', traced_conv1.graph)
Position: between pruner._unwrap_model() and ModelSpeedup(model, torch.rand(1, 3, 256, 256), masks).speedup_model().
What I got is a TorchScript graph without 'aten::to' in model.backbone.conv1. I want to know which layer the 'aten::to' comes from.
Thanks.
thanks a lot.
The graph of model.backbone.conv1 is identical to mine. The problematic 'aten::to' isn't there, and I still don't know where it comes from.
Could you please add the code below and show me the result? If the result is too long, you can upload the output as a text file.
code:
traced_model = torch.jit.trace(model, torch.rand(1, 3, 256, 256))
torch._C._jit_pass_inline(traced_model.graph)
print('traced_model.graph has aten::to:', 'aten::to' in str(traced_model.graph))
if 'aten::to' in str(traced_model.graph):
    print('traced_model.graph after inline:', str(traced_model.graph))
exit()
Position: between pruner._unwrap_model() and ModelSpeedup(model, torch.rand(1, 3, 256, 256), masks).speedup_model().
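When the inlined graph dump is long, it can help to pull out only the lines that call the operator in question. The sketch below does this with plain string processing. The helper name `find_op_lines` and the sample graph text are made up for illustration; a real dump from `str(traced_model.graph)` will look different in detail.

```python
import re


def find_op_lines(graph_text, op_name="aten::to"):
    """Return (line_number, line) pairs for graph dump lines that call op_name."""
    # Require an opening parenthesis so that e.g. aten::to_mkldnn is not matched.
    pattern = re.compile(re.escape(op_name) + r"\(")
    hits = []
    for number, line in enumerate(graph_text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical, hand-written excerpt in the style of a TorchScript graph dump.
sample_graph = """\
graph(%self : __torch__.Model, %x : Tensor):
  %3 : Tensor = aten::relu(%x) # model.py:10:0
  %7 : Tensor = aten::to(%3, %4, %5, %6) # model.py:12:0
  return (%7)
"""

for number, line in find_op_lines(sample_graph):
    print(number, line)
```

The trailing source comment on each matching line, where present, is usually the quickest pointer to the Python code that produced the node.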
I have also tried to write a fix in https://github.com/Louis-J/nni/blob/fix_5148/nni/compression/pytorch/speedup/jit_translate.py. Please replace your local jit_translate.py with it and try again. I think it can solve the 'aten::to' problem.
Thanks a lot! The problem has been solved.
Describe the issue: Hello, I'm using NNI to prune the detection model ssdlite-mobilenetV2 from mmdetection (https://github.com/open-mmlab/mmdetection). The error shown in the title occurs when running ModelSpeedup(model, torch.rand(1, 3, 256, 256).cuda(), masks).speedup_model(). The forward pass works fine, but the mask update for .aten::to fails. Is there a solution? Thanks!
Environment:
Configuration:
Log message:
[2022-09-28 20:28:03] start to speedup the model
[2022-09-28 20:28:07] infer module masks...
[2022-09-28 20:28:07] Update mask for backbone.conv1.conv
[2022-09-28 20:28:07] Update mask for .aten::to.249
Traceback (most recent call last):
File "demo/image_demo.py", line 92, in <module>
main(args)
File "demo/image_demo.py", line 58, in main
ModelSpeedup(model, torch.rand(1, 3, 256, 256).cuda(), masks).speedup_model()
File "/home/dsplxp/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 543, in speedup_model
self.infer_modules_masks()
File "/home/dsplxp/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 380, in infer_modules_masks
self.update_direct_sparsity(curnode)
File "/home/dsplxp/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 234, in update_direct_sparsity
_auto_infer = AutoMaskInference(
File "/home/dsplxp/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 80, in __init__
self.output = self.module(dummy_input)
File "/home/dsplxp/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/nni/compression/pytorch/speedup/jit_translate.py", line 244, in __call__
result = self.func(*self.positional, **self.keyword)
TypeError: to() received an invalid combination of arguments - got (memory_format=NoneType, copy=bool, non_blocking=bool, pin_memory=NoneType, device=torch.device, layout=torch.layout, dtype=torch.dtype, ), but expected one of:
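The TypeError above shows `to()` being called with a keyword set (including `memory_format=NoneType`, `pin_memory=NoneType` and `layout=torch.layout`) that the Python-level function rejected. One generic way to avoid this class of error is to drop keyword arguments that are `None` or that the target callable does not declare before making the call. The sketch below illustrates that idea only. It is not the fix in the fix_5148 branch, whose contents are not shown in this thread, and `call_with_supported_kwargs` and `fake_to` are hypothetical names.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments that are None or not in its signature."""
    accepted = inspect.signature(func).parameters
    filtered = {
        key: value
        for key, value in kwargs.items()
        if key in accepted and value is not None
    }
    return func(*args, **filtered)


def fake_to(value, dtype=None, device=None, non_blocking=False, copy=False):
    """Hypothetical stand-in for a conversion function with a narrow signature."""
    return (value, dtype, device, non_blocking, copy)


# Same keyword names as in the error message, with placeholder values.
result = call_with_supported_kwargs(
    fake_to, 1.5,
    memory_format=None, copy=False, non_blocking=False,
    pin_memory=None, device="cpu", layout="strided", dtype="float32",
)
print(result)
```

`inspect.signature` can raise `ValueError` for C-implemented callables that expose no introspectable signature, so for such functions an explicit allow-list of keyword names would be needed instead.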