tinyvision / SOLIDER

A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximum extent
Apache License 2.0

Request for PedestrianDetection pretrained model #21

Open DJ-Wu-Git opened 9 months ago

DJ-Wu-Git commented 9 months ago

Hello! I ran a test using the pretrained model from SOLIDER and got the following error:

/home/cddjjc/anaconda3/envs/pedestron_v2/bin/python /home/cddjjc/Workspace/SOLIDER-PedestrianDetection/test_city_person.py configs/solider/cp/swin_base.py models_pretrained/solider_origin/swin_base/epoch_ 1 2 --out swin_base.json --show --mean_teacher 
loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
No pre-trained weights for SwinBase, training start from scratch
unexpected key in source state_dict: backbone.norm0.weight, backbone.norm0.bias, head.mlp.0.weight, head.mlp.0.bias, head.mlp.2.weight, head.mlp.2.bias, head.mlp.4.weight, head.mlp.4.bias, head.last_layer.weight_g, head.last_layer.weight_v

missing keys in source state_dict: bbox_head.reg_convs.0.gn.bias, bbox_head.offset_scales.0.scale, bbox_head.cls_convs.0.conv.weight, neck.p3_l2.weight, bbox_head.reg_convs.0.conv.weight, bbox_head.cls_convs.0.gn.bias, bbox_head.csp_reg.weight, neck.p4_l2.weight, bbox_head.csp_cls.weight, neck.p5_l2.weight, bbox_head.cls_convs.0.gn.weight, bbox_head.offset_convs.0.conv.weight, bbox_head.csp_offset.bias, neck.p4.bias, bbox_head.csp_offset.weight, neck.p5.weight, bbox_head.reg_scales.0.scale, bbox_head.csp_cls.bias, neck.p4.weight, bbox_head.reg_convs.0.gn.weight, bbox_head.csp_reg.bias, bbox_head.offset_convs.0.gn.bias, neck.p3.weight, bbox_head.offset_convs.0.gn.weight, neck.p5.bias, neck.p3.bias

[                              ] 0/500, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/test_city_person.py", line 227, in <module>
    main()
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/test_city_person.py", line 195, in main
    outputs = single_gpu_test(model, data_loader, args.show, args.save_img, args.save_img_dir)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/test_city_person.py", line 30, in single_gpu_test
    result = model(return_loss=False, rescale=not show, **data)
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/models/detectors/base.py", line 88, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/models/detectors/base.py", line 79, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/models/detectors/csp.py", line 203, in simple_test
    x = self.extract_feat(img)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/models/detectors/single_stage.py", line 42, in extract_feat
    x = self.neck(x)
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/cddjjc/Workspace/SOLIDER-PedestrianDetection/mmdet/models/necks/csp_neck.py", line 73, in forward
    p3 = self.p3(inputs[0])
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/cddjjc/anaconda3/envs/pedestron_v2/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 958, in forward
    output_padding, self.groups, self.dilation)
RuntimeError: Given transposed=1, weight of size [512, 256, 4, 4], expected input[1, 256, 128, 256] to have 512 channels, but got 256 channels instead

Process finished with exit code 1

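It looks like the pretrained model from SOLIDER is missing the weights for the last few layers. Would it be possible to share the complete, fully trained PedestrianDetection model? Thanks!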
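For anyone reading the RuntimeError at the end of the traceback above, here is a small sketch of the channel arithmetic behind it. It only uses the two shapes printed in the error. The Swin-Base stage widths (`EMBED_DIM = 128`) and the 1024 x 2048 image size are assumptions on my part, not something stated in this thread.

```python
# Shapes copied from the RuntimeError in the log above.
# PyTorch stores a transposed-convolution weight as
# (in_channels, out_channels // groups, kH, kW).
weight_shape = (512, 256, 4, 4)
# Input layout is (batch, channels, height, width).
input_shape = (1, 256, 128, 256)

expected_in = weight_shape[0]   # channels the neck's p3 layer was built for
actual_in = input_shape[1]      # channels the backbone actually delivered

# Assumption: Swin-Base uses embed_dim = 128, doubles the width at each of
# its four stages, and downsamples by 4, 8, 16, 32.
EMBED_DIM = 128
stage_channels = [EMBED_DIM * 2 ** i for i in range(4)]
stage_strides = [4 * 2 ** i for i in range(4)]

# Assumption: full-resolution Cityscapes frames, 1024 x 2048 (H x W).
IMAGE_HW = (1024, 2048)


def describe_stage(channels):
    """Return (stage index, stride, feature-map H, W) for a channel width."""
    idx = stage_channels.index(channels)
    stride = stage_strides[idx]
    return idx, stride, IMAGE_HW[0] // stride, IMAGE_HW[1] // stride


got = describe_stage(actual_in)
wanted = describe_stage(expected_in)

print("stage channels:", stage_channels)
print("p3 received stage %d (stride %d, %dx%d)" % got)
print("p3 was built for stage %d (stride %d, %dx%d)" % wanted)
print("spatial size matches the log:", got[2:] == input_shape[2:])
```

Under these assumptions, the 256-channel tensor is consistent with the stride-8 Swin-Base stage, while `p3` expects the width of the stride-16 stage. Layer shapes are set when the model is built, not by the checkpoint, so this mismatch looks separate from the missing weights. It may point to the backbone output indices or the neck input channels in the config being used.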

cwhgn commented 7 months ago

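The pretrained model provided by SOLIDER is meant to serve as the initialization for downstream tasks such as PedestrianDetection; it plays the same kind of role as a pretrained Swin model. After loading the pretrained model, the downstream task still has to be trained on its own data. The pretrained model cannot be used for inference directly.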
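The key lists in the log above are consistent with this. A minimal sketch that groups the reported keys by their top-level module, with the key names copied from the log; the helper name `by_module` is made up for illustration:

```python
from collections import Counter

# Keys the checkpoint has but the detector does not expect (from the log).
unexpected = (
    "backbone.norm0.weight, backbone.norm0.bias, head.mlp.0.weight, "
    "head.mlp.0.bias, head.mlp.2.weight, head.mlp.2.bias, head.mlp.4.weight, "
    "head.mlp.4.bias, head.last_layer.weight_g, head.last_layer.weight_v"
).split(", ")

# Keys the detector expects but the checkpoint does not provide (from the log).
missing = (
    "bbox_head.reg_convs.0.gn.bias, bbox_head.offset_scales.0.scale, "
    "bbox_head.cls_convs.0.conv.weight, neck.p3_l2.weight, "
    "bbox_head.reg_convs.0.conv.weight, bbox_head.cls_convs.0.gn.bias, "
    "bbox_head.csp_reg.weight, neck.p4_l2.weight, bbox_head.csp_cls.weight, "
    "neck.p5_l2.weight, bbox_head.cls_convs.0.gn.weight, "
    "bbox_head.offset_convs.0.conv.weight, bbox_head.csp_offset.bias, "
    "neck.p4.bias, bbox_head.csp_offset.weight, neck.p5.weight, "
    "bbox_head.reg_scales.0.scale, bbox_head.csp_cls.bias, neck.p4.weight, "
    "bbox_head.reg_convs.0.gn.weight, bbox_head.csp_reg.bias, "
    "bbox_head.offset_convs.0.gn.bias, neck.p3.weight, "
    "bbox_head.offset_convs.0.gn.weight, neck.p5.bias, neck.p3.bias"
).split(", ")


def by_module(keys):
    """Count parameter keys per top-level module (hypothetical helper)."""
    return dict(Counter(key.split(".")[0] for key in keys))


missing_groups = by_module(missing)
unexpected_groups = by_module(unexpected)

print("missing from checkpoint:", missing_groups)
print("unused from checkpoint: ", unexpected_groups)
print("any backbone weight missing:", "backbone" in missing_groups)
```

Every missing key belongs to `neck` or `bbox_head`, and no `backbone` key is missing, so the detection-specific layers stay randomly initialized until the detector is trained. The unused `head.*` keys are presumably the head from self-supervised pretraining, which the detector does not use.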