If you do not want to use deconvolutional layers with a group number greater than 1, you may try the backbone searched by ViPNAS (e.g., ViPNAS_ResNet following s_vipnas_res50_coco_256x192.py) and apply TopdownHeatmapSimpleHead as the keypoint_head, following SimpleBaseline. Make sure the in_channels of the keypoint_head is consistent with the output channels of the backbone (e.g., 608 for S-ViPNAS-Res50).
Alternatively, if you still want to use ViPNASHeatmapSimpleHead, you can set num_deconv_groups to (1, 1, 1). Whichever option you choose, please retrain the new model and verify that its performance satisfies your needs.
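For illustration, a minimal sketch of the first option as an mmpose-style config fragment; apart from in_channels=608 (stated above), the values are assumptions based on typical top-down configs and should be checked against your own setup:

```python
# Hypothetical keypoint_head fragment. TopdownHeatmapSimpleHead
# (SimpleBaseline) uses plain deconv layers with groups=1, so no grouped
# transposed convolutions are involved.
keypoint_head=dict(
    type='TopdownHeatmapSimpleHead',
    in_channels=608,  # output channels of the S-ViPNAS-Res50 backbone
    out_channels=17,  # e.g. 17 keypoints for COCO
    loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True))
```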
Do these parameters (num_deconv_layers, num_deconv_filters, num_deconv_kernels) also need to be modified, or is it enough to change num_deconv_groups to (1, 1, 1)?
You may try changing num_deconv_groups only. Since the result is no longer the keypoint head searched by ViPNAS, you should verify the accuracy under this setting; if necessary, you can also adjust the other arguments.
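A sketch of that single change against the searched head; the surrounding values are recalled from the public s_vipnas_res50_coco_256x192.py config and should be double-checked there:

```python
# Hypothetical ViPNASHeatmapSimpleHead fragment with grouped deconvs
# replaced by ordinary ones. This deviates from the searched architecture,
# so retraining and re-validation are required.
keypoint_head=dict(
    type='ViPNASHeatmapSimpleHead',
    in_channels=608,
    out_channels=17,
    num_deconv_filters=(144, 144, 144),
    num_deconv_groups=(1, 1, 1),  # was (16, 16, 16) in the searched config
    loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True))
```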
Why are there two inference passes here? Can't it infer just once? Would that have a big impact on accuracy and confidence? https://github.com/luminxu/ViPNAS/blob/56a0630efee9d36595c0f5d3268553f273d35ad3/mmpose/models/detectors/top_down.py#L184
Flip test is a common test-time technique for more robust prediction, which empirically brings around 1% AP improvement. If you care more about inference speed, you can turn it off in the configuration.
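For reference, in mmpose-style configs the switch lives in the model's test_cfg; a minimal sketch, assuming the standard field names used by top-down configs:

```python
# Hypothetical test_cfg fragment. flip_test=False runs a single forward
# pass instead of averaging the original and horizontally flipped
# predictions.
test_cfg=dict(
    flip_test=False,
    post_process='default',
    shift_heatmap=True,
    modulate_kernel=11)
```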
As for ResNet-50, do you know where this operator is in the code? Our hardware only supports the case where the divisor of the div operator is a constant.
I cannot identify the exact operators from the figures. I am also not sure when the problem happens, e.g., at instantiation (my guess), initialization, or inference. Based on your description, you may check this line.
Is there any way to set out_channels to a fixed value? https://github.com/luminxu/ViPNAS/blob/main/mmpose/models/backbones/vipnas_resnet.py#L115
What is this attention used for? https://github.com/luminxu/ViPNAS/blob/main/mmpose/models/backbones/vipnas_resnet.py#L115 Is it necessary?
Whether to use the attention module is part of the search space and is searched for better pose estimation performance. Please refer to our paper for more details.
https://github.com/luminxu/ViPNAS/blob/main/mmpose/models/backbones/vipnas_resnet.py#L115 Why do we need 16 / out_channels here? Can't we just use 1/16?
Hello, our script does not support ConvTranspose layers with a group number greater than 1. How can we convert them so that each has groups equal to 1?
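One possible workaround (not from this thread): a grouped transposed convolution is equivalent to a groups=1 transposed convolution whose weight is block-diagonal, so a trained layer can be converted without changing its outputs. A minimal PyTorch sketch, assuming the standard nn.ConvTranspose2d weight layout (in_channels, out_channels // groups, kH, kW); the helper name grouped_deconv_to_full is hypothetical:

```python
import torch
import torch.nn as nn

def grouped_deconv_to_full(deconv: nn.ConvTranspose2d) -> nn.ConvTranspose2d:
    """Convert a grouped ConvTranspose2d into an equivalent groups=1 layer
    by embedding its weight into a block-diagonal full weight."""
    g = deconv.groups
    if g == 1:
        return deconv
    in_c, out_c = deconv.in_channels, deconv.out_channels
    full = nn.ConvTranspose2d(
        in_c, out_c,
        kernel_size=deconv.kernel_size,
        stride=deconv.stride,
        padding=deconv.padding,
        output_padding=deconv.output_padding,
        dilation=deconv.dilation,
        bias=deconv.bias is not None,
        groups=1)
    with torch.no_grad():
        full.weight.zero_()  # off-block entries stay zero
        in_per, out_per = in_c // g, out_c // g
        for k in range(g):
            # Input group k only feeds output group k.
            full.weight[k * in_per:(k + 1) * in_per,
                        k * out_per:(k + 1) * out_per] = \
                deconv.weight[k * in_per:(k + 1) * in_per]
        if deconv.bias is not None:
            full.bias.copy_(deconv.bias)
    return full

# Sanity check with made-up shapes: the two layers produce the same output.
x = torch.randn(1, 144, 16, 16)
grouped = nn.ConvTranspose2d(144, 144, 4, stride=2, padding=1, groups=16)
full = grouped_deconv_to_full(grouped)
assert torch.allclose(grouped(x), full(x), atol=1e-5)
```

The price of the conversion is extra compute and memory, since the groups=1 layer also multiplies by the zero blocks.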