taohan10200 / IIM

PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"
MIT License
163 stars 39 forks

Have you compared the results with other semantic segmentation methods? #1

Closed streamer-AP closed 3 years ago

streamer-AP commented 3 years ago

It is interesting to treat crowd localization as a segmentation task. Impressive!

Have you compared your method with other well-known segmentation methods? It seems that a common segmentation network could also be trained with the mask.

Also, in Table 3, it seems that the lower the fixed threshold, the better the performance. Have you tried thresholds lower than 0.5? As I see it, if the center is the only thing needed, a lower threshold should perform better.

By the way, your IBM/PBM module also looks suitable for other segmentation tasks. Have you tested it on other datasets such as COCO?

Best regards.

gjy3035 commented 3 years ago

  1. We didn't test other well-known segmentation models for crowd localization. The two segmentation models adapted in the paper (HRNet and VGG+FPN) are enough to demonstrate that a segmentation approach can be used to locate people in a crowd. Besides, this paper's model differs slightly from traditional segmentation models, even though both use a mask. The paper focuses on regressing a coarse confidence map and then studies how to binarize it into the segmentation map (e.g., IBM/PBM). We believe that better-performing segmentation models can improve the confidence map's quality and produce more accurate localization results, but the current work does not aim to discuss this; the segmentation network is just a submodule of our approach. In the future, we may use mainstream segmentation networks such as DeepLab to further improve performance.

  2. This is an interesting phenomenon. In our experiment with a threshold of 0.3, the results on the validation set are F1 0.748, precision 0.874, recall 0.654, MAE 146.3, and MSE 672.6. The localization performance is better, but the counting performance gets worse. The reason is that lower thresholds cannot binarize the confidence map well in regions where heads are glued together, which seriously hurts counting in some scenarios.

  3. Due to limited time and resources, we did not run experiments on general segmentation tasks. We believe the proposed scheme could be suitable for them.
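The thresholding effect described in point 2 can be illustrated with a toy example. This is only a minimal sketch, not the repository's code. It assumes that each head is counted as one connected region of the binarized confidence map. The function `count_blobs` and the array `TOY_MAP` are hypothetical names made up for illustration, and the values in the map are invented rather than taken from any experiment.

```python
from collections import deque


def count_blobs(conf_map, threshold):
    """Binarize conf_map at `threshold` and count 4-connected foreground regions.

    Hypothetical helper: assumes one connected region corresponds to one head.
    """
    rows, cols = len(conf_map), len(conf_map[0])
    mask = [[value >= threshold for value in row] for row in conf_map]
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New region found: flood-fill it with a breadth-first search.
            blobs += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs


# Invented toy confidence map: two heads (peaks of 0.9) joined by a weak
# 0.4 "bridge", mimicking a glued region.
TOY_MAP = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.6, 0.4, 0.6, 0.6, 0.0],
    [0.0, 0.6, 0.9, 0.4, 0.9, 0.6, 0.0],
    [0.0, 0.6, 0.6, 0.4, 0.6, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

for t in (0.3, 0.5):
    print(t, count_blobs(TOY_MAP, t))
# prints:
# 0.3 1
# 0.5 2
```

In this toy map, the 0.3 threshold keeps the weak bridge and merges the two heads into one region, so the count drops from 2 to 1, while the 0.5 threshold separates them. This mirrors the undercounting in glued regions described above.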

Thanks for your attention!

streamer-AP commented 3 years ago

Thank you for your patience! Looking forward to your future work!