dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization
MIT License

What's the difference between the proposed maps and small-kernel Gaussian maps? #18

Closed: Exely closed this issue 3 years ago

Exely commented 3 years ago

Thank you for your inspiring work. However, I don't understand the motivation for the FIDT maps. Visually, the FIDT map you propose looks similar to a traditional Gaussian density map when the kernel size is set small enough. What is the difference between the two? Have you compared their counting or localization performance?

dk-liang commented 3 years ago

Density maps generated with a small Gaussian kernel (e.g., kernel size 2 or 4) still overlap in extremely dense regions. An alternative is to represent the crowd directly with a point map, which has no overlap, but a point map provides too little supervision (regressing a point map is difficult), which leads to poor counting and localization performance. FIDT maps provide sufficient supervision without overlap in dense regions, combining the advantages of point maps and density maps. In our experiments, the localization performance of FIDT maps is significantly better than that of density maps, while the counting results of the two are similar.
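To make the overlap argument concrete, here is a minimal sketch, not the repository's code. It assumes the FIDT form 1 / (d^(alpha*d + beta) + C) with alpha = 0.02, beta = 0.75, C = 1, as recalled from the paper, so treat those constants as assumptions. The helper names (`nearest_head_distance`, `fidt_map`, `gaussian_density_map`, `is_strict_peak`) are hypothetical, the distance transform is brute force, and the Gaussian map is left unnormalized since only the peak positions matter here.

```python
import numpy as np

def nearest_head_distance(shape, heads):
    """Euclidean distance from every pixel to its nearest head (brute force)."""
    rows, cols = np.indices(shape)
    dists = [np.hypot(rows - r, cols - c) for r, c in heads]
    return np.min(dists, axis=0)

def fidt_map(shape, heads, alpha=0.02, beta=0.75, c=1.0):
    """FIDT-style map; the formula and constants are assumptions (see lead-in)."""
    d = nearest_head_distance(shape, heads)
    return 1.0 / (np.power(d, alpha * d + beta) + c)

def gaussian_density_map(shape, heads, sigma):
    """Sum of one unnormalized Gaussian blob per head."""
    rows, cols = np.indices(shape)
    density = np.zeros(shape)
    for r, c in heads:
        density += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
    return density

def is_strict_peak(m, r, c):
    """True if m[r, c] is strictly greater than all 8 neighbours (interior pixels only)."""
    patch = m[r - 1:r + 2, c - 1:c + 2].copy()
    center = patch[1, 1]
    patch[1, 1] = -np.inf  # exclude the centre from the neighbour maximum
    return bool(center > patch.max())

# Two heads only 3 pixels apart, standing in for a very dense region.
shape = (21, 21)
heads = [(10, 9), (10, 12)]

fidt = fidt_map(shape, heads)
gauss = gaussian_density_map(shape, heads, sigma=2.0)

for r, c in heads:
    print((r, c),
          "FIDT peak:", is_strict_peak(fidt, r, c),
          "Gaussian peak:", is_strict_peak(gauss, r, c))
```

With these settings, each head should remain its own local maximum in the FIDT-style map, because the value depends only on the nearest head. In the Gaussian map the two blobs add up, so the highest values fall between the heads and neither head pixel is a peak. For real images, a library distance transform such as `scipy.ndimage.distance_transform_edt` would likely be a better choice than the brute-force loop.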

dk-liang commented 3 years ago

Additionally, our recent work [1] also shows that small Gaussian kernels are not suitable for the crowd localization task; see Table 12 in AutoScale [1].

[1] AutoScale: Learning to Scale for Crowd Counting and Localization. https://arxiv.org/pdf/1912.09632.pdf