Yiming-M / CLIP-EBC

The official implementation of the crowd counting model CLIP-EBC.
MIT License

Json configs #15

Closed: BurstLink666 closed this issue 3 months ago

BurstLink666 commented 3 months ago
> Hi @BurstLink666,
>
> Thanks for reporting this issue! I forgot to update eval.py, which caused this bug. I have fixed it and things should work fine now! Feel free to reopen this issue if that's not the case.
>
> Many thanks, Yiming

Everything goes well now, thank you very much.

I have another question. In the .json files in the configs folder, it seems that you provide different reduction strategies for different datasets. Can you explain the meaning of the attributes "bins" and "anchor_points", especially their values? And if I want to adapt your work to a new dataset, how should I set these values?

Originally posted by @BurstLink666 in https://github.com/Yiming-M/CLIP-EBC/issues/14#issuecomment-2212953399

Yiming-M commented 3 months ago

The reduction strategy most commonly used in the paper is reduction by 8, following many regression-based methods such as DMCount and CANNet. The json files may list multiple reduction strategies because we wanted to see how different reduction strategies and binning policies affect the model's performance, as reported in Table 2.

Anchor count values are called the "representative count values" in our paper; each one is the average count value within its bin. If the bin is something like $[1, 1]$, then the representative count value for this bin is 1. If the bin is like $[2, 3]$ and we use block size 8, then we run an 8x8 sum filter over the density map to find all the local count values. We then take the local count values that fall into this bin and use their average as the representative count value (i.e., the anchor count value).
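The averaging described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the function names `local_counts` and `anchor_values` are made up, the sum filter is assumed to run over non-overlapping 8x8 blocks, the local counts are assumed to be integers (an un-smoothed dot map), and the midpoint fallback for an empty bin is my own choice.

```python
import numpy as np

def local_counts(density, block=8):
    """Sum a 2D density map over non-overlapping block x block windows."""
    h, w = density.shape
    h, w = h - h % block, w - w % block  # crop so the map tiles evenly
    d = density[:h, :w]
    return d.reshape(h // block, block, w // block, block).sum(axis=(1, 3))

def anchor_values(counts, bins):
    """Mean of the local counts that fall into each inclusive bin [lo, hi]."""
    counts = np.asarray(counts, dtype=float).ravel()
    anchors = []
    for lo, hi in bins:
        in_bin = counts[(counts >= lo) & (counts <= hi)]
        # Assumption: fall back to the midpoint when no sample lands in the bin.
        anchors.append(float(in_bin.mean()) if in_bin.size else (lo + hi) / 2)
    return anchors
```

In practice the local counts would be pooled over every training image of the new dataset before averaging, so that each bin's anchor reflects the whole dataset rather than a single density map.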

For simplicity, you may use bins with only one element in each (e.g., $[0,0]$, $[1,1]$, $[2,2]$, $[3,3]$, $[4,4]$). In this case, the anchor count values will be 0, 1, 2, 3, 4.
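For a new dataset, the single-element setup could be generated with something like the sketch below. The helper name `singleton_bins` and the `max_count` parameter are hypothetical, and the exact layout the json configs expect may differ, so treat the returned lists as the values only. Counts above `max_count` are not covered by this sketch.

```python
def singleton_bins(max_count):
    """Build bins [c, c] for c = 0..max_count and their anchor values."""
    bins = [[c, c] for c in range(max_count + 1)]
    # With one element per bin, the average count in a bin is that element.
    anchor_points = [float(lo) for lo, _ in bins]
    return bins, anchor_points

bins, anchor_points = singleton_bins(4)
```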

Bins with more than one element collect more samples each, but calculating the average can be cumbersome. An alternative is to use the midpoint of each bin as the representative value (e.g., 4.5 for $[4, 5]$), but this is less accurate because the local count values follow a long-tailed distribution.
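A small made-up example of why the midpoint can be off: when smaller counts are much more frequent than larger ones, the average inside a bin sits below the midpoint. The histogram below is invented for illustration and does not come from any dataset.

```python
# Hypothetical histogram of local counts inside the bin [4, 7]:
# count value -> number of blocks with that count (long-tailed).
hist = {4: 8, 5: 4, 6: 2, 7: 1}
lo, hi = 4, 7

total = sum(hist.values())
mean_anchor = sum(value * n for value, n in hist.items()) / total
midpoint_anchor = (lo + hi) / 2

# The mean is pulled toward the frequent small counts.
print(f"mean: {mean_anchor:.3f}, midpoint: {midpoint_anchor:.1f}")
```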

BurstLink666 commented 3 months ago

Got it. Thank you very much again for answering my question.