taohan10200 / IIM

PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"
MIT License

Confidence level? #33

Open fatbringer opened 1 year ago

fatbringer commented 1 year ago

Hi! Thanks for the wonderful repo for detecting heads and counting people.

I'm wondering: other than the position of each detected head, is it possible to also get a confidence level or probability map?

taohan10200 commented 1 year ago

Hi, our model can output the confidence map. Please refer to Figure 2 to check.

fatbringer commented 1 year ago

@taohan10200 I see

From the readme.md: "The sub images are the input image, GT, prediction map, localization result, and pixel-level threshold, respectively:" [image: vis]

Is it the one in the middle bottom?

If I want to extract the array of confidence levels, is it the pred_map variable at line 144 in test.py?

            pred_map = torch.zeros(b, 1, h, w).cpu()

taohan10200 commented 1 year ago
  1. Is it the one in the middle bottom? No, the one in the middle of the bottom row is the binary map. The confidence map is actually the last (third) one in the top row.

  2. Yes, pred_map can represent the confidence level.
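
For readers who want a per-head confidence value, here is a minimal sketch of reading pred_map at each detected location. It assumes pred_map has already been reduced to a 2-D NumPy array (for example pred_map[0, 0] moved to NumPy) and that the points are (x, y) pixel coordinates. The helper name confidence_at_points and the coordinate order are assumptions for illustration, not something taken from test.py.

```python
import numpy as np


def confidence_at_points(pred_map, points):
    """Return the pred_map value at each (x, y) location.

    pred_map: 2-D float array of shape (H, W).
    points:   iterable of (x, y) pixel coordinates (assumed order).
    """
    h, w = pred_map.shape
    scores = []
    for x, y in points:
        # Round to the nearest pixel and clamp to the image bounds.
        col = min(max(int(round(x)), 0), w - 1)
        row = min(max(int(round(y)), 0), h - 1)
        scores.append(float(pred_map[row, col]))
    return scores


# Toy example: a 4x5 map with two confident peaks.
toy_map = np.zeros((4, 5), dtype=np.float32)
toy_map[1, 2] = 0.9
toy_map[3, 4] = 0.6
toy_scores = confidence_at_points(toy_map, [(2, 1), (4, 3), (0, 0)])
```

If the points in your pipeline are stored as (row, col) instead, swap the two indices in the lookup.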

fatbringer commented 1 year ago

Hey @taohan10200, I'm back again. I have been looking at the pred_map and points variables, and I found something interesting.

Screenshot from 2023-11-02 16-36-33

I noticed that sometimes only one point is generated for several nearby squares. Also, to check with you: can pred_threshold be made more lenient so that the count increases?

I also noticed this variable "mask":

            pred_map = (pred_map / mask)
            pred_threshold = (pred_threshold / mask)

What does the mask actually do?

taohan10200 commented 1 year ago
  1. You can lower the threshold to get a higher count, but noise may be miscounted as heads when the threshold is too small.

  2. The mask is used for inference on high-resolution images. When a high-resolution image is cropped into patches, the patches may overlap, so the mask represents the overlap regions of those patches.
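
To illustrate why dividing by the mask makes sense, here is a minimal sketch of stitching overlapping patch predictions back into one map. It assumes the mask counts how many patches cover each pixel, so that pred_map / mask becomes an average over the overlap. That reading, the helper name stitch_patches, and the (top, left) offset format are assumptions for illustration and have not been checked against test.py.

```python
import numpy as np


def stitch_patches(patches, offsets, full_shape):
    """Sum patch predictions into a full-size map and average the overlaps.

    patches:    list of 2-D arrays (per-patch predictions).
    offsets:    list of (top, left) positions of each patch in the full image.
    full_shape: (H, W) of the full image; every pixel is assumed to be
                covered by at least one patch, so mask is never zero.
    """
    h, w = full_shape
    pred_map = np.zeros((h, w), dtype=np.float64)
    mask = np.zeros((h, w), dtype=np.float64)
    for patch, (top, left) in zip(patches, offsets):
        ph, pw = patch.shape
        pred_map[top:top + ph, left:left + pw] += patch
        mask[top:top + ph, left:left + pw] += 1.0  # coverage count
    return pred_map / mask, mask


# Two 2x4 patches covering a 2x6 image; columns 2 and 3 overlap.
patch_a = np.full((2, 4), 0.2)
patch_b = np.full((2, 4), 0.6)
stitched, coverage = stitch_patches([patch_a, patch_b], [(0, 0), (0, 2)], (2, 6))
```

In this toy case the overlapping columns get the mean of the two patch values, while the non-overlapping columns are left unchanged because their coverage count is 1.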

fatbringer commented 1 year ago

If I want to allow a higher count and am OK with some miscounts, what is a good adjustment to use?

Should I multiply the threshold by a small number, like threshold * 0.5? Or should I apply cv2.dilate to the binary map produced?

Some of my images are taken at night, so the pred_map has some valid values, but they are too small and end up not being counted.

taohan10200 commented 1 year ago

You can lower the threshold when transforming the pred_map into a binary map.
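
A minimal sketch of what lowering the threshold at binarization time could look like. It assumes the binary map is an element-wise comparison of pred_map against a per-pixel pred_threshold, and it counts heads as 4-connected blobs with a plain flood fill. The names binarize, count_blobs, and the scale parameter are hypothetical, and the repo's own binarization and counting code may differ.

```python
import numpy as np


def binarize(pred_map, pred_threshold, scale=1.0):
    """Return 1 where pred_map >= scale * pred_threshold, else 0.

    scale < 1 makes the threshold more lenient (assumed knob, not a repo flag).
    """
    return (pred_map >= scale * pred_threshold).astype(np.uint8)


def count_blobs(binary_map):
    """Count 4-connected foreground regions with a simple flood fill."""
    h, w = binary_map.shape
    seen = np.zeros((h, w), dtype=bool)
    count = 0
    for r in range(h):
        for c in range(w):
            if binary_map[r, c] == 0 or seen[r, c]:
                continue
            count += 1  # found an unvisited blob; mark all of its pixels
            stack = [(r, c)]
            seen[r, c] = True
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary_map[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        stack.append((ny, nx))
    return count


# Toy map: one strong response and one weak two-pixel response.
toy_pred = np.array([[0.9, 0.1, 0.0, 0.3],
                     [0.1, 0.0, 0.0, 0.3],
                     [0.0, 0.0, 0.0, 0.0]])
toy_thr = np.full_like(toy_pred, 0.5)
strict = count_blobs(binarize(toy_pred, toy_thr))            # threshold 0.5
lenient = count_blobs(binarize(toy_pred, toy_thr, scale=0.5))  # threshold 0.25
```

Here the weak response is only counted with the lenient setting. Note that a lower threshold can also make neighbouring blobs touch and merge into one, so the count does not always go up.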

fatbringer commented 1 year ago

I tried on a few images, and these are the values I got from the pred_map:

25th percentile | Median | 75th percentile | 90th percentile | Max
-- | -- | -- | -- | --
0.000227584787353408 | 0.00065783877 | 0.00176704095792957 | 0.0109336498193443 | 0.93879163
0.000194667365576606 | 0.00053325976 | 0.00165790767641738 | 0.00777001436799764 | 0.8905558
0.000234340786846587 | 0.00092632766 | 0.00920665194280446 | 0.0882901914417744 | 0.9853317
0.000168121478054672 | 0.0004524978 | 0.00122064480092376 | 0.00313314381055534 | 0.94432545
0.000218632791074924 | 0.00065056863 | 0.00357176392572001 | 0.0424249794334173 | 0.97327846
0.00041541330574546 | 0.0018048736 | 0.0203839614987373 | 0.147194217145443 | 0.99654627
0.000632454815786332 | 0.0024422049 | 0.0156496367417276 | 0.0634994350373745 | 0.9600536
0.000240619490796234 | 0.0006701163 | 0.0019299341365695 | 0.00715797664597631 | 0.8866419
0.00028129038400948 | 0.00078481884 | 0.00266420183470473 | 0.015467349998653 | 0.9374716
0.000315061377477832 | 0.0009813908 | 0.00379698618780822 | 0.0280545573681593 | 0.99406016
0.000528485950781032 | 0.0015583441 | 0.00757352309301496 | 0.044178881123662 | 0.9543941
0.000236529886024073 | 0.0006938946 | 0.00341518298955634 | 0.0258885353803635 | 0.9731425
0.0002100293750118 | 0.00046620745 | 0.00142796660657041 | 0.0261044861748814 | 0.9620243
0.000391625668271445 | 0.0014016973 | 0.00722801988013089 | 0.0547232337296009 | 0.9972284
0.000338588404702023 | 0.0010491939 | 0.00416168849915266 | 0.0228024385869503 | 0.95380545
0.00212409498635679 | 0.013215018 | 0.0577034335583448 | 0.223456771671772 | 0.9940021
0.000407069113862235 | 0.0013056945 | 0.00517463218420744 | 0.0229147665202617 | 0.95492464
0.00826335977762938 | 0.03685826 | 0.153363801538944 | 0.411073824763298 | 0.97953063
0.000324716442264616 | 0.0014018507 | 0.00628324795980006 | 0.0276476550847292 | 0.9497025
0.000359946090611629 | 0.0024385485 | 0.041068715043366 | 0.181282731890678 | 0.9545076
0.000177024761796929 | 0.00042649882 | 0.00100560131249949 | 0.00255148208234459 | 0.9446191
0.000398650074203033 | 0.0013338285 | 0.0114095346070826 | 0.0938105553388597 | 0.9411398
0.00054581837321166 | 0.002175008 | 0.0143224969506264 | 0.0861815460026267 | 0.90224934
0.000533784361323342 | 0.004247153 | 0.052973534911871 | 0.198961299657822 | 0.9726509
0.000430250045610592 | 0.0013449136 | 0.00916040572337806 | 0.0811791822314263 | 0.9484006
0.000891148651135154 | 0.0050512687 | 0.0408252645283937 | 0.126598091423512 | 0.92680895
0.000466320678242482 | 0.0012214757 | 0.0034151166328229 | 0.0102005018852651 | 0.8576134
0.000356850032403599 | 0.0013249489 | 0.00636411120649427 | 0.0430008441209793 | 0.96803534
0.000626671375357546 | 0.0016192745 | 0.00488035578746349 | 0.0307323243469 | 0.9404692
0.000227514945436269 | 0.00058530585 | 0.00149741262430325 | 0.00430293162353337 | 0.9438938
0.000496680644573644 | 0.0016554631 | 0.0111775402911007 | 0.0576809324324132 | 0.91685975
0.000175861074239947 | 0.00046023302 | 0.00112474046181887 | 0.00328343338333071 | 0.97611463
0.000300693951430731 | 0.0009975006 | 0.00365002348553389 | 0.0157472033053637 | 0.9390901
0.000155709774844581 | 0.00042323183 | 0.00113594910362735 | 0.00309471501968801 | 0.9357061
0.000198222358449129 | 0.00050764985 | 0.00140502038993873 | 0.00452038552612067 | 0.99700755
0.000579676518100314 | 0.0015869758 | 0.00525062158703804 | 0.0221172735095024 | 0.94972193
0.000336218829033896 | 0.0015319079 | 0.0175112709403038 | 0.1253966152668 | 0.95290637
0.000575630474486388 | 0.0012831714 | 0.00320056668715551 | 0.0115161083638668 | 0.9220723
0.000824070593807846 | 0.00418724 | 0.0336605682969093 | 0.113897684216499 | 0.93831986
0.00075974794162903 | 0.004304463 | 0.0319418758153915 | 0.103274673223496 | 0.96269846
0.000934409719775431 | 0.003661763 | 0.0269821644760668 | 0.140307494997978 | 0.98259616
0.000905818204046227 | 0.0036669944 | 0.0232212422415614 | 0.11286867633462 | 0.9553211
0.000275519159913529 | 0.0007743646 | 0.00299293280113488 | 0.0401690136641264 | 0.95771384
0.000341998886142392 | 0.00097436825 | 0.00302345457021147 | 0.017917924374342 | 0.9285381
0.000477230569231324 | 0.0015343723 | 0.00538706476800144 | 0.0265436189249158 | 0.9690722
0.000129914584249491 | 0.00048877863 | 0.00279881269671023 | 0.0423766769468784 | 0.8770999
0.000146102131111547 | 0.00042334554 | 0.00182684816536494 | 0.0193046942353249 | 0.95974356
0.00033833592897281 | 0.0009234102 | 0.00301122071687132 | 0.01575922742486 | 0.9986534
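
For anyone who wants to reproduce this kind of summary on their own images, here is a minimal sketch of computing one row of such statistics with np.percentile. It assumes pred_map has been converted to a NumPy array first. The helper name summarize_pred_map is hypothetical, and the input below is toy data, not values from the repo.

```python
import numpy as np


def summarize_pred_map(pred_map):
    """Return the 25th/50th/75th/90th percentiles and the max of a map."""
    values = np.asarray(pred_map, dtype=np.float64).ravel()
    p25, p50, p75, p90 = np.percentile(values, [25, 50, 75, 90])
    return {
        "p25": float(p25),
        "median": float(p50),
        "p75": float(p75),
        "p90": float(p90),
        "max": float(values.max()),
    }


# Toy input: 101 evenly spaced values from 0.00 to 1.00.
toy_values = np.arange(101) / 100.0
toy_stats = summarize_pred_map(toy_values)
```

Because most pixels in a crowd image are background, percentiles over the whole map are dominated by near-zero values, which is consistent with the small numbers in the table.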

What's a good value to adjust the pred_threshold by?

I have tried adding the median, multiplying by 0.5, and multiplying by 0.01. None of them feels very sensible.