Open samymdihi opened 3 years ago
Hi, from your description, it looks like the problem is an imbalance between the count loss and the OT loss.
The hyperparameters used in the code are tuned for dense images. In your case, you can try giving the count loss a higher weight [currently it is set to 1] or reducing the relative weight of the OT loss.
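To make the "higher count weight or relatively lower OT weight" remark concrete, here is a minimal sketch of a weighted sum of the three loss terms. The function name, the weight names (`w_count`, `w_ot`, `w_tv`) and the numbers are hypothetical placeholders for illustration, not the repository's actual code or defaults. It only shows that multiplying the count weight by 10 gives the same objective, up to an overall scale factor, as dividing the other weights by 10.

```python
import math


def total_loss(count_loss, ot_loss, tv_loss, w_count=1.0, w_ot=1.0, w_tv=1.0):
    """Weighted sum of the three terms (hypothetical names and weights)."""
    return w_count * count_loss + w_ot * ot_loss + w_tv * tv_loss


# Made-up loss values, used only to compare the two weighting schemes.
count_loss, ot_loss, tv_loss = 0.5, 2.0, 0.3

# Option A: raise the count weight by a factor of 10.
option_a = total_loss(count_loss, ot_loss, tv_loss, w_count=10.0)

# Option B: keep the count weight at 1 and divide the other weights by 10.
option_b = total_loss(count_loss, ot_loss, tv_loss, w_ot=0.1, w_tv=0.1)

# The two objectives differ only by the constant factor 10.
same_up_to_scale = math.isclose(option_a, 10.0 * option_b)

if __name__ == "__main__":
    print(f"option A = {option_a}, option B = {option_b}")
    print("same up to a factor of 10:", same_up_to_scale)
```

With plain SGD that constant factor behaves like a change of learning rate, so the two options are not exactly interchangeable unless the learning rate is adjusted as well.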
You can also try changing the regularizer in Sinkhorn. This controls how spread out the heatmap is.
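As a rough illustration of why the regularizer affects spread, here is a small pure-Python Sinkhorn sketch on a toy 1-D problem. It assumes the regularizer is the usual entropic term, with kernel exp(-cost / reg). The helper names `sinkhorn_plan` and `diagonal_mass` and the toy setup are mine, not taken from the repository. Source and target are identical uniform distributions, so an unregularized plan would leave all mass in place. The fraction of mass that stays on the diagonal shows how much the plan is blurred.

```python
import math


def sinkhorn_plan(a, b, cost, reg, n_iters=200):
    """Entropic OT plan between histograms a and b via Sinkhorn iterations."""
    n, m = len(a), len(b)
    # Gibbs kernel: a larger reg flattens the kernel and blurs the plan.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def diagonal_mass(reg, n=5):
    """Fraction of mass the plan keeps in place on a toy 1-D grid."""
    a = [1.0 / n] * n
    b = [1.0 / n] * n
    # Squared distance between grid positions 0..n-1.
    cost = [[float((i - j) ** 2) for j in range(n)] for i in range(n)]
    plan = sinkhorn_plan(a, b, cost, reg)
    return sum(plan[i][i] for i in range(n))


if __name__ == "__main__":
    for reg in (0.1, 1.0, 10.0, 100.0):
        print(f"reg={reg:>5}: mass kept in place = {diagonal_mass(reg):.3f}")
```

In this toy setting a small `reg` keeps the plan sharp, while a large `reg` pushes it toward spreading mass evenly over all positions. How that translates to the heatmaps in the actual training code still needs to be checked experimentally.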
Thanks a lot for your answer. Yes, there seems to be an imbalance between the count loss and the OT loss, as you can see in the logs:
INFO - -----Epoch 999/1000-----
2021-11-07 02:25:44,875 - INFO - Epoch 999 Train, Loss: 0.04, OT Loss: 7.70e-08, Wass Distance: 125.92, OT obj value: 10.93, Count Loss: 0.03, TV Loss: 0.01, MSE: 0.07 MAE: 0.03, Cost 121.7 sec
2021-11-07 02:25:45,134 - INFO - -----Epoch 1000/1000-----
2021-11-07 02:27:48,178 - INFO - Epoch 1000 Train, Loss: 0.04, OT Loss: -6.48e-08, Wass Distance: 126.12, OT obj value: 10.94, Count Loss: 0.03, TV Loss: 0.01, MSE: 0.07 MAE: 0.03, Cost 123.0 sec
I tried putting a higher weight on the count loss, but it doesn't seem to have a significant impact on inference. Do you still think it is a good idea to reduce the weight of the OT loss, considering that my images are not really dense? Does the OT loss really have an impact on my problem?
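For reference, here is a quick way to quantify the imbalance from the epoch-1000 log line above. This is only a sketch, and it assumes the logged OT, count and TV values are the terms as they enter the total, which is consistent with Loss: 0.04 being roughly 0.03 + 0.01. The helper name `loss_shares` is hypothetical.

```python
# Values copied from the epoch-1000 log line quoted above.
logged = {"OT Loss": -6.48e-08, "Count Loss": 0.03, "TV Loss": 0.01}


def loss_shares(terms):
    """Share of each term in the summed magnitude of all terms."""
    total = sum(abs(value) for value in terms.values())
    return {name: abs(value) / total for name, value in terms.items()}


shares = loss_shares(logged)

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share:.6%} of the summed magnitude")
```

Under that assumption the OT term is several orders of magnitude smaller than the other two at this point in training, so changing the count weight alone would barely alter the balance between them.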
Should I assign a higher value to the regularizer in Sinkhorn? And what about the TV loss, and the reg, num_of_iter_in_ot, and norm_cood parameters?
Hello, thanks for the great work!
Do you have any advice on how to use your work with smaller images that are not technically crowds, but rather two, three, or four people occluding each other? I tried to retrain the model, and I am getting heatmaps that localize the heads pretty well, but the model doesn't count the people in the image properly.
For example, I can see that two areas have been located on the heatmap, but only one person is counted in the end. Do you recommend changing something in the code for images that are less dense than a crowd? Also, for your information, I trained on a dataset of small images (approximately 100x60 pixels), so I added padding to reach a size of 512x512.
Any advice would be highly appreciated. Thanks