dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization
MIT License
173 stars 42 forks

SHA initial lr is 1e-4 and crop size is 256x256? Cannot reproduce the results when I train from scratch #16

Closed knightyxp closed 3 years ago

knightyxp commented 3 years ago

HRNet (pretrained), MSE loss: MAE 66.049 | MSE 105.703

rydenisbak commented 1 year ago

Hello @knightyxp. 1) This baseline code doesn't provide the I-SSIM loss, so your baseline, according to the paper, is the L2-loss result: 62.1 MAE. 2) As you can see, the training process is unstable, so you should try several random seeds and check the val MSE every epoch.
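The seed sweep with per-epoch validation suggested above could be organized roughly like this. It is only a sketch: `run_epoch` is a hypothetical callback standing in for "train one epoch, then validate", and a real run would also need to seed NumPy and PyTorch, which is left out here to keep the example dependency-free.

```python
import random
from typing import Callable, Iterable, Tuple


def sweep_seeds(
    seeds: Iterable[int],
    num_epochs: int,
    run_epoch: Callable[[int, int], float],
) -> Tuple[int, int, float]:
    """Return (seed, epoch, val_mae) of the best epoch over all seeds.

    `run_epoch(seed, epoch)` is a hypothetical callback: it should train for
    one epoch and return the validation MAE measured right afterwards.
    """
    best = (-1, -1, float("inf"))
    for seed in seeds:
        random.seed(seed)  # a real run would also seed NumPy / PyTorch here
        for epoch in range(num_epochs):
            val_mae = run_epoch(seed, epoch)
            # Validate every epoch so a short-lived good checkpoint is not missed.
            if val_mae < best[2]:
                best = (seed, epoch, val_mae)
    return best
```

Because training is unstable, the best epoch may be isolated, so in practice the checkpoint should be saved whenever `best` is updated.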

I got 61.9 val MAE at epoch 1770; the seed was 89.

Maybe LR warmup, or averaging the count over several thresholds (for example, instead of 100 you can try range(95, 106) and take the mean), can improve stability.
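Both ideas can be sketched in plain Python. The sketch assumes that the threshold value 100 means `100 / 255 * max(pred)` applied to the predicted map, and that heads are counted as 3x3 local maxima above that threshold; the function names `count_peaks`, `mean_count`, and `warmup_lr` are illustrative and not part of the repository.

```python
from typing import List, Sequence

Grid = List[List[float]]


def count_peaks(pred: Grid, threshold: float) -> int:
    """Count pixels >= threshold that are the maximum of their 3x3 window.

    Ties on a flat plateau are each counted; this is a simplification.
    """
    h = len(pred)
    w = len(pred[0]) if h else 0
    count = 0
    for y in range(h):
        for x in range(w):
            v = pred[y][x]
            if v < threshold:
                continue
            neighbours = [
                pred[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            if v >= max(neighbours):
                count += 1
    return count


def mean_count(pred: Grid, levels: Sequence[int] = range(95, 106)) -> float:
    """Average the peak count over thresholds level / 255 * max(pred)."""
    peak = max(max(row) for row in pred)
    counts = [count_peaks(pred, level / 255.0 * peak) for level in levels]
    return sum(counts) / len(counts)


def warmup_lr(step: int, warmup_steps: int, base_lr: float = 1e-4) -> float:
    """Linear warmup: ramp from base_lr / warmup_steps up to base_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps
```

Averaging over neighbouring thresholds smooths out the case where a weak peak sits right at the cutoff, so a small change in the predicted map no longer flips the count by a whole person.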