ZhenglinZhou / STAR

[CVPR 2023] STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

About modifying model input size #29

Closed by realphongha 3 months ago

realphongha commented 5 months ago

Hi, first of all thanks for your good work with STAR loss. I encountered an issue when trying to reduce the network input size from (256, 256) to (112, 112) and train the model on 300W from scratch:

[Image: Screenshot 2024-04-08 at 10.07.40, showing the error]

The training process still works fine with the original configs (256x256). Do you have any idea what to change to fix this?

Btw, could you give me some advice on how to prepare other hyperparams when changing the input size? Thank you very much.

ZhenglinZhou commented 4 months ago

Hi @realphongha, thanks for your interest!

We carried out the resolution ablation study at 128px and 64px. Have you tested these two settings, and did you encounter this error again? The hyperparameters remain the same as with 256px. We observed that using a small batch size (e.g. bs=32) can yield more stable results.

Feel free to leave comments, and looking forward to receiving your feedback!
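The error text from the screenshot is not preserved in this thread, so the following is only one possible explanation for why 128px and 64px train while 112px fails. It assumes an hourglass-style backbone with a stride-4 stem followed by four 2x pooling levels. Both numbers are assumptions for illustration, not values confirmed in this thread, and `hourglass_compatible` is a hypothetical helper. Under those assumptions, any feature map that becomes odd before the last pooling step no longer matches its skip connection after upsampling, and the sketch below checks for that:

```python
def hourglass_compatible(input_size: int, stem_stride: int = 4, depth: int = 4) -> bool:
    """Return True if the feature map halves exactly at every pooling level.

    stem_stride and depth are assumed values for illustration only; check
    them against the actual network configuration before relying on this.
    """
    if input_size % stem_stride != 0:
        return False
    size = input_size // stem_stride
    for _ in range(depth):
        if size % 2 != 0:
            # An odd size is floored by pooling, so upsampling by 2 later
            # yields size - 1 and the skip connection no longer matches.
            return False
        size //= 2
    return True


for candidate in (256, 128, 112, 64):
    print(candidate, hourglass_compatible(candidate))
```

With these assumed settings, 112px gives a 28x28 map after the stem, then 14x14, then 7x7, which cannot be halved cleanly. If this is the cause, padding or resizing to the nearest compatible size such as 128px would avoid it.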

realphongha commented 3 months ago

Thanks for your answer. An input size of 128x128 worked fine for me without any bugs. With bs=64 there's only a small drop in accuracy (i.e., NME=2.99 on 300W) when compared to 256x256 so I think it should be fine.
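For anyone comparing numbers across input sizes, here is a minimal sketch of how an NME value like the one above is commonly computed on 300W. It assumes the inter-ocular normalization (distance between the outer eye corners, zero-based indices 36 and 45 in the 68-point annotation) and reports the result as a percentage. The thread does not state which normalization was used, and `nme_percent` is a hypothetical helper, not a function from the STAR repository.

```python
import math


def nme_percent(pred, gt, left_eye=36, right_eye=45):
    """Normalized mean error in percent for one face.

    pred, gt: sequences of (x, y) landmark coordinates in the same image space.
    left_eye, right_eye: indices of the outer eye corners (assumed 68-point
    300W layout, zero-based); the normalizer is the ground-truth distance
    between them.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of landmarks")
    norm = math.dist(gt[left_eye], gt[right_eye])
    if norm == 0:
        raise ValueError("normalization distance is zero")
    mean_err = sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)
    return 100.0 * mean_err / norm
```

Because both the error and the normalizer are measured in the same pixel space, predictions made at 128x128 should be compared against ground truth scaled to the same resolution (or both mapped back to the original image) so that the values stay comparable with the 256x256 runs.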