geopavlakos / hamer

HaMeR: Reconstructing Hands in 3D with Transformers
https://geopavlakos.github.io/hamer/
MIT License

Fixed input size for pre-trained model #21

Closed. TRS07170 closed this issue 6 months ago.

TRS07170 commented 7 months ago

Dear authors,

Thank you for your amazing work. I was playing with the demo model. When I tried to input images of a size other than 256, the model raised the following error:

assert model_cfg.MODEL.IMAGE_SIZE == 256, f"MODEL.IMAGE_SIZE ({model_cfg.MODEL.IMAGE_SIZE}) should be 256 for ViT backbone"
AssertionError: MODEL.IMAGE_SIZE (224) should be 256 for ViT backbone

Is there any way to adjust the model so it can accept images of different sizes, or should I train my own model?

geopavlakos commented 6 months ago

The model takes a crop of an image around a hand. The best practice is to resize that crop to 256x256. Even if your crops are larger or smaller, it is better to resize them to 256x256 before passing them to the network.
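For reference, here is a minimal sketch of that resizing step, assuming OpenCV and PyTorch and standard ImageNet normalization. The `prepare_hand_crop` helper and the `model` variable are hypothetical stand-ins for illustration, not the repo's demo pipeline; check `model_cfg` for the actual normalization constants your checkpoint uses.

```python
# Minimal sketch (assumptions, not the official demo code): resize an
# arbitrary hand crop to the 256x256 input the pre-trained ViT backbone expects.
import cv2
import numpy as np
import torch

def prepare_hand_crop(crop_bgr: np.ndarray, image_size: int = 256) -> torch.Tensor:
    """Resize a hand crop of any resolution to image_size x image_size and
    return a normalized 1x3xHxW float tensor."""
    resized = cv2.resize(crop_bgr, (image_size, image_size),
                         interpolation=cv2.INTER_LINEAR)
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    # Standard ImageNet mean/std; verify against model_cfg.MODEL if available.
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    normalized = (rgb - mean) / std
    return torch.from_numpy(normalized).permute(2, 0, 1).unsqueeze(0)

# Hypothetical usage: crop the detected hand box from the full frame first,
# then resize the crop rather than feeding the full-resolution image.
# batch = prepare_hand_crop(hand_crop)
# out = model(batch)  # `model` loaded via the repo's checkpoint utilities
```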