geopavlakos / hamer

HaMeR: Reconstructing Hands in 3D with Transformers
https://geopavlakos.github.io/hamer/
MIT License

Configuration Setting used to train the SOTA model #38

Closed. aviralchharia closed this issue 3 months ago.

aviralchharia commented 4 months ago

Hi, thank you for the amazing paper. It is great to see 3D hand reconstruction using a fully transformer-based approach.

From the current settings in the code, I see that HaMeR was trained for 1_000_000 steps with 8 GPUs and a batch size of 8 (so an effective batch size of 64). Also, a sum() loss is used in the study.

I found that PyTorch Lightning results are affected by the batch size and the number of GPUs: results are not consistent when the number of GPUs changes, even with the same effective batch size (https://github.com/Lightning-AI/pytorch-lightning/issues/6789). Could the authors comment on the configuration they used for training the SOTA model reported in the paper?
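To make the concern concrete, here is a minimal sketch (plain PyTorch, not HaMeR or Lightning code) of one way this can happen: DDP averages gradients across processes, so with a sum()-reduced loss the gradient scale depends on how the effective batch is split across GPUs, even when the effective batch size stays the same.

```python
# Minimal sketch (plain PyTorch, not HaMeR or Lightning code): DDP *averages*
# gradients across processes, so a sum() loss gives differently scaled
# gradients depending on how the same effective batch is split across GPUs.
import torch

batch, workers = 64, 8                       # effective batch 64, split over 8 "GPUs"
x = torch.randn(batch, 10)
w = torch.zeros(10, requires_grad=True)

# Single process: sum() loss over the full batch of 64.
loss_single = (x @ w - 1.0).pow(2).sum()
g_single = torch.autograd.grad(loss_single, w)[0]

# DDP-style: each worker computes a sum() loss over its shard of 8 samples,
# then the per-worker gradients are averaged (what DDP's all-reduce does).
shard_grads = [
    torch.autograd.grad((shard @ w - 1.0).pow(2).sum(), w)[0]
    for shard in x.chunk(workers)
]
g_ddp = torch.stack(shard_grads).mean(0)

print(torch.allclose(g_single, g_ddp * workers))  # True: the DDP gradient is 1/8 as large
```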

1. Is the released configuration already the same one used to train the SOTA model, i.e., with the training strategy set to ddp?
2. Was the pose filter set to False during training? `FILTER_NO_POSES: False # If True, filters images that don't have poses`
3. For an apples-to-apples comparison, does one need to keep the same GPU configuration? I don't have access to a large GPU cluster, so I was thinking of training for 8_000_000 steps on 1 GPU with a batch size of 8. Do you think such a configuration can achieve the same results as HaMeR?

geopavlakos commented 3 months ago

I am not aware of this pytorch-lightning bug. It is likely that the exact configuration matters, but we were mostly tracking the effective batch size during training, not how that was distributed across GPUs.

The default configuration in the code is mostly the same as the one we used to train our public model - the only difference is the effective batch size. In the released code, we use a small batch size so that more users can run an example training run, but our model was trained with an effective batch size of 1024 for 420k iterations.
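For anyone trying to approximate that setup on fewer GPUs, gradient accumulation is the usual way to recover the effective batch size. The sketch below is generic PyTorch Lightning, not the HaMeR training config; the per-GPU batch size and GPU count are placeholder values, and whether accumulated gradients behave exactly like a larger per-step batch still depends on the loss reduction discussed above.

```python
# Generic PyTorch Lightning sketch (not the HaMeR config; variable names are
# illustrative): reaching an effective batch size of 1024 on limited hardware
# via gradient accumulation.
import pytorch_lightning as pl

per_gpu_batch_size = 8           # the batch size the DataLoader uses per GPU
num_gpus = 2                     # whatever hardware is available
target_effective_batch = 1024    # value quoted above for the released model

# 1024 = 8 (per-GPU batch) * 2 (GPUs) * 64 (accumulation steps)
accumulate = target_effective_batch // (per_gpu_batch_size * num_gpus)

trainer = pl.Trainer(
    devices=num_gpus,                    # `gpus=` in older Lightning versions
    strategy="ddp",
    accumulate_grad_batches=accumulate,  # 1024 samples contribute to each optimizer step
    max_steps=420_000,                   # iteration count quoted above
)
```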

We didn't experiment with longer runs, but I expect that the effective batch size is important. Training for longer with a small batch size probably will not lead to the same results.
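Putting the numbers from this thread side by side (just arithmetic, setting aside the batch-size dynamics): the proposed 1-GPU run sees roughly as many samples as the released example configuration, but far fewer than the run behind the public model.

```python
# Samples seen per run, using only the numbers quoted in this thread.
public_model = 420_000 * 1024      # ~430M samples (effective batch 1024)
released_cfg = 1_000_000 * 64      # ~64M samples (8 GPUs x batch 8)
proposed_run = 8_000_000 * 8       # ~64M samples (1 GPU x batch 8)
```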

aviralchharia commented 3 months ago

Thank you so much!