facebookresearch / PoseDiffusion

[ICCV 2023] PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

how to select len_train #39

Open ylyyim opened 1 month ago

ylyyim commented 1 month ago

Hello, I noticed in the cfg file that you set len_train to 16384. How did you arrive at this number? Was it obtained by dividing the total number of training images by the maximum image count?

jytime commented 1 month ago

Hi,

It is not an important hyperparameter. len_train only controls how long an epoch is. I chose 16384 because it makes each epoch take around 30 minutes on my machine. You can set it to any reasonable number.
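As an illustration of how a setting like len_train can fix the epoch length independently of the dataset size, here is a minimal sketch. It is not the repository's actual dataset class: `FixedLengthEpochDataset`, `sequences`, and `seed` are hypothetical names, and drawing a random sequence on every access is an assumption.

```python
import random


class FixedLengthEpochDataset:
    """Map-style dataset whose reported length is decoupled from the data size.

    Hypothetical sketch: the names here are illustrative and are not taken
    from the PoseDiffusion code base.
    """

    def __init__(self, sequences, len_train=16384, seed=0):
        self.sequences = sequences  # assumed: a list of training sequences
        self.len_train = len_train  # number of samples that make up one epoch
        self.rng = random.Random(seed)

    def __len__(self):
        # A data loader iterating over this dataset sees len_train items,
        # no matter how many sequences actually exist.
        return self.len_train

    def __getitem__(self, index):
        # The index is ignored: every access draws a random sequence.
        return self.rng.choice(self.sequences)


dataset = FixedLengthEpochDataset(["seq_a", "seq_b", "seq_c"], len_train=8)
batch_size = 4
steps_per_epoch = len(dataset) // batch_size  # 8 // 4 = 2 steps per epoch
```

Under this assumption, changing len_train only changes how often per-epoch events such as validation or checkpointing happen, not which data the model can see.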

ylyyim commented 1 month ago

I used this method on two datasets, but the network does not produce accurate results: the accuracy is below 10%. Can you speculate on what the reasons might be? Are there any specific aspects to pay attention to during model training? Additionally, are the results of this method closely tied to the dataset? I also noticed that the focal length and principal point are normalized to [-1, 1]. For the focal length, should it be divided by half of the image size? Thank you for your response.

jytime commented 1 month ago

Hi,

  1. Can you first check whether you can reproduce our result on CO3D with our pretrained model? This will confirm that your environment is set up correctly.
  2. When you refer to accuracy, do you mean mAUC or a different metric?
  3. It is hard to guess the reason without looking at your data. What kind of images are they?
  4. The model should generalize well to ordinary images. Please first ensure that the pretrained model works well, and then try your own trained model.
  5. Regarding focal length normalization, you can follow the code here: https://github.com/facebookresearch/PoseDiffusion/blob/36eeb1654ad8e67844672d7a40e7c179e9c58104/pose_diffusion/datasets/re10k.py#L267
  6. If you need high accuracy, I would suggest trying our follow-up work: https://github.com/facebookresearch/vggsfm
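On the focal length question, the sketch below shows one common way to normalize pixel intrinsics, assuming the PyTorch3D NDC convention: divide by half of the shorter image side and flip the sign of the principal-point offset. Whether the linked re10k.py line does exactly this is an assumption, so check it against that code. The function name `normalize_intrinsics_ndc` is hypothetical.

```python
def normalize_intrinsics_ndc(fx, fy, px, py, width, height):
    """Convert pixel-space intrinsics to NDC units.

    Assumes the PyTorch3D NDC convention, in which the shorter image side
    spans [-1, 1]; this is an illustrative sketch, not the repository's code.
    """
    half_scale = min(width, height) / 2.0

    # The focal length is divided by half of the shorter image side.
    fx_ndc = fx / half_scale
    fy_ndc = fy / half_scale

    # The principal point is measured from the image centre; the sign flip
    # follows PyTorch3D's axis convention (+X left, +Y up).
    px_ndc = -(px - width / 2.0) / half_scale
    py_ndc = -(py - height / 2.0) / half_scale
    return fx_ndc, fy_ndc, px_ndc, py_ndc


# Example: a 640x480 image with a 480-pixel focal length.
fx_ndc, fy_ndc, px_ndc, py_ndc = normalize_intrinsics_ndc(
    fx=480.0, fy=480.0, px=320.0, py=240.0, width=640, height=480
)
```

Note that under this convention a principal point inside the image maps into [-1, 1] along the shorter side, while the normalized focal length is not bounded by 1 (here it is 2.0).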