YvanYin / Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
https://jugghm.github.io/Metric3Dv2/
BSD 2-Clause "Simplified" License

Questions on fine-tuning freeze and config parameters #173

Open tazalapizza opened 1 week ago

tazalapizza commented 1 week ago

Hello, thanks for sharing this amazing work with all the code and weights! I created my own RGB-D dataset with custom dataloaders following #105 and ran a fine-tuning.

My fine-tuning results are not very good with the base KITTI parameters, so I would like to understand them better:

1) I added an encoder freeze. Is that a good idea?

2) How should I choose the crop_size values in data_basic? I set them to my image size.

3) The optimizer and lr hyperparameters in the ..kitti.py configs are set for fine-tuning, right?

4) Should we change the Normalize values in the pipeline for custom data?

5) Is there a special way to handle the sky region in the GT depth? I set it to 0 to ignore it.

Sorry for the many questions; I hope you can help. Thank you!

JUGGHM commented 1 week ago

Thanks for your questions. I hope these answers help:

  1. Freezing the encoder will cost some performance: the gain from fine-tuning will be smaller.

  2. I think we have provided a JSON file for training on KITTI, so you do not need to choose or change this value.

  3. No, this file is not correctly configured. I think the settings are correctly overridden here.

  4. Probably not.

  5. If you have a well-segmented sky map, you can set the sky ground truth to 200. Ignoring those pixels is another option, and the confidence map will help filter them out.
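The two sky-handling options in point 5 could be sketched roughly as follows. This is a minimal NumPy illustration, not code from the Metric3D repository: the helper names (`fill_sky`, `ignore_sky`, `masked_l1`) are hypothetical, and treating a depth of 0 as "invalid" is an assumption taken from the question rather than a confirmed convention of the training code. Only the value 200 comes from the reply.

```python
import numpy as np

SKY_DEPTH = 200.0  # value suggested in the reply; everything else here is illustrative


def fill_sky(depth, sky_mask, sky_value=SKY_DEPTH):
    """Option A: assign a large constant depth to sky pixels."""
    out = depth.astype(np.float32, copy=True)
    out[sky_mask] = sky_value
    return out


def ignore_sky(depth, sky_mask):
    """Option B: zero out sky pixels and return a validity mask.

    Assumption: depth <= 0 marks pixels without usable ground truth.
    """
    out = depth.astype(np.float32, copy=True)
    out[sky_mask] = 0.0
    valid = out > 0
    return out, valid


def masked_l1(pred, gt, valid):
    """Mean absolute error over valid pixels only (hypothetical loss)."""
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - gt[valid]).mean())


# Toy 2x2 example: one sky pixel and one pixel that already lacks depth.
depth = np.array([[5.0, 10.0], [0.0, 20.0]], dtype=np.float32)
sky = np.array([[False, True], [False, False]])

filled = fill_sky(depth, sky)
ignored, valid = ignore_sky(depth, sky)
pred = np.full_like(depth, 6.0)
loss = masked_l1(pred, ignored, valid)
```

With option B, the sky pixel and the already-missing pixel both drop out of the loss, so only the two pixels with real depth contribute.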

In general, I think you should use this script. There is no need to change any other settings.
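For reference, the encoder freeze discussed in point 1 is usually done in PyTorch along these lines. This is only a sketch under stated assumptions: `TinyDepthNet` is a hypothetical stand-in and not the Metric3D architecture, the attribute names `encoder` and `decoder` are made up, and the learning rate is an arbitrary placeholder rather than a recommended value.

```python
import torch
from torch import nn


class TinyDepthNet(nn.Module):
    """Hypothetical toy model; NOT the Metric3D network."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(8, 1, 3, padding=1)

    def forward(self, x):
        return self.decoder(self.encoder(x))


def freeze_module(module):
    """Disable gradients for all parameters and switch the module to eval mode."""
    for param in module.parameters():
        param.requires_grad_(False)
    module.eval()


model = TinyDepthNet()
freeze_module(model.encoder)

# Hand only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-5)  # placeholder learning rate

# One dummy step to show that gradients reach only the decoder.
out = model(torch.randn(1, 3, 16, 16))
out.mean().backward()
optimizer.step()
```

Note that a later call to `model.train()` puts the frozen encoder back into training mode (which matters for layers such as dropout or batch norm), so `model.encoder.eval()` may need to be called again afterwards.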