Open sayakpaul opened 2 years ago
@NielsRogge a couple of things worth discussing as we work this through:
LMKWYT.
DPT currently does not implement the loss required for training on depth estimation. This is fine since GLPN has it and we can write the tutorial with GLPN.
We can add it based on this repo: https://github.com/isl-org/MiDaS. The DPT repo only included inference code.
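To make the discussion concrete, here is a rough sketch of one commonly used depth loss, the scale-invariant log (SiLog) loss. This is only an illustration under my own assumptions: the function name `silog_loss`, the `lambd` default, and the "depth > 0 means valid" convention are hypothetical choices, and it is not necessarily the loss the MiDaS repo uses. It is written with NumPy to stay self-contained; a real implementation would operate on framework tensors.

```python
import numpy as np


def silog_loss(pred, target, lambd=0.5, eps=1e-6):
    """Scale-invariant log loss over valid pixels (hypothetical helper).

    Assumes pixels with target <= 0 carry no ground truth and are ignored.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    valid = target > 0
    if not valid.any():
        return 0.0

    # Per-pixel log difference; clip predictions to avoid log(0).
    diff = np.log(target[valid]) - np.log(np.clip(pred[valid], eps, None))

    # lambd = 1 removes the effect of a global scale factor entirely;
    # lambd = 0 reduces to a plain RMSE in log space.
    inner = np.mean(diff ** 2) - lambd * np.mean(diff) ** 2
    return float(np.sqrt(max(inner, 0.0)))


if __name__ == "__main__":
    gt = np.array([[1.0, 2.0], [4.0, 0.0]])  # 0.0 marks a missing pixel
    print(silog_loss(gt, gt))                # identical maps
    print(silog_loss(gt / 2.0, gt, lambd=1.0))  # globally rescaled prediction
```

With `lambd=1.0` a prediction that is off by a constant factor is not penalized, which is why this family of losses is popular when the absolute scale of the depth maps is unreliable.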
However, the image preprocessor of GLPN doesn't consider preprocessing of the depth maps. This is a bit different from the other image preprocessors used for dense prediction tasks such as semantic segmentation (an example). I am not saying that we must also include the depth maps in the preprocessors but wanted to know if that is what we want.
I guess we can add basic preprocessing of depth maps to it, although adding keyword arguments will be a breaking change. I wonder which data augmentation libraries can be used here in addition to the processor.
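As a sketch of what handling depth maps alongside images involves, whether in the processor or in an augmentation library: any geometric transform has to be applied identically to the image and its depth map, otherwise the pixels no longer line up. The helper below is hypothetical (the name `paired_flip_and_crop` and its arguments are made up for illustration and are not part of any existing image processor), and it assumes an `(H, W, C)` image with an `(H, W)` depth map.

```python
import numpy as np


def paired_flip_and_crop(image, depth, crop_hw, rng):
    """Apply the same random horizontal flip and crop to an image and its depth map.

    Hypothetical helper: image is (H, W, C), depth is (H, W).
    """
    h, w = depth.shape
    crop_h, crop_w = crop_hw
    if image.shape[:2] != (h, w):
        raise ValueError("image and depth map must share height and width")
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds input size")

    # One coin toss decides the flip for both arrays.
    if rng.random() < 0.5:
        image = image[:, ::-1]
        depth = depth[:, ::-1]

    # One crop window is sampled and reused for both arrays.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    rows = slice(top, top + crop_h)
    cols = slice(left, left + crop_w)
    return image[rows, cols], depth[rows, cols]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = np.arange(48, dtype=np.float32).reshape(6, 8)
    image = np.stack([depth] * 3, axis=-1)
    img_c, dep_c = paired_flip_and_crop(image, depth, (4, 5), rng)
    print(img_c.shape, dep_c.shape)
```

Photometric changes (brightness, color jitter) would only touch the image, which is one reason it may be cleaner to keep augmentation outside the processor.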
Sounds good, @NielsRogge. I can work on the first issue for now (adding the loss to DPT) after I have a basic fine-tuning pipeline ready. LMKWYT.
For augmentation, I guess imgaug is a good choice: https://imgaug.readthedocs.io/en/latest/source/examples_heatmaps.html
Cc: @NielsRogge @osanseviero @nateraw