wl-zhao / VPD

[ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model for downstream visual perception tasks.
https://vpd.ivg-research.xyz
MIT License

Can you release the training logs on Depth estimation on NYUv2? #49

Open develop-productivity opened 11 months ago

develop-productivity commented 11 months ago

Hi, thank you for providing this exploratory work on applying diffusion models to visual perception tasks.

When I train on the NYUv2 dataset, I find that the model converges slowly, and I wonder if something is wrong with my training process.

Could you therefore release the training log for depth estimation? It would greatly help me reproduce the results. Thank you.

wl-zhao commented 11 months ago

Hi, thanks for your interest in our work.

Here is the log on NYUv2: vpd-logs-depth.txt

develop-productivity commented 11 months ago

[screenshot of the code in question]

Hi, I would like to know whether something is wrong here. The test image is an uncropped 640×480 image, and running this code throws an error.
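The error itself is not shown, but a common cause with diffusion-backbone models is that the input height and width are not divisible by the encoder's downsampling factor, so the uncropped 640×480 NYUv2 frame fails where the cropped training resolution worked. A minimal workaround sketch, assuming a divisibility requirement of 64 (the factor and the `pad_to_multiple` helper are assumptions, not part of the VPD codebase):

```python
import numpy as np

def pad_to_multiple(img: np.ndarray, multiple: int = 64):
    """Pad an H x W x C image so H and W are divisible by `multiple`.

    Returns the padded image plus the original (H, W), so the model
    output (e.g. a predicted depth map) can be cropped back afterwards.
    """
    h, w = img.shape[:2]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    # Reflect-pad on the bottom/right edges only, keeping the origin fixed.
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="reflect")
    return padded, (h, w)

# A 480 x 640 RGB frame: 640 is already a multiple of 64, 480 is not.
img = np.zeros((480, 640, 3), dtype=np.float32)
padded, orig_hw = pad_to_multiple(img)
print(padded.shape)  # (512, 640, 3)
```

After inference, the prediction can be sliced back to `orig_hw` so the evaluated depth map matches the original resolution.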

develop-productivity commented 11 months ago

> Hi, thanks for your interest in our work.
>
> Here is the log on NYUv2: vpd-logs-depth.txt

Thank you so much.