[ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model to downstream visual perception tasks.
Dear author!
We are interested in your high-quality and excellent work.
We want to explore the ability of depth estimation when the model does not introduce text information. As shown in the image.
However, we forced self.conditioning key == None, and an error occurred during this process.
Does the author have any good solution?
Thank you very much, I look forward to your reply!
@wl-zhao
Dear author! We are interested in your high-quality and excellent work. We want to explore the ability of depth estimation when the model does not introduce text information. As shown in the image.
However, we forced self.conditioning key == None, and an error occurred during this process. Does the author have any good solution? Thank you very much, I look forward to your reply!