Wow, thanks for putting this together. We'll look into it. In general, we agree that adding a generative diffusion model could make for a great addition.
I think Stable Diffusion would be a great teacher for AM-RADIO as well because of the following papers:
BTW: SD-based methods achieve SOTA on semantic correspondence.
There are many other models to consider (like MiDaS for depth estimation), but I think SD brings something new because of its U-Net backbone and its generative loss. Screenshot from the Probe3D paper:
Update: Code for obtaining features from SD 2.1
Links from the DIFT repo: