Open MaureenZOU opened 5 months ago
IIUC: this recent paper from @yossigandelsman and others claims that the weight space of diffusion models also has interpretable latent spaces and so perhaps these could also be tested for alignment as you suggest.
As for the question, I am very curious about the vision encoder with diffusion models, and how does this align with the semantic world.