bytedance / DEADiff

[CVPR 2024] Official implementation of "DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations"
Apache License 2.0

Can I get the intermediate representation of the model? #8

Open XiaojieGan opened 2 months ago

XiaojieGan commented 2 months ago

Hello! I want to decouple style information from image features. Using the inference code you released, can I obtain just the disentangled output of the Q-Former, rather than the final image generated after the text is added?

Tianhao-Qi commented 2 months ago

Yes, you can refer to the function get_learned_conditioning in ldm/models/diffusion/blip_diffusion.py.
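
For anyone else reading, here is a minimal, untested sketch of calling get_learned_conditioning to pull out the conditioning tokens instead of sampling an image. The config/checkpoint paths, the image resolution, and the keys of the cond dict are assumptions, not the repo's actual API; copy the conditioning construction from the released inference script and check ldm/models/diffusion/blip_diffusion.py for the real input format.

```python
# Hedged sketch: extract the Q-Former conditioning instead of sampling.
# Assumes the standard LDM loading pattern; paths, resolution, and the
# cond dict keys below are placeholders -- mirror the repo's inference script.
import torch
from PIL import Image
from omegaconf import OmegaConf
from torchvision import transforms
from ldm.util import instantiate_from_config

config = OmegaConf.load("configs/deadiff.yaml")        # hypothetical path
model = instantiate_from_config(config.model)
sd = torch.load("deadiff.ckpt", map_location="cpu")    # hypothetical checkpoint
model.load_state_dict(sd.get("state_dict", sd), strict=False)
model.eval()

# Preprocess a style reference image (224x224 is an assumption based on
# typical BLIP-2 visual encoders; use the repo's own preprocessing).
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
ref_image = preprocess(Image.open("style_ref.png").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    # Hypothetical keys -- replace with whatever dict the inference script builds.
    cond = {"ref_image": ref_image, "text": ["a photo"]}
    encoder_hidden_states = model.get_learned_conditioning(cond)

print(encoder_hidden_states.shape)  # conditioning tokens, roughly [B, seq_len, dim]
```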

XiaojieGan commented 2 months ago

OK, thanks for your answer. Does the return value of this function, "encoder_hidden_states", represent the decoupled style features?

XiaojieGan commented 2 months ago

Oh, I think it may be the variable named "style_ctx_embeddings", which is obtained by the function "forward_ctx_embeddings", right?
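
If that guess is right, one way to verify is to call forward_ctx_embeddings directly and compare its output with the style portion of encoder_hidden_states. A sketch, reusing `model` and `ref_image` from the snippet above; the signature here is assumed from BLIP-Diffusion's forward_ctx_embeddings(input_image, text_input) and should be checked against this repo's blip_diffusion.py.

```python
# Hedged sketch, continuing from the loading snippet above.
# Signature assumed from BLIP-Diffusion; verify against blip_diffusion.py.
import torch

with torch.no_grad():
    style_ctx_embeddings = model.forward_ctx_embeddings(
        input_image=ref_image,   # preprocessed style reference image
        text_input=["style"],    # placeholder prompt (assumption)
    )

# If these tokens match the style slice of encoder_hidden_states, then
# style_ctx_embeddings is indeed the disentangled style representation.
print(style_ctx_embeddings.shape)
```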