Closed wj7486 closed 2 months ago
Hey @wj7486, thanks for your interest in our work.
You can refer to https://github.com/Weifeng-Chen/prompt2prompt for the implementation of cross attention visualization. Please give a star to the author in honor of the great work.
Besides, I am not quite sure what you mean by time=0. The y-axis of this figure is the total number of training steps; the model won't produce the given dog if you don't fine-tune it.
I hope I understood your point. If you have further questions, feel free to reach out to us!
Thank you very much for your reply. The number of DDIM sampling steps (ddim_step) is usually 50, and my understanding is that the visualization should be done at the final denoising step, so the cross-attention map is taken at timestep=0, not at timestep=50 or at intermediate steps. Thank you again for your reply. I will try it out.
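For illustration, the selection described above could be sketched roughly as follows. This is only a minimal NumPy sketch under assumptions, not the paper's code and not the code from the prompt2prompt repository: the names `token_attention_map` and `recorded`, the array layout (heads, pixels, tokens), and the random stand-in data are all hypothetical. It assumes the cross-attention maps of one layer were already recorded once per sampling step, in sampling order, so the last entry corresponds to the final denoising step (the timestep closest to 0).

```python
import numpy as np

def token_attention_map(maps_per_step, token_index, step=-1):
    """Return a normalized 2D heat map for one prompt token.

    maps_per_step: list of arrays shaped (heads, H*W, num_tokens),
                   one per sampling step, in sampling order (assumed layout).
    step:          index into that list; -1 selects the final denoising step.
    """
    attn = maps_per_step[step]                     # (heads, pixels, tokens)
    per_pixel = attn.mean(axis=0)[:, token_index]  # average over heads
    side = int(round(np.sqrt(per_pixel.shape[0])))
    heat = per_pixel.reshape(side, side)           # back to a square grid
    return heat / heat.max()                       # scale so the peak is 1

# Random stand-in data: 50 steps, 8 heads, 16x16 latent grid, 77 tokens.
rng = np.random.default_rng(0)
recorded = [rng.random((8, 16 * 16, 77)) for _ in range(50)]
# Normalize over tokens so each row behaves like a softmax output.
recorded = [a / a.sum(axis=-1, keepdims=True) for a in recorded]

heat = token_attention_map(recorded, token_index=2)
print(heat.shape)
```

Passing a different `step` index would pick an intermediate step instead, which makes it easy to compare how the map changes during sampling.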
Glad to hear that. Feel free to reopen this issue if you have further questions about it.
Hello author, can you tell me how the cross-attention map in the paper was drawn? Which layer's cross-attention result is used, and is it taken at time t=0? If you could reply, I would greatly appreciate it!