Open 18445864529 opened 1 year ago
Thanks so much for checking out AttentionViz! We have not tried visualizing text-to-image attention yet, but I think our tool/technique can feasibly be extended to vision-language models, and this is definitely a great direction for future work.
First, thank you for the great work!
I would like to know whether this tool can also visualize text-to-image attention for large vision-language models such as MiniGPT-4, LLaVA, InstructBLIP, etc.
Thanks!