amazon-science / mm-cot

Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
https://arxiv.org/abs/2302.00923
Apache License 2.0
3.77k stars, 309 forks

How are the vision features generated here? How to view detr.npy and clip.npy images #52

Closed: 1-sf closed this issue 1 year ago

1-sf commented 1 year ago

I need help understanding how the vision features are generated for this research. I tried viewing the contents of detr.npy, clip.npy, etc. as images using Image and matplotlib to understand what they are, but couldn't view them meaningfully.

[Screenshot: 2023-04-09, 11:25:41 AM]

I need some help understanding this.

1-sf commented 1 year ago

Closing this; I found that they shared it in #45.
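
For anyone landing here with the same question: a likely reason the files don't display meaningfully is that they hold numeric feature arrays rather than pixel data (an assumption based on the "vision features" wording above, not something confirmed in this thread). A minimal sketch for checking what a .npy file actually contains follows. The helper name `summarize_npy` and the demo array's shape are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
import os
import tempfile

import numpy as np


def summarize_npy(path):
    """Load a .npy file and report basic facts about the stored array."""
    arr = np.load(path)
    return {
        "shape": arr.shape,        # e.g. (items, rows, cols) for a batch of feature matrices
        "dtype": str(arr.dtype),   # float dtypes suggest features, uint8 suggests pixels
        "min": float(arr.min()),
        "max": float(arr.max()),
    }


if __name__ == "__main__":
    # Stand-in data: this shape is made up for the demo and is NOT read
    # from detr.npy or clip.npy.
    demo = np.random.rand(4, 100, 256).astype(np.float32)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_features.npy")
        np.save(path, demo)
        print(summarize_npy(path))
```

To inspect a real file, call `summarize_npy("detr.npy")` with the path to your local copy. If the dtype is floating point and the values fall outside the usual 0 to 255 pixel range, plotting a slice with matplotlib will only give a heatmap of feature values, not a recognizable picture.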