IrohXu / VCog-Bench

What is the Visual Cognition Gap between Humans and Multimodal LLMs?
MIT License
4 stars, 0 forks