dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Apache License 2.0
3.21k stars, 278 forks

article error #136

Open TuuSiwei opened 1 month ago

TuuSiwei commented 1 month ago

[image]

I debugged the code, and I think the expression in the picture should be N′ × N′ = H/4 × W/4 = N × M². Is that right? @yanwei-li
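To make the proposed identity concrete, here is a quick numeric check. It is only a sketch: the values H = W = 768, N = 576, and M = 8 are illustrative assumptions chosen so the numbers work out, not values taken from the repository or the paper, and `check_identity` is a hypothetical helper name.

```python
import math


def check_identity(H, W, N, M):
    """Compare H/4 x W/4 with N x M^2 and report the square side length N'."""
    # Number of positions in a feature map downsampled by 4 in each dimension.
    hr_positions = (H // 4) * (W // 4)
    # N low-resolution tokens, each assumed to cover an M x M block of positions.
    lr_times_block = N * M ** 2
    # If N' x N' = hr_positions, then N' is the integer square root.
    side = math.isqrt(hr_positions)
    return hr_positions, lr_times_block, side


# Illustrative (assumed) values, not taken from the repository.
hr, lr, side = check_identity(H=768, W=768, N=576, M=8)
print(hr, lr, side)  # 36864 36864 192
```

With these assumed values, H/4 × W/4 and N × M² both equal 36864, and N′ = 192 is the side length, so N′ × N′ (rather than N′ alone) matches the product.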