fukun07 / neural-image-captioning

Using scene-specific contexts and region-based attention in neural image captioning
MIT License

ValueError: dimension mismatch in args to gemm (1,1064)x(1104,2048)->(1,2048) Apply node that caused the error: GpuDot22(GpuJoin.0, ss@lstm#w) #1

Closed mhashas closed 7 years ago

mhashas commented 7 years ago

I am getting this error when trying to run infer.py. Do you have any idea what the issue is and how I could fix it?

[Image: error-neural]

fukun07 commented 7 years ago

Are you running "infer.py" with the pre-trained captioning model? This error is caused by a shape mismatch: flickr8k uses a 40-dim scene vector, while the provided model was trained on mscoco, which uses an 80-dim scene vector. You can see in the error message that 1064 (512 lstm cell + 512 word + 40 flickr8k scene vector) and 1104 (512 lstm cell + 512 word + 80 mscoco scene vector) do not match.

To fix, you may change Line 40 from 'flickr8k' to 'mscoco'.
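
The arithmetic behind the two numbers can be checked with a small standalone sketch. It only reproduces the dimension bookkeeping described above and is not code from the repository. The names `SCENE_DIMS`, `lstm_input_dim` and `matmul_shape` are made up for illustration, and the 2048 output width is read off the error message.

```python
# Dimension bookkeeping only; hypothetical names, not the repository's code.
LSTM_CELL_DIM = 512
WORD_DIM = 512
SCENE_DIMS = {"flickr8k": 40, "mscoco": 80}  # scene vector sizes from the reply above
WEIGHT_OUT_DIM = 2048  # second dimension of the weight in the error message


def lstm_input_dim(dataset):
    """Width of the joined LSTM input: cell state + word embedding + scene vector."""
    return LSTM_CELL_DIM + WORD_DIM + SCENE_DIMS[dataset]


def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise on mismatch."""
    if a_shape[1] != b_shape[0]:
        raise ValueError(
            "dimension mismatch: (%d,%d)x(%d,%d)" % (a_shape + b_shape)
        )
    return (a_shape[0], b_shape[1])


# Weight shape of a model trained with the mscoco scene vector.
mscoco_weight = (lstm_input_dim("mscoco"), WEIGHT_OUT_DIM)

# Input built with the flickr8k setting: inner dimensions differ (1064 vs 1104).
try:
    matmul_shape((1, lstm_input_dim("flickr8k")), mscoco_weight)
except ValueError as err:
    print(err)

# Input built with the mscoco setting: inner dimensions agree.
print(matmul_shape((1, lstm_input_dim("mscoco")), mscoco_weight))
```

With the dataset setting switched to 'mscoco', the joined input is 1104 wide and matches the first dimension of the pre-trained weight.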

mhashas commented 7 years ago

Yes, you are correct. Thank you :)