karpathy / neuraltalk

NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

Caffe python wrapper #9

Closed ahmedosman closed 9 years ago

ahmedosman commented 9 years ago

Hi Andrej,

I wrote a function to generate image features using the Python Caffe wrapper. The only source of discrepancy I came across between your Matlab code and the Python code has to do with image resizing. Matlab's imresize does cubic interpolation by default and also applies antialiasing by default, while caffe.io.resize now does linear interpolation. When I dumped the Matlab-preprocessed images to disk and loaded them into Python, I got exactly the same Caffe predictions.

Currently the code contains hardcoded parameters (image mean, cropping dimensions) that you used in the Matlab code.

Thanks for sharing your code.

Ahmed

karpathy commented 9 years ago

Hi, thanks a lot for writing this up. I came to the same conclusion when I tried to move the code from Matlab to Python, and I never resolved it :(

As for the pull request, I'd be eager to merge, but could we please make your version an independent script without changing the predict on images file? In other words, there could be a predict_on_images_caffe.py, or something like that.

I'm worried that the Caffe dependency is a little too heavy. Or maybe there's some other more elegant way to isolate the Caffe part to prevent too much code duplication? E.g. this could be just a file that compartmentalizes all the Caffe functionality, and the predict on images script could have an option to use Caffe instead of .mat files?

Hmm

ahmedosman commented 9 years ago

Sure, I'll create a new script that can be run independently to generate image features and dump a .mat file to disk, and the predict on images script could optionally call the Caffe method if the .mat file is not on disk. Sound good? Thanks, Ahmed

karpathy commented 9 years ago

Yep, that sounds good. However, when using Python features the results will be slightly different because of the image resizing, correct?

Also, caffe should not be a dependency of NeuralTalk, so it would be best if the import python_caffe_helper was wrapped inside the if statement, so the import only happens on demand if the .mat file doesn't exist.
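
A minimal sketch of the on-demand import described above, not the code from the pull request. The module name python_caffe_helper comes from this thread; the function names get_image_features and extract_features, and the load_mat argument, are hypothetical and only there for illustration.

```python
import os


def get_image_features(mat_path, image_paths, load_mat):
    """Return cached features if mat_path exists, else compute them with Caffe.

    load_mat is a caller-supplied function (for example a thin wrapper
    around scipy.io.loadmat), so this sketch itself has no heavy imports.
    """
    if os.path.isfile(mat_path):
        # Fast path: features were already dumped to disk, Caffe is never touched.
        return load_mat(mat_path)

    # Slow path: import on demand, so Caffe is only required when it is needed.
    import python_caffe_helper  # module name taken from the thread

    # extract_features is a hypothetical function name, not a real API.
    return python_caffe_helper.extract_features(image_paths)
```

With this layout, anyone who already has the .mat file never triggers the import, so a missing Caffe installation only fails in the branch that actually needs it.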

ahmedosman commented 9 years ago

I'll write our own image resizing in Python, similar to what Matlab's imresize does, so the results should be identical in the end. I'll then submit a new pull request that reflects the changes above.
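
To make the idea concrete, here is a rough one-dimensional sketch of cubic interpolation with antialiasing, the two imresize defaults mentioned earlier in the thread. It is not the code from the pull request and is not claimed to reproduce Matlab's output exactly: the kernel coefficient (a = -0.5), the edge clamping, and the names cubic and resize_1d are all assumptions made for illustration. A 2-D image could be handled by applying the same pass along rows and then along columns.

```python
import math


def cubic(x):
    """Cubic convolution kernel with a = -0.5 (an assumed choice)."""
    x = abs(x)
    if x <= 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0


def resize_1d(signal, out_len):
    """Resample a 1-D sequence to out_len samples with an antialiased cubic kernel."""
    in_len = len(signal)
    scale = out_len / in_len
    # When shrinking, stretch the kernel by 1/scale so it also low-pass
    # filters the input (this is the antialiasing step).
    kscale = min(scale, 1.0)
    support = 4.0 / kscale  # kernel width measured in input samples
    out = []
    for i in range(out_len):
        # Centre of output sample i in input coordinates (samples at integers).
        center = (i + 0.5) / scale - 0.5
        left = math.floor(center - support / 2.0)
        total = 0.0
        weight_sum = 0.0
        for j in range(left, left + math.ceil(support) + 2):
            w = cubic((center - j) * kscale)
            k = min(max(j, 0), in_len - 1)  # clamp at the borders (assumption)
            total += w * signal[k]
            weight_sum += w
        # Normalise so the weights sum to one and flat regions stay flat.
        out.append(total / weight_sum)
    return out
```

For example, resize_1d(row, len(row) // 2) halves a row of pixel values while averaging over a wider neighbourhood than plain linear interpolation would.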

karpathy commented 9 years ago

If that worked, it would be a super-valuable contribution, not just to this project but also to Caffe. Thanks!