karpathy / neuraltalk

NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

Python Caffe Features using Matlab like imresize #11

Closed ahmedosman closed 9 years ago

ahmedosman commented 9 years ago

Hi Andrej,

I am done with the Matlab-like imresize implementation (imresize function below). The output of Matlab's prepare_image_batch matches the output of my Python preprocess_image to the fourth decimal place (because Python can store decimals to a larger precision than Matlab). Attached is a histogram of the error between the image output by Matlab's prepare_image_batch and the image output by Python's preprocess_image.

matlab_python_imresize

Moreover, I compared the final predictions from the new Python script py_caffe_feat_extract.py, once with the new Python imresize and once with Caffe's imresize, against Matlab's predictions. Attached is a side-by-side histogram of the error: the maximum discrepancy with the new Python imresize is 0.3, compared to +/-1.5 for Caffe's image resize. Again, I think that if I limited the Python precision to 4 decimal places from the start, the residual error of 0.3 would go down to 0.

prediction_discrepency

Note: all these results are based on the Caltech 101 dataset.
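For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the error histogram and the maximum discrepancy could be computed with NumPy. The function name `discrepancy_report`, the "agree to 4 decimals" threshold, and the synthetic arrays are assumptions for illustration only; they are not taken from py_caffe_feat_extract.py or from the Matlab code.

```python
import numpy as np

def discrepancy_report(reference, candidate, bins=20):
    """Summarise the element-wise error between two same-shaped arrays.

    `reference` would be e.g. the Matlab output and `candidate` the Python
    output; both names are hypothetical.
    """
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError("arrays must have the same shape")

    error = candidate - reference
    counts, edges = np.histogram(error, bins=bins)
    return {
        # Largest absolute difference over all elements.
        "max_abs_error": float(np.max(np.abs(error))),
        # One possible definition of "matches to the 4th decimal place":
        # every difference is below half a unit in the 4th decimal.
        "agree_to_4_decimals": bool(np.all(np.abs(error) < 0.5e-4)),
        # Histogram data that could be plotted as a bar chart.
        "hist_counts": counts,
        "hist_edges": edges,
    }

# Synthetic stand-ins for the two outputs (not real Caltech 101 data).
rng = np.random.default_rng(0)
matlab_like = rng.normal(size=(4, 8))
python_like = matlab_like + rng.uniform(-1e-5, 1e-5, size=matlab_like.shape)

report = discrepancy_report(matlab_like, python_like)
print(report["max_abs_error"], report["agree_to_4_decimals"])
```

The same function could be applied to the final predictions instead of the preprocessed images, which would give the numbers behind a side-by-side histogram like the one described above.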

If you are OK with that script, I'll submit another pull request making the changes we agreed on earlier to the py_predict_images.py script.

karpathy commented 9 years ago

This is awesome, thanks a lot for putting this together! You may want to see if the people at Caffe might be interested in this snippet as well.

I didn't look in depth, but is it non-trivial to use 4-decimal-place precision to get the Python and Matlab outputs to match exactly?

ahmedosman commented 9 years ago

Turns out NumPy has an np.round function for rounding to the nth decimal place. I'll make minor tweaks to the code to match Matlab's precision, which will bring the residual error down to 0. I'll do that and send another pull request.
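As a rough illustration of the rounding step mentioned above, the sketch below uses `np.round` with its `decimals` argument. The sample values are made up; the last two lines are an added caveat showing that two nearly equal numbers can still round to different 4-decimal values when they straddle a rounding boundary, so rounding alone may not remove every residual difference.

```python
import numpy as np

# Made-up values standing in for preprocessed pixel data.
values = np.array([0.123449, 2.718281828, -1.000051])

# Round every element to 4 decimal places.
rounded = np.round(values, decimals=4)
print(rounded)

# Caveat: inputs that differ by only 2e-6 can land on opposite sides
# of a rounding boundary and end up 1e-4 apart after rounding.
low = np.round(0.123449, 4)
high = np.round(0.123451, 4)
print(low, high)
```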