Lasagne / Recipes

Lasagne recipes: examples, IPython notebooks, ...
MIT License

ResNet-50 from Caffe to Lasagne #73

Closed (mephistopheies closed this issue 8 years ago)

mephistopheies commented 8 years ago

Hi, I made a script (https://github.com/mephistopheies/resnet50caffe2lasagne/blob/master/resnet-50.ipynb) for transferring the weights of the Caffe ImageNet-pretrained ResNet-50 to Lasagne, similar to the existing one for VGG (https://github.com/Lasagne/Recipes/blob/master/examples/Using%20a%20Caffe%20Pretrained%20Network%20-%20CIFAR10.ipynb).

Would it be useful to have such a recipe script in the "modelzoo" folder and/or the "examples" folder?

ebenolson commented 8 years ago

Absolutely! Your ipython notebook looks like a great addition to the examples, and if you also want to extract just the model building function into a python script that would be good for modelzoo.

f0k commented 8 years ago

+1 for porting the pretrained weights and comparing results of the implementations!

A minor comment on the batch normalization: When you call inv_std.set_value, you should add epsilon inside of the square root. Often this is 1e-4, but I don't know what was used in the original model. This could make a small difference for the predictions; check whether they get closer with this modification.
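A minimal sketch of the epsilon placement described above. It assumes the Caffe model supplies a per-channel running variance and that the Lasagne layer expects the inverse standard deviation. The function name, the sample variances, and the 1e-4 default are illustrative and not taken from the notebook. Plain Python lists keep the sketch dependency-free; with NumPy the same expression would be applied to the whole array at once.

```python
import math


def inv_std_from_variance(variances, eps=1e-4):
    """Return 1 / sqrt(var + eps) for each per-channel variance.

    eps is added inside the square root, so a zero variance yields a
    finite value instead of a division by zero.
    """
    return [1.0 / math.sqrt(v + eps) for v in variances]


# Hypothetical running variances for three channels.
variances = [0.0, 0.25, 4.0]
inv_std_values = inv_std_from_variance(variances)

# In the notebook's setting, the result would then be passed to
# inv_std.set_value(...) after converting it to the array type the
# layer expects (that conversion is not shown here).
```

The effect of epsilon is largest for channels whose variance is close to zero, which may be why the difference in predictions is small for a well-trained model.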

And a larger comment on the code: Your setup for recreating the layers seems a bit complicated. Can't you base your code on https://github.com/Lasagne/Recipes/blob/master/papers/deep_residual_learning/Deep_Residual_Learning_CIFAR-10.py#L93? If the architecture hasn't drastically changed for the caffe model, there's no need to specify each complex block on its own; the architecture follows some simple rules that can be cast as helper functions!

mephistopheies commented 8 years ago

I made changes. @f0k, you were right about the overcomplicated code: I worked out the pattern used to build the deep residual network, so the conf dictionaries are now removed. I now have one function for "conv -> bn -> (relu)" and one for creating a residual block. As for 1e-4, I tried it before, and the difference is not significant.

Here are the new files:

I also added a "files" folder to "examples".

If you have any comments, please write and I will fix them.
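The two-helper pattern mentioned above can be sketched without Lasagne as plain layer descriptions. This is not the code from the notebook: the function names, the dictionary fields, and the block naming are hypothetical, and the stage layout (3, 4, 6, 3 bottleneck blocks with 64, 128, 256, 512 base filters, stride 2 on the first 1x1 convolution of later stages) is an assumption based on the published ResNet-50 architecture.

```python
def conv_bn(name, num_filters, filter_size, stride=1, relu=True):
    """Describe one 'conv -> bn -> (relu)' unit as a list of layer specs."""
    layers = [
        {"name": name + "_conv", "type": "conv", "num_filters": num_filters,
         "filter_size": filter_size, "stride": stride},
        {"name": name + "_bn", "type": "batch_norm"},
    ]
    if relu:
        layers.append({"name": name + "_relu", "type": "relu"})
    return layers


def bottleneck_block(name, num_filters, stride=1, project=False):
    """Describe one bottleneck residual block: 1x1 -> 3x3 -> 1x1 plus shortcut.

    An empty shortcut list stands for the identity connection.
    """
    main = (conv_bn(name + "_a", num_filters, 1, stride=stride)
            + conv_bn(name + "_b", num_filters, 3)
            + conv_bn(name + "_c", 4 * num_filters, 1, relu=False))
    shortcut = []
    if project:
        # Projection shortcut: matches the channel count and stride of main.
        shortcut = conv_bn(name + "_proj", 4 * num_filters, 1,
                           stride=stride, relu=False)
    return {"name": name, "main": main, "shortcut": shortcut}


def resnet50_blocks():
    """List the residual blocks of a ResNet-50-style network (assumed layout)."""
    stages = [(64, 3), (128, 4), (256, 6), (512, 3)]
    blocks = []
    for stage, (num_filters, repeats) in enumerate(stages):
        for i in range(repeats):
            first = (i == 0)
            # Only the first block of stages after the first one downsamples.
            stride = 2 if first and stage > 0 else 1
            name = "stage%d_block%d" % (stage + 2, i + 1)
            blocks.append(bottleneck_block(name, num_filters,
                                           stride=stride, project=first))
    return blocks


blocks = resnet50_blocks()
```

Each entry in `blocks` could then be turned into real layers by a framework-specific builder, so the whole architecture is driven by the short `stages` list rather than by one hand-written configuration per block.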

erfannoury commented 8 years ago

Are you also planning on providing the extracted weight file? (like other models in modelzoo)

mephistopheies commented 8 years ago

@erfannoury, do you have a shared place to store binary files (I noticed that all models are stored somewhere on Amazon), or can I upload it anywhere?

f0k commented 8 years ago

Cool, nice work!

> Now I have one function for "conv -> bn -> (relu)" and one for creating a residual block.

Looks correct, skimming over http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006.

> If you have any comments, please write and I will fix them.

Instead of lasagne.layers.get_output_shape(incoming_layer), you can just do incoming_layer.output_shape. But I don't know if it's worth fixing, it does the same. In the notebook, you sometimes write residial or resudual.

Instead of a files subdirectory, could you create a resnet50 subdirectory in examples and just move all your files (including the notebook) there? I think this is a bit tidier. examples/files/images/resnet50 is a hierarchy we probably don't want to start :) You can also move the notebook to examples/resnet50 and the images to examples/resnet50/images if you want to reduce clutter -- just make sure to keep everything in examples/resnet50.

> do you have a shared place to store binary files (I noticed that all models are stored somewhere on Amazon), or can I upload it anywhere?

If you upload it somewhere I can download it, I (or @ebenolson) can upload it to our amazon S3 store so you can add a link to it.

mephistopheies commented 8 years ago

Done: https://github.com/mephistopheies/Recipes/tree/resnet50

Here is the binary with the model: https://drive.google.com/file/d/0B4bl7YMqDnViZF90WGd0dG5ZTlE/view?usp=sharing

f0k commented 8 years ago

All right, it's uploaded at: https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/resnet50.pkl

Just add a link to it at the top of the modelzoo file (just like the other modelzoo files have it), and also include a link to the license if applicable (check Kaiming's repository again). Then send us a pull request. Thank you!

PS: When pickling, you should always add protocol=-1 to the dump() call. For your file, this reduced the size from 281 MB to 99 MB and the loading (unpickling) time from 10 seconds to 40 milliseconds!
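A small sketch of the protocol=-1 advice, using only the standard library. The list of floats is a stand-in for model weights, and `pickled_size` is a hypothetical helper, not something from the repository. Protocol 0 is the text format that Python 2's dump() used by default; on Python 3 the default is already a binary protocol, so the gain from passing protocol=-1 there is likely to be smaller than the numbers quoted above.

```python
import io
import pickle
import random


def pickled_size(obj, protocol):
    """Return the number of bytes pickle.dump() writes for obj."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol=protocol)
    return buf.tell()


# Stand-in for model parameters (the real file holds arrays of weights).
random.seed(0)
weights = [random.random() for _ in range(10000)]

size_text = pickled_size(weights, protocol=0)     # text protocol
size_binary = pickled_size(weights, protocol=-1)  # highest available protocol

# The binary pickle restores exactly the same values.
restored = pickle.loads(pickle.dumps(weights, protocol=-1))
```

Passing -1 is equivalent to `pickle.HIGHEST_PROTOCOL`. A file written this way can only be read by Python versions that support that protocol, which is worth keeping in mind when sharing pretrained weights.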

mephistopheies commented 8 years ago

Here is the pull request: https://github.com/Lasagne/Recipes/pull/76

And a new, smaller serialization of the model: https://drive.google.com/file/d/0B4bl7YMqDnViRDNIbmtWaWR6ems/view?usp=sharing

f0k commented 8 years ago

> And a new, smaller serialization of the model

Is this any different from the one I uploaded? I had already re-pickled your file with protocol=-1.

mephistopheies commented 8 years ago

No, it is the same.

nshreyasvi commented 5 years ago

Hello, where can I download the resnet50.pkl pre-trained model? None of the links seem to be working.