facebookarchive / caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
https://caffe2.ai
Apache License 2.0

No finetuning examples #1068

Open dyigitpolat opened 7 years ago

dyigitpolat commented 7 years ago

I have the ILSVRC2012 dataset and a pretrained VGG16 network (init_net and predict_net files generated by the model translator), and I want to train/fine-tune it further with Caffe2. Why is this so hard, and why are there no examples? I also have my previously working train_val.prototxt from Caffe.

dyigitpolat commented 7 years ago

Issue #18 is closed, but there are still no examples for this.

leovandriel commented 7 years ago

Hi, I agree about the lack of Python tutorials and examples. In the meantime, if you're okay with working in C++, take a look at this retrain example.

iteal commented 7 years ago

Hi! Does anyone have a Python example for fine-tuning?

I save a deploy model this way:

deploy_model = model_helper.ModelHelper(init_params=False)
# ... create deploy_model using brew ...
init_net, predict_net = mobile_exporter.Export(workspace, deploy_model.net, deploy_model.params)

Then I load the saved model from file using the functions found in issue #642 (https://github.com/caffe2/caffe2/issues/642).

EDIT: I found out that the weights no longer change, even after I add training operators. Is there an example of how to add training operators to a model that has been loaded from init_net and predict_net files?

EDIT: I finally understood that the Adam optimizer was not being applied to my model because model.params was empty: appending the loaded nets copies the operators but does not register any parameters with the ModelHelper. In the end I added this line:

retrain_model.params.extend([BlobReference(x) for x in predict_net.Proto().external_input if x != 'data'])

So the final code to create a trainable model from init_net and predict_net files is:

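from caffe2.proto import caffe2_pb2
from caffe2.python import core, model_helper, workspace
from caffe2.python.core import BlobReference

# device_options, arg_scope and add_training_operators are assumed to be defined elsewhere.

# Load the serialized init net (protobuf files must be read in binary mode).
init_def = caffe2_pb2.NetDef()
with open(init_net_path, 'rb') as f:
    init_def.ParseFromString(f.read())
    init_def.device_option.CopyFrom(device_options)

# Load the serialized predict net the same way.
net_def = caffe2_pb2.NetDef()
with open(predict_net_path, 'rb') as f:
    net_def.ParseFromString(f.read())
    net_def.device_option.CopyFrom(device_options)

# Wrap both nets in a ModelHelper so that training operators can be added.
retrain_model = model_helper.ModelHelper(arg_scope=arg_scope)
predict_net = core.Net(net_def)
init_net = core.Net(init_def)
retrain_model.param_init_net.AppendNet(init_net)
retrain_model.net.AppendNet(predict_net)

# Register every external input except the data blob as a trainable parameter;
# without this, model.params is empty and the optimizer updates nothing.
retrain_model.params.extend([BlobReference(x) for x in predict_net.Proto().external_input if x != 'data'])

add_training_operators(retrain_model, 'pred', 'label')
workspace.RunNetOnce(retrain_model.param_init_net)
workspace.CreateNet(retrain_model.net, overwrite=True)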

I hope there is an easier way, such as a built-in function?
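The parameter-registration step above filters the predict net's external inputs with a hard-coded `x != 'data'`. A minimal sketch of pulling that filter into a small helper follows. The helper name `trainable_param_names`, its `non_trainable` argument, and the example blob names are all hypothetical and not part of Caffe2; it only works on plain strings, so it can be tried without Caffe2 installed.

```python
def trainable_param_names(external_inputs, non_trainable=("data", "label")):
    """Return external-input blob names that should be registered as params.

    external_inputs: iterable of blob names, for example the contents of
                     predict_net.Proto().external_input
    non_trainable:   names of blobs fed at run time (inputs, labels) that
                     must not be handed to the optimizer (assumed names)
    """
    skip = set(non_trainable)
    seen = set()
    names = []
    for name in external_inputs:
        # Skip run-time inputs and any duplicate entries, keeping order.
        if name in skip or name in seen:
            continue
        seen.add(name)
        names.append(name)
    return names


# Illustrative blob names only; a real translated net will have its own.
example_inputs = ["data", "conv1_1_w", "conv1_1_b", "fc8_w", "fc8_b"]
param_names = trainable_param_names(example_inputs)
print(param_names)
```

With the code from the comment above, the result could then be registered as `retrain_model.params.extend([BlobReference(n) for n in trainable_param_names(predict_net.Proto().external_input)])`. Any other blobs that should stay fixed would need to be added to `non_trainable` by hand.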