Input layer problem; multi-source input case ; Faster R-CNN

Xiangyu-CAS commented 7 years ago

Hi, I am trying to convert a Faster R-CNN model from Caffe (caffe1) format to Caffe2. By running caffe_translator.py, I was able to obtain the Caffe2 model files "init_net.pb" and "predict_net.pb".

However, I am a little confused by the new net file "predict_net.pb": all the layers (such as conv, pool, and proposal) have been translated except "Input", and the blob "data" seems to be the default input for the network.

Moreover, I ran into a problem when trying to run the pretrained model: the input blob im_info is reported as unknown even though I had already fed it to the workspace. How does Caffe2 handle a network with multiple input sources, for instance input = data + im_info?

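from caffe2.proto import caffe2_pb2
from caffe2.python import workspace

# Read the serialized nets (binary protobuf files).
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

init_def = caffe2_pb2.NetDef()
net_def = caffe2_pb2.NetDef()
init_def.ParseFromString(init_net)
net_def.ParseFromString(predict_net)

# Feed both input blobs before creating the predict net.
workspace.FeedBlob("data", img)
workspace.FeedBlob("im_info", im_info)

workspace.RunNetOnce(init_def.SerializeToString())
workspace.CreateNet(net_def.SerializeToString())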

> return C.create_net(StringifyProto(net), overwrite)
> RuntimeError: [enforce fail at net.cc:43] . op Proposal: **Source for input im_info is unknown for net** squeezenet-faster-rcnn, operator input: "rpn_cls_prob_reshape" input: "rpn_bbox_pred" input: "im_info" output: "rois" type: "Proposal"
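
The error message suggests that im_info is neither produced by an earlier op nor listed in the net's external_input field. One possible workaround, sketched under that assumption, is to declare the blob on the loaded predict net before calling workspace.CreateNet. The helper name declare_inputs is hypothetical, and the SimpleNamespace object only stands in for a real NetDef so the snippet runs without Caffe2 installed; it relies on external_input behaving like a list, as repeated protobuf fields do.

```python
from types import SimpleNamespace


def declare_inputs(net_def, names):
    """Add each name to net_def.external_input unless it is already listed."""
    for name in names:
        if name not in net_def.external_input:
            net_def.external_input.append(name)
    return net_def


# Stand-in for the loaded predict net (assumption: only "data" was declared).
net_def = SimpleNamespace(external_input=["data"])
declare_inputs(net_def, ["data", "im_info"])
print(net_def.external_input)  # ['data', 'im_info']
```

With the real objects, this would be called on net_def after ParseFromString and before workspace.CreateNet.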

thnkim commented 7 years ago

I have the same issue. Can anyone resolve this?

thnkim commented 7 years ago

I've just tentatively modified caffe_translator.py as follows:

trans_org.py: the original caffe_translator.py from GitHub (0287567 on 25 May)
trans.py: my modification.

I hard-coded the names of my input sources in the following check, which you could automate: if op.input[0] == 'my_source1' or op.input[0] == 'my_source2' or op.input[0] == 'my_source3':

--- trans_org.py        2017-07-04 22:13:53.048208152 +0900
+++ trans.py    2017-07-04 22:11:02.100744802 +0900
@@ -129,7 +129,7 @@
     return TranslatorRegistry.TranslateModel(*args, **kwargs)

-def ConvertTensorProtosToInitNet(net_params, input_name):
+def ConvertTensorProtosToInitNet(net_params, input_names):
     """Takes the net_params returned from TranslateModel, and wrap it as an
     init net that contain GivenTensorFill.

@@ -149,7 +149,8 @@
                 utils.MakeArgument("shape", list(tensor.dims)),
                 utils.MakeArgument("values", tensor.float_data)])
         init_net.op.extend([op])
-    init_net.op.extend([core.CreateOperator("ConstantFill", [], [input_name], shape=[1])])
+    for input_name in input_names:
+        init_net.op.extend([core.CreateOperator("ConstantFill", [], [input_name], shape=[1])])
     return init_net

@@ -713,11 +714,17 @@
         caffenet, caffenet_pretrained, is_test=True
     )

+    external_input = []
+    for op in net.op:
+        if op.input[0] == 'my_source1' or op.input[0] == 'my_source2' or op.input[0] == 'my_source3':
+            print(op.input[0])
+            external_input.append(op.input[0])
+
     # Assume there is one input and one output
-    external_input = net.op[0].input[0]
+    #external_input = net.op[0].input[0]
     external_output = net.op[-1].output[0]

-    net.external_input.extend([external_input])
+    net.external_input.extend(external_input)
     net.external_input.extend([param.name for param in pretrained_params.protos])
     net.external_output.extend([external_output])
     init_net = ConvertTensorProtosToInitNet(pretrained_params, external_input)
@@ -728,3 +735,4 @@
         f.write(net.SerializeToString())
     with open(output_init_net, 'wb') as f:
         f.write(init_net.SerializeToString())
+
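
The hard-coded name check in the patch could be automated along the following lines. This is a minimal sketch, not part of caffe_translator.py: it treats every blob that an op reads, but that no earlier op produced and that is not a pretrained parameter, as an external input. It inspects all inputs of each op rather than only op.input[0], since in the error message above im_info appears as the third input of the Proposal op. The function name collect_external_inputs and the toy operators (conv1, conv1_w, conv1_b, and the op wiring) are hypothetical.

```python
from types import SimpleNamespace


def collect_external_inputs(ops, param_names=()):
    """Return blobs that ops read but that no earlier op produced."""
    produced = set(param_names)  # parameters are filled by the init net
    external = []
    for op in ops:
        for name in op.input:  # check every input, not only input[0]
            if name not in produced and name not in external:
                external.append(name)
        produced.update(op.output)
    return external


# Toy stand-ins for translated operators (hypothetical wiring).
ops = [
    SimpleNamespace(input=["data", "conv1_w", "conv1_b"], output=["conv1"]),
    SimpleNamespace(input=["conv1"], output=["rpn_cls_prob_reshape"]),
    SimpleNamespace(input=["conv1"], output=["rpn_bbox_pred"]),
    SimpleNamespace(
        input=["rpn_cls_prob_reshape", "rpn_bbox_pred", "im_info"],
        output=["rois"],
    ),
]
inputs = collect_external_inputs(ops, ["conv1_w", "conv1_b"])
print(inputs)  # ['data', 'im_info']
```

In the translator, the same function could presumably be given net.op and the names from pretrained_params.protos, with the result passed to net.external_input.extend in place of the hard-coded list.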

facebookarchive / caffe2

Input layer problem; multi-source input case ; Faster R-CNN #762