Closed gadcam closed 5 years ago
Based on https://github.com/facebookresearch/Detectron/pull/372, models containing FPN can be correctly converted to caffe2's .pb files (I will rebase the PR on master soon). However, only the detection net is converted, even for Mask R-CNN and Keypoint R-CNN, which also have a mask net or a keypoint net.
@daquexian I am really sorry, but I think I failed to properly understand what you mean, as I do not have a deep understanding of how the Detectron repo works.
Do you mean that, once #372 is merged, if we try to convert for example e2e_keypoint_rcnn_R-50-FPN_1x, only the proposal part would be converted, so we could not use it on CPU? If the answer to this question is yes, can you help us understand what steps we need to take to achieve a complete conversion?
@gadcam If we try to convert e2e_keypoint_rcnn_R-50-FPN_1x, we will only get bounding boxes, not keypoints, because only model.net is used here, while the mask and keypoint branches are in model.mask_net and model.keypoint_net. The solution seems straightforward because there are only normal layers in these nets. But if you want to infer masks or keypoints after getting bounding boxes (in order to save inference time), it seems better to save these nets in different .pb files.
@daquexian would you like to write a detailed guide on how to convert pkl to pb? Thanks
@HappyKerry Just fetch and checkout my branch
git remote add daquexian https://github.com/daquexian/Detectron
git fetch daquexian
git checkout daquexian/add-export-support-fpn
and run convert_pkl_to_pb.py with your configuration files and weights.
@daquexian I ran convert_pkl_to_pb.py (with your patch) successfully on e2e_keypoint_rcnn_R-50-FPN_s1x and on MSRA's original ResNet-50 model.
For e2e_keypoint_rcnn_R-50-FPN_s1x I get no warnings.
For MSRA's original ResNet-50 model I get the following output:
Blob fpn_inner_res5_2_sum_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res5_2_sum_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res4_5_sum_lateral_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res4_5_sum_lateral_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res3_3_sum_lateral_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res3_3_sum_lateral_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res2_2_sum_lateral_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_inner_res2_2_sum_lateral_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res5_2_sum_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res5_2_sum_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res4_5_sum_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res4_5_sum_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res3_3_sum_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res3_3_sum_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res2_2_sum_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res2_2_sum_b with type <class 'str'> is not supported in generating init net, skipped.
Blob conv_rpn_fpn2_w with type <class 'str'> is not supported in generating init net, skipped.
Blob conv_rpn_fpn2_b with type <class 'str'> is not supported in generating init net, skipped.
Blob rpn_cls_logits_fpn2_w with type <class 'str'> is not supported in generating init net, skipped.
Blob rpn_cls_logits_fpn2_b with type <class 'str'> is not supported in generating init net, skipped.
Blob rpn_bbox_pred_fpn2_w with type <class 'str'> is not supported in generating init net, skipped.
Blob rpn_bbox_pred_fpn2_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fc6_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fc6_b with type <class 'str'> is not supported in generating init net, skipped.
Blob fc7_w with type <class 'str'> is not supported in generating init net, skipped.
Blob fc7_b with type <class 'str'> is not supported in generating init net, skipped.
Blob cls_score_w with type <class 'str'> is not supported in generating init net, skipped.
Blob cls_score_b with type <class 'str'> is not supported in generating init net, skipped.
Blob bbox_pred_w with type <class 'str'> is not supported in generating init net, skipped.
Blob bbox_pred_b with type <class 'str'> is not supported in generating init net, skipped.
If I try to convert model.keypoint_net from e2e_keypoint_rcnn_R-50-FPN_s1x I get:
Blob fpn_res2_2_sum with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_fpn2 with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res3_3_sum with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_fpn3 with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res4_5_sum with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_fpn4 with type <class 'str'> is not supported in generating init net, skipped.
Blob fpn_res5_2_sum with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_fpn5 with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_idx_restore_int32 with type <class 'str'> is not supported in generating init net, skipped.
So I have a few questions:
Blob ____ is not supported for the keypoint model when we have some for the ResNet?
Blob ____ is not supported in the ResNet? Should we implement these operators? (I thought the ResNet would be converted without trouble)
e2e_keypoint_rcnn_R-50-FPN_s1x
@daquexian Then that is perfect: I did use the models in the Model Zoo. To be accurate, what I call e2e_keypoint_rcnn_R-50-FPN_s1x is https://s3-us-west-2.amazonaws.com/detectron/37697714/12_2017_baselines/e2e_keypoint_rcnn_R-50-FPN_s1x.yaml.08_44_03.qrQ0ph6M/output/train/keypoints_coco_2014_train%3Akeypoints_coco_2014_valminusminival/generalized_rcnn/model_final.pkl (so in the section End-to-End Keypoint-Only Mask R-CNN Baselines).
Why did you suspect I tried to convert something else? Because I have some Blob ____ is not supported when I should not?
@gadcam Yes. It is reasonable that Blob ____ is not supported appears when you use an ImageNet pretrained model, because the fpn, rpn and some other layers are not in ImageNet pretrained models.
Could you please tell me what "ops not supported output" means?
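To make the point about ImageNet pretrained weights concrete, here is a minimal sketch of the comparison involved. The helper name blobs_missing_from_weights and the shortened blob lists are hypothetical and only illustrate the idea: any parameter the detection net declares but the weights file does not contain ends up without a value.

```python
def blobs_missing_from_weights(net_param_names, weight_blob_names):
    """Return the parameter blobs the net expects but the weights file lacks."""
    available = set(weight_blob_names)
    # Keep the order in which the net declares its parameters.
    return [name for name in net_param_names if name not in available]


# Hypothetical, shortened lists: a backbone-only checkpoint carries no
# FPN, RPN or head weights.
net_params = [
    "conv1_w",
    "res2_0_branch2a_w",
    "fpn_res5_2_sum_w",
    "conv_rpn_fpn2_w",
    "fc6_w",
]
imagenet_weights = ["conv1_w", "res2_0_branch2a_w"]

missing = blobs_missing_from_weights(net_params, imagenet_weights)
print(missing)
```

With a fully trained Model Zoo checkpoint the list would be expected to come back empty for the detection net.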
@daquexian
Could you please tell me what ops not supported output means?
I meant Blob ____ is not supported. I am sorry for my inaccuracy (I corrected it).
It is reasonable that Blob ____ is not supported appears when you use an ImageNet pretrained model, because fpn, rpn and some other layers are not in ImageNet pretrained models.
I am not sure I got this part: do you mean that when we see Blob ____ is not supported, it means the Blob needs some code from Detectron to be fully defined?
So I think we are getting to the point of my issue : what should we implement to avoid it ? Or can you direct me where to dive to know what we need to implement ?
If we take an example (we could say the same thing for keypoint_rois_idx_restore_int32):
Blob keypoint_rois_fpn2 with type <class 'str'> is not supported in generating init net, skipped.
The only mention I found of keypoint_rois_fpn in the code is here: https://github.com/facebookresearch/Detectron/blob/b3c93df2cecca1139f73d005b9dfcd83ef55c16d/detectron/roi_data/fast_rcnn.py#L103
So I do not really know where to investigate to avoid this Blob ____ is not supported error.
As a side question should we implement something like https://github.com/facebookresearch/Detectron/blob/e5bb3a8ff0b9caf59c76037726f49465d6b9678b/detectron/ops/generate_proposal_labels.py#L30 in Caffe2/PyTorch repo and then add some conversion code here to get full CPU support ?
@gadcam Blob ____ is not supported here just indicates that the blob doesn't have any value (I don't know why its type is 'str' when it doesn't have any value, caffe2 is strange). No more layers need to be implemented. You can add the names of these blobs to empty_blobs ('data' and 'im_info' are the inputs of model.net; 'fpn_res2_2_sum', 'keypoint_rois_fpn2' and so on are the inputs of model.keypoint_net).
The converted model will crash when you try to verify it, because its inputs are not valid. Giving it proper inputs ('fpn_res2_2_sum' and so on, produced by the bbox branch, and also the "keypoint_rois_fpnX" blobs) may make it run.
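One rough way to collect candidate names for empty_blobs is to read them back from the converter's own warnings. The sketch below is an assumption-laden illustration, not part of convert_pkl_to_pb.py: the helper skipped_blob_names is hypothetical, and it only extracts names, so you still have to decide by hand whether each one is a real net input or a weight that is missing from the checkpoint.

```python
import re

# Matches the warning lines shown earlier in this thread.
_SKIPPED = re.compile(
    r"^Blob (\S+) with type .* is not supported in generating init net, skipped\.$"
)


def skipped_blob_names(log_text):
    """Return blob names from 'is not supported' warnings, in order, without duplicates."""
    names = []
    for line in log_text.splitlines():
        match = _SKIPPED.match(line.strip())
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


log = """\
Blob fpn_res2_2_sum with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_fpn2 with type <class 'str'> is not supported in generating init net, skipped.
Blob keypoint_rois_idx_restore_int32 with type <class 'str'> is not supported in generating init net, skipped.
"""

candidates = skipped_blob_names(log)
print(candidates)
```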
@daquexian Thank you for your hints, with a bit of work I was able to run e2e_keypoint_rcnn_R-50-FPN_s1x on CPU!
I will tidy up my code before sharing it.
If I am able to write something clean enough, I will open a PR to enable conversion of keypoint and mask models, with a test to check the correctness of the conversion (which also serves as an example of how to run it).
For the moment the main problem is that I could not pick the input blobs programmatically.
@gadcam Great! Looking forward to your PR
@gadcam Hi, are we able to convert the Mask R-CNN model from .pkl to .pb now?
@dongmingsun With @daquexian's #372 + my (future) PR you will be able to convert the models from the Zoo from .pkl to two .pb files, one for the bbox and one for the mask or keypoints, and you would need to use some helper function to run them.
What I achieved is to run it without the need of a GPU, not to have a pure Caffe2 model.
I think someone more experienced than me would be able to merge these two .pb files at least. I will quickly investigate this option.
@gadcam Thank you very much, so I still have to figure out how to feed a Detectron model to pure Caffe2 C++.
@gadcam Hi, did you encounter this problem when you ran convert_pkl_to_pb.py from @daquexian's branch?
Cannot find operator schema for CollectAndDistributeFpnRpnProposals. Will skip schema checking.
Traceback for operator 164 in network origin_model
Traceback (most recent call last):
File "tools/convert_pkl_to_pb.py", line 637, in <module>
main()
File "tools/convert_pkl_to_pb.py", line 631, in main
verify_model(args, [net, init_net], args.test_img)
File "tools/convert_pkl_to_pb.py", line 569, in verify_model
_run_cfg_func, _run_pb_func, test_img, check_blobs)
File "/alpha/Rddd/projects/detectron0518/Detectron/detectron/utils/model_convert_utils.py", line 367, in compare_model
res2 = model2_func(test_image, check_blobs)
File "tools/convert_pkl_to_pb.py", line 565, in _run_pb_func
return run_model_pb(args, model_pb[0], model_pb[1], im, check_blobs)
File "tools/convert_pkl_to_pb.py", line 505, in run_model_pb
workspace.CreateNet(net)
File "/home/Rddd/data/projects/caffe2/build-cpu/caffe2/python/workspace.py", line 163, in CreateNet
StringifyProto(net), overwrite,
File "/home/Rddd/data/projects/caffe2/build-cpu/caffe2/python/workspace.py", line 189, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:191] op. Cannot create operator of type 'CollectAndDistributeFpnRpnProposals' on the device 'CPU'. Verify that implementation for the corresponding device exist. It might also happen if the binary is not linked with the operator implementation code. If Python frontend is used it might happen if dyndep.InitOpsLibrary call is missing.
Hi @kundalee , it seems that your caffe2 version is not the latest. You might want to pull the latest code from https://github.com/pytorch/pytorch and recompile it.
On Wed, May 23, 2018, 12:10 PM Kunda notifications@github.com wrote:
@gadcam https://github.com/gadcam Hi, do you encounter this problem when you ran convert_pkl_to_pb.py in @daquexian https://github.com/daquexian .
Cannot find operator schema for CollectAndDistributeFpnRpnProposals. Will skip schema checking. Traceback for operator 164 in network origin_model Traceback (most recent call last): File "tools/convert_pkl_to_pb.py", line 637, in main() File "tools/convert_pkl_to_pb.py", line 631, in main verify_model(args, [net, init_net], args.test_img) File "tools/convert_pkl_to_pb.py", line 569, in verify_model _run_cfg_func, _run_pb_func, test_img, check_blobs) File "/alpha/Rddd/projects/detectron0518/Detectron/detectron/utils/model_convert_utils.py", line 367, in compare_model res2 = model2_func(test_image, check_blobs) File "tools/convert_pkl_to_pb.py", line 565, in _run_pb_func return run_model_pb(args, model_pb[0], model_pb[1], im, check_blobs) File "tools/convert_pkl_to_pb.py", line 505, in run_model_pb workspace.CreateNet(net) File "/home/Rddd/data/projects/caffe2/build-cpu/caffe2/python/workspace.py", line 163, in CreateNet StringifyProto(net), overwrite, File "/home/Rddd/data/projects/caffe2/build-cpu/caffe2/python/workspace.py", line 189, in CallWithExceptionIntercept return func(*args, kwargs) RuntimeError: [enforce fail at operator.cc:191] op. Cannot create operator of type 'CollectAndDistributeFpnRpnProposals' on the device 'CPU'. Verify that implementation for the corresponding device exist. It might also happen if the binary is not linked with the operator implementation code. If Python frontend is used it might happen if dyndep.InitOpsLibrary call is missing. 
Operator def: input: "rpn_rois_fpn2" input: "rpn_rois_fpn3" input: "rpn_rois_fpn4" input: "rpn_rois_fpn5" input: "rpn_rois_fpn6" input: "rpn_roi_probs_fpn2" input: "rpn_roi_probs_fpn3" input: "rpn_roi_probs_fpn4" input: "rpn_roi_probs_fpn5" input: "rpn_roi_probs_fpn6" output: "rpn_rois" output: "rois_fpn2" output: "rois_fpn3" output: "rois_fpn4" output: "rois_fpn5" output: "rois_idx_restore_int32" name: "" type: "CollectAndDistributeFpnRpnProposals" arg { name: "roi_max_level" i: 5 } arg { name: "rpn_post_nms_topN" i: 1000 } arg { name: "roi_canonical_scale" i: 224 } arg { name: "rpn_min_level" i: 2 } arg { name: "roi_canonical_level" i: 4 } arg { name: "roi_min_level" i: 2 } arg { name: "rpn_max_level" i: 6 } device_option { } engine: "" debug_info: " File "tools/convert_pkl_to_pb.py", line 637, in \n main()\n File "tools/convert_pkl_to_pb.py", line 607, in main\n convert_net(args, net.Proto(), blobs)\n File "tools/convert_pkl_to_pb.py", line 279, in convert_net\n convert_op_in_proto(net, convert_python)\n File "/alpha/Rddd/projects/detectron0518/Detectron/detectron/utils/model_convert_utils.py", line 113, in convert_op_in_proto\n convert_op_in_ops(proto.op, func_or_list)\n File "/alpha/Rddd/projects/detectron0518/Detectron/detectron/utils/model_convert_utils.py", line 102, in convert_op_in_ops\n new_ops = func(op)\n File "/alpha/Rddd/projects/detectron0518/Detectron/detectron/utils/model_convert_utils.py", line 76, in wrapper\n return f(op, params)\n File "tools/convert_pkl_to_pb.py", line 250, in convert_python\n rpn_post_nms_topN=cfg.TEST.RPN_POST_NMS_TOP_N,\n File "tools/convert_pkl_to_pb.py", line 158, in convert_collect_and_distribute\n rpn_post_nms_topN=rpn_post_nms_topN,\n"
@gadcam @daquexian @dongmingsun I have converted the pkl model to a pb model, but how do I use the pb model in caffe2 C++? Thanks
@HappyKerry you can search for the caffe2 android demo or third-party tutorials
@HappyKerry caffe2_cpp_tutorial might help.
Hi @daquexian Thank you very much. Thanks to your comments, the CollectAndDistributeFpnRpnProposals problem is solved. I have already converted .pkl to .pb successfully.
But when I try to load the pb files for testing on CPU, I get the problem below. Everything is fine until I call the function workspace.CreateNet(net).
workspace.CreateNet(net)
File "/home/Rddd/data/projects/pytorch/build/caffe2/python/workspace.py", line 152, in CreateNet
StringifyProto(net), overwrite,
File "/home/Rddd/data/projects/pytorch/build/caffe2/python/workspace.py", line 178, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:185] op. Cannot create operator of type 'BatchPermutation' on the device 'CPU'. Verify that implementation for the corresponding device exist. It might also happen if the binary is not linked with the operator implementation code. If Python frontend is used it might happen if dyndep.InitOpsLibrary call is missing.
Operator def: input: "roi_feat_shuffled" input: "rois_idx_restore_int32" output: "roi_feat" name: "" type: "BatchPermutation" device_option { } engine: ""
I have noticed that the function named verify_model runs after converting. It works well and no error occurred. Can someone tell me how to use the pb model in caffe2 Python? Thanks
@kundalee BatchPermutation is in a caffe2 module. You need to load the module in your code, like https://github.com/facebookresearch/Detectron/blob/e5bb3a8ff0b9caf59c76037726f49465d6b9678b/detectron/utils/c2.py#L42 or this tutorial.
I haven't found how to load a module in C++, and no one has responded to my issue (it's so normal :D), so I compiled the Detectron ops into the main caffe2 library as a workaround.
@daquexian I met the same "BatchPermutation" problem as @kundalee, so how do I compile the Detectron ops into the main caffe2 library?
@HappyKerry Just copy detectron ops into the main caffe2 ops directory and recompile.
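For the Python side, the error message itself points at dyndep.InitOpsLibrary. Below is a minimal sketch of locating and loading the Detectron ops library before creating the net. The candidate file names and the helpers find_detectron_ops_lib and load_detectron_ops are assumptions for illustration; the real file name and location depend on how Caffe2 was built.

```python
import os

# Assumed file names; check your own Caffe2 build for the actual one.
CANDIDATE_LIB_NAMES = (
    "libcaffe2_detectron_ops_gpu.so",
    "libcaffe2_detectron_ops.so",
)


def find_detectron_ops_lib(search_dirs, names=CANDIDATE_LIB_NAMES):
    """Return the first existing candidate library path, or None."""
    for directory in search_dirs:
        for name in names:
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                return path
    return None


def load_detectron_ops(search_dirs):
    """Register the custom ops (e.g. BatchPermutation) with Caffe2."""
    path = find_detectron_ops_lib(search_dirs)
    if path is None:
        raise RuntimeError("Detectron ops library not found in %r" % (list(search_dirs),))
    # Imported lazily so the search helper also works without Caffe2 installed.
    from caffe2.python import dyndep
    dyndep.InitOpsLibrary(path)
    return path
```

A call such as load_detectron_ops(["/usr/local/lib"]) (the directory is only an example) would need to happen before workspace.CreateNet(net).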
@dongmingsun @daquexian
I still have to figure out how to feed a Detectron model to pure Caffe2 C++.
I think someone more experienced than me would be able to merge these two .pb files at least. I will quickly investigate this option.
Assuming that #372 & #449 are correct and merged, the main problem I see with either of these two things is that we could put all the ops in the same net, but we would need to write something like this just before inference:
def run_model_pb(args, models_pb, im, check_blobs):
    workspace.ResetWorkspace()
    net, init_net = models_pb
    workspace.RunNetOnce(init_net)
    mutils.create_input_blobs_for_net(net.Proto())
    workspace.CreateNet(net)
    input_blobs = _prepare_blobs(
        im,
        cfg.PIXEL_MEANS,
        cfg.TEST.SCALE, cfg.TEST.MAX_SIZE
    )
    # Placeholder: the boxes are only known once the detection net has run.
    boxes = ????
    if cfg.MODEL.MASK_ON:
        im_scale = input_blobs['im_info'][0][2]
        mask_rois = {'mask_rois': test._get_rois_blob(boxes, im_scale)}
        # Add multi-level rois for FPN
        if cfg.FPN.MULTILEVEL_ROIS:
            test._add_multilevel_rois_for_test(mask_rois, 'mask_rois')
        input_blobs.update(mask_rois)
    if cfg.MODEL.KEYPOINTS_ON:
        im_scale = input_blobs['im_info'][0][2]
        keypoints_rois = {'keypoint_rois': test._get_rois_blob(boxes, im_scale)}
        # Add multi-level rois for FPN
        if cfg.FPN.MULTILEVEL_ROIS:
            test._add_multilevel_rois_for_test(keypoints_rois, 'keypoint_rois')
        input_blobs.update(keypoints_rois)
But we cannot know boxes before inference... So do we have to run this in two steps if we want to keep the exact same architecture, or am I missing something? So @dongmingsun I think you have to do it like in my PR: run the first stage, "Add multi-level rois for FPN", run the second stage and process the result.
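The two-step flow described here can be sketched independently of Caffe2. Everything in the block is hypothetical scaffolding: run_two_stage and the three callables stand in for "run the bbox .pb", "build the ROI blobs from the boxes" and "run the mask or keypoint .pb", and the fake implementations only exist to show the data flow.

```python
def run_two_stage(image, run_bbox_net, make_head_inputs, run_head_net):
    """Run the detection net first, then feed its boxes to the second net."""
    # Stage 1: boxes plus the feature blobs that the second net reuses.
    boxes, shared_features = run_bbox_net(image)
    if len(boxes) == 0:
        # Nothing detected, so there is nothing to feed to the second stage.
        return boxes, []
    # Between the stages: build the ROI inputs from the detected boxes.
    head_inputs = dict(shared_features)
    head_inputs.update(make_head_inputs(boxes))
    # Stage 2: the mask or keypoint net.
    return boxes, run_head_net(head_inputs)


# Stand-ins that only illustrate which blobs flow where.
def fake_bbox_net(image):
    return [[0, 0, 10, 10]], {"fpn_res2_2_sum": "features"}


def fake_make_head_inputs(boxes):
    return {"keypoint_rois": boxes}


def fake_head_net(inputs):
    return sorted(inputs)


boxes, head_out = run_two_stage("img", fake_bbox_net, fake_make_head_inputs, fake_head_net)
print(boxes, head_out)
```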
As a sidenote, why do we keep cfg.FPN.MULTILEVEL_ROIS if it is set to TRUE in all the CFG files?
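For context on what the "Add multi-level rois for FPN" step does: each ROI is assigned to one pyramid level according to its size. The sketch below follows the usual FPN heuristic as I understand it, using the level and scale values that appear in the CollectAndDistributeFpnRpnProposals arguments earlier in this thread (levels 2 to 5, canonical scale 224, canonical level 4). The function name and the pixel-inclusive area convention are assumptions.

```python
import math


def fpn_level_for_box(box, k_min=2, k_max=5, canonical_scale=224, canonical_level=4):
    """Map one (x1, y1, x2, y2) box to an FPN level."""
    x1, y1, x2, y2 = box
    # Pixel-inclusive area (assumed convention).
    scale = math.sqrt((x2 - x1 + 1) * (y2 - y1 + 1))
    # Small epsilon keeps log2 defined and nudges exact powers of two upward.
    level = math.floor(canonical_level + math.log2(scale / canonical_scale + 1e-6))
    # Clamp to the levels that actually exist.
    return int(min(max(level, k_min), k_max))


for box in ([0, 0, 9, 9], [0, 0, 111, 111], [0, 0, 223, 223], [0, 0, 999, 999]):
    print(box, fpn_level_for_box(box))
```

Small boxes go to the finest level and large boxes to the coarsest one, which is why the ROI blobs have to be rebuilt once the boxes are known.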
@daquexian
Hello. I am new to caffe2 and Detectron. I trained a model with Detectron and want to test it in caffe2. Since the current branch of Detectron does not support FPN conversion, I searched around and found your branch.
I tried to use your code to convert my pkl model to pb files. The model is based on Detectron's tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml and trained on my own dataset.
I tried both GPU and CPU mode and got the following errors. This is the output in CPU mode:
WARNING workspace.py: 185: Original python traceback for operator '121' in network 'detectron' in exception above (most recent call last):
Running pb model failed.
[enforce fail at upsample_nearest_op.h:39] . Not Implemented. Error from operator:
input: "fpn_inner_res5_2_sum" output: "fpn_inner_res4_5_sum_topdown" name: "" type: "UpsampleNearest" arg { name: "scale" i: 2 } device_option { } engine: ""
Checking result_boxes -> result_boxes...
Traceback (most recent call last):
File "/detectron/tools/convert_pkl_to_pb.py", line 637, in <module>
main()
File "/detectron/tools/convert_pkl_to_pb.py", line 631, in main
verify_model(args, [net, init_net], args.test_img)
File "/detectron/tools/convert_pkl_to_pb.py", line 569, in verify_model
_run_cfg_func, _run_pb_func, test_img, check_blobs)
File "/detectron/detectron/utils/model_convert_utils.py", line 379, in compare_model
n1, n2, r1.shape, r2.shape)
AssertionError: Blob result_boxes and result_boxes shape mismatched: (9, 5) vs (0, 5)
Process finished with exit code 1
This is the output in GPU mode:
WARNING cnn.py: 25: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
INFO net.py: 59: Loading weights from: result50/model_iter19999.pkl
I0626 12:01:25.666318 29776 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 0.000106857 secs
I0626 12:01:25.666505 29776 net_dag.cc:46] Number of parallel execution chains 63 Number of operators = 232
I0626 12:01:25.675417 29776 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 8.2534e-05 secs
I0626 12:01:25.675545 29776 net_dag.cc:46] Number of parallel execution chains 30 Number of operators = 188
Running the second model...
Checking result_boxes -> result_boxes...
Traceback (most recent call last):
File "/detectron/tools/convert_pkl_to_pb.py", line 637, in <module>
main()
File "/detectron/tools/convert_pkl_to_pb.py", line 631, in main
verify_model(args, [net, init_net], args.test_img)
File "/detectron/tools/convert_pkl_to_pb.py", line 569, in verify_model
_run_cfg_func, _run_pb_func, test_img, check_blobs)
File "/detectron/detectron/utils/model_convert_utils.py", line 384, in compare_model
n1, n2, np.amax(np.absolute(r1 - r2))))
File "/usr/local/lib/python2.7/dist-packages/numpy/testing/nose_tools/utils.py", line 963, in assert_array_almost_equal
precision=decimal)
File "/usr/local/lib/python2.7/dist-packages/numpy/testing/nose_tools/utils.py", line 779, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 3 decimals
result_boxes and result_boxes not matched. Max diff: 4.39031982422
(mismatch 11.1111111111%)
x: array([[7.503e+02, 3.873e+02, 8.095e+02, 4.501e+02, 9.987e-01],
[1.055e+03, 3.291e+02, 1.113e+03, 3.970e+02, 9.965e-01],
[8.385e+02, 3.726e+02, 8.958e+02, 4.344e+02, 9.940e-01],...
y: array([[7.503e+02, 3.873e+02, 8.095e+02, 4.501e+02, 9.987e-01],
[1.055e+03, 3.291e+02, 1.113e+03, 3.970e+02, 9.965e-01],
[8.385e+02, 3.726e+02, 8.958e+02, 4.344e+02, 9.940e-01],...
Process finished with exit code 1
Do you have any idea how I can fix these? Thanks.
@daquexian I changed the input image dimension and everything works well now! Thank you anyway.
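For readers hitting the same two assertion failures: judging from the tracebacks above, the verification step first checks that both runs return result_boxes of the same shape, and then that the values agree to 3 decimals. The sketch below imitates that pair of checks in plain Python. It is not the code from model_convert_utils.py; the function name is hypothetical and the tolerance follows my understanding of NumPy's assert_array_almost_equal (absolute difference below 1.5 * 10**-decimal).

```python
def compare_rows(reference, converted, decimal=3):
    """Compare two lists of rows: shape first, then values to `decimal` places."""
    if len(reference) != len(converted):
        return "shape mismatched: %d vs %d rows" % (len(reference), len(converted))
    tolerance = 1.5 * 10 ** (-decimal)
    max_diff = 0.0
    for ref_row, conv_row in zip(reference, converted):
        if len(ref_row) != len(conv_row):
            return "shape mismatched: %d vs %d columns" % (len(ref_row), len(conv_row))
        for a, b in zip(ref_row, conv_row):
            max_diff = max(max_diff, abs(a - b))
    if max_diff >= tolerance:
        return "not matched, max diff: %g" % max_diff
    return "ok"


print(compare_rows([[1.0, 2.0]], [[1.0, 2.0005]]))
print(compare_rows([[1.0, 2.0]], []))
print(compare_rows([[1.0]], [[1.01]]))
```

A row-count mismatch such as (9, 5) vs (0, 5) means the converted net produced a different number of detections, which is a different failure from small numeric drift.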
When I run
python tools/convert_pkl_to_pb.py --cfg mm/noaug_2gpu_e2e_faster_rcnn_R-101-FPN.yaml --out_dir ttt --test_img 01.jpg --fuse_af 0 --device cpu
it raises the following error:
AssertionError: Blob result_boxes and result_boxes shape mismatched: (195, 5) vs (117, 5)
I trained the faster-rcnn model using the pretrained imagenet model, R-101.pkl. Could anybody give me some advice? Thanks.
@gadcam Thank you for the effort.
@dongmingsun With @daquexian's #372 + my (future) PR you will be able to convert the models from the Zoo from .pkl to two .pb files, one for the bbox and one for the mask or keypoints, and you would need to use some helper function to run them. What I achieved is to run it without the need of a GPU, not to have a pure Caffe2 model. I think someone more experienced than me would be able to merge these two .pb files at least. I will quickly investigate this option.
How do I use the pb files in Python? Is there a tutorial or example somewhere? (I couldn't find anything useful yet.)
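As a starting point, here is a minimal sketch of loading and running a converted net from Python, modelled on the run_model_pb snippet quoted earlier in this thread. The file names model.pb and model_init.pb are an assumption about the converter's output layout, the helper names are hypothetical, and the Detectron ops library has to be loaded before the net is created, otherwise ops such as BatchPermutation will not be found.

```python
import os


def pb_paths(out_dir, net_name="model"):
    """Assumed output layout: <out_dir>/model.pb and <out_dir>/model_init.pb."""
    return (
        os.path.join(out_dir, net_name + ".pb"),
        os.path.join(out_dir, net_name + "_init.pb"),
    )


def load_net_def(path):
    """Read a serialized NetDef from disk."""
    from caffe2.proto import caffe2_pb2  # lazy import: requires Caffe2
    net = caffe2_pb2.NetDef()
    with open(path, "rb") as f:
        net.ParseFromString(f.read())
    return net


def run_pb(out_dir, inputs, output_names):
    """Feed `inputs` (name -> numpy array), run the net once, fetch outputs."""
    from caffe2.python import workspace  # lazy import: requires Caffe2
    net_path, init_path = pb_paths(out_dir)
    net, init_net = load_net_def(net_path), load_net_def(init_path)
    workspace.ResetWorkspace()
    workspace.RunNetOnce(init_net)        # loads the weights
    for name, array in inputs.items():    # e.g. 'data' and 'im_info'
        workspace.FeedBlob(name, array)
    workspace.CreateNet(net)
    workspace.RunNet(net.name)
    return {name: workspace.FetchBlob(name) for name in output_names}
```

Preparing 'data' and 'im_info' from an image still needs the same preprocessing that Detectron applies (pixel means, test scale, max size).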
Hi! @daquexian @HappyKerry @kundalee
I used convert_pkl_to_pb.py to convert the Detectron model to a caffe2 model successfully. Then I wanted to use ONNX to convert the caffe2 model to an ONNX model.
I encounter the same issue as above:
WARNING:caffe2.python.workspace:Original python traceback for operator `170` in network `detectron` in exception above (most recent call last):
Traceback (most recent call last):
File "/home/user/pycharm-2018.1.3/helpers/pydev/pydevd.py", line 1664, in <module>
main()
File "/home/user/pycharm-2018.1.3/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/user/pycharm-2018.1.3/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/user/backup/lichu/onnx_convert/caffe2_onnx.py", line 24, in <module>
value_info,
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/onnx/frontend.py", line 332, in caffe2_net_to_onnx_model
model = make_model(cls.caffe2_net_to_onnx_graph(*args, **kwargs),
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/onnx/frontend.py", line 221, in caffe2_net_to_onnx_graph
inputs)
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/onnx/helper.py", line 62, in c2_native_run_net
ws.RunNetOnce(predict_net)
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/onnx/workspace.py", line 63, in f
return getattr(workspace, attr)(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/workspace.py", line 199, in RunNetOnce
StringifyProto(net),
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/workspace.py", line 178, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:185] op. Cannot create operator of type 'BatchPermutation' on the device 'CPU'. Verify that implementation for the corresponding device exist. It might also happen if the binary is not linked with the operator implementation code. If Python frontend is used it might happen if dyndep.InitOpsLibrary call is missing. Operator def: input: "roi_feat_shuffled_1" input: "rois_idx_restore_int32_1" output: "roi_feat_1" name: "" type: "BatchPermutation" device_option { } engine: ""
Does it mean that BatchPermutation can't be found in caffe2? What should I do? Thanks!
Hey I'm running
python detectron/tools/convert_pkl_to_pb.py --out_dir /app/out --cfg /app/detectron/configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml --device cpu TEST.WEIGHTS model_final.pkl
and getting the following error.
Traceback (most recent call last):
File "detectron/tools/convert_pkl_to_pb.py", line 654, in <module>
main()
File "detectron/tools/convert_pkl_to_pb.py", line 612, in main
model, blobs = load_model(args)
File "detectron/tools/convert_pkl_to_pb.py", line 420, in load_model
model = test_engine.initialize_model_from_cfg(cfg.TEST.WEIGHTS)
File "/app/detectron/detectron/core/test_engine.py", line 330, in initialize_model_from_cfg
model, weights_file, gpu_id=gpu_id,
File "/app/detectron/detectron/utils/net.py", line 112, in initialize_gpu_from_weights_file
src_blobs[src_name].astype(np.float32, copy=False))
File "/app/pytorch/build/caffe2/python/workspace.py", line 317, in FeedBlob
return _Workspace_feed_blob(ws, name, arr, device_option)
File "/app/pytorch/build/caffe2/python/workspace.py", line 654, in _Workspace_feed_blob
return ws.create_blob(name).feed(arr, device_option)
File "/app/pytorch/build/caffe2/python/workspace.py", line 676, in _Blob_feed
return blob._feed(arg, device_option)
RuntimeError: [enforce fail at pybind_state.cc:348] feeder. Unknown device type encountered in FeedBlob.
I've built caffe2 with CPU-only support. Is this going to be a deal breaker? Should I fire up the GPU version and convert to PB with that? Looks like one needs GPU support to convert from pkl to pb. Is this assumption I'm making correct?
Hi, I have converted a .pkl model to a .pb model under Ubuntu 16.04, and I want to use the .pb in C++ on Windows. Do I need to install caffe2 under Windows according to the tutorial https://caffe2.ai/docs/get-start.html? Platform=windows&configuration=compile? @HappyKerry @daquexian @dongmingsun
@lilichu
Does it mean that BatchPermutation can't be found in caffe2? what should I do? thanks!
Refer to this comment.
Is this finished?
Hello @fuzzyBatman,
To be honest I do not know what is the current state of the Detectron. I closed this issue because I felt it was not useful any more as it did not get enough attention in the last months.
Hey I'm running
python detectron/tools/convert_pkl_to_pb.py --out_dir /app/out --cfg /app/detectron/configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_1x.yaml --device cpu TEST.WEIGHTS model_final.pkl
and getting the following error.
Traceback (most recent call last):
File "detectron/tools/convert_pkl_to_pb.py", line 654, in <module>
main()
File "detectron/tools/convert_pkl_to_pb.py", line 612, in main
model, blobs = load_model(args)
File "detectron/tools/convert_pkl_to_pb.py", line 420, in load_model
model = test_engine.initialize_model_from_cfg(cfg.TEST.WEIGHTS)
File "/app/detectron/detectron/core/test_engine.py", line 330, in initialize_model_from_cfg
model, weights_file, gpu_id=gpu_id,
File "/app/detectron/detectron/utils/net.py", line 112, in initialize_gpu_from_weights_file
src_blobs[src_name].astype(np.float32, copy=False))
File "/app/pytorch/build/caffe2/python/workspace.py", line 317, in FeedBlob
return _Workspace_feed_blob(ws, name, arr, device_option)
File "/app/pytorch/build/caffe2/python/workspace.py", line 654, in _Workspace_feed_blob
return ws.create_blob(name).feed(arr, device_option)
File "/app/pytorch/build/caffe2/python/workspace.py", line 676, in _Blob_feed
return blob._feed(arg, device_option)
RuntimeError: [enforce fail at pybind_state.cc:348] feeder. Unknown device type encountered in FeedBlob.
I've built caffe2 with CPU-only support. Is this going to be a deal breaker? Should I fire up the GPU version and convert to PB with that? Looks like one needs GPU support to convert from pkl to pb. Is this assumption I'm making correct?
Hello, did anyone come across this error? I did when I tried to run CPU-only C3D extraction.
It looks like many people are asking for CPU inference, and it seems a lot of work is needed to make it happen. What I propose is that we use this issue to publicly state what work is needed, so that people eager to have this feature can easily help implement it.
@daquexian, @orionr, @rbgirshick do you have time to share a list of the features / ops needed to convert all the models with convert_pkl_to_pb.py?
I would like to contribute to this effort but I do not know where to begin. If you are willing to implement a feature, do not hesitate to say so in this issue.
PS: To avoid any confusion, I am only a random user of Detectron and my initiative was not solicited by the maintainers.