@dedoogong First, I have to say I do not know a lot about this issue. To me, your problem should be decomposed into these two steps:
Not only that, Detectron's convert_pkl_to_pb.py converts the example weight file (.pkl, around 480 MB) into two very small .pb files (less than 200 KB in total) with the AffineTransform fusing option. If I apply this fusing option, it merges some blobs, so the original network structure is changed and reduced. As there is no AffineTransform layer in Caffe, I need to turn on this option.
Could you provide the exact command you ran to convert the pkl? This seems very suspicious to me, and if something is wrong I want to correct my PR.
Also, I found it's possible to convert all of Caffe2's weights/biases to numpy arrays and then those to Caffe using FetchBlob("weight name").
Can you provide sample code on how to do that? How do you keep the structure of the net? Is it the prototxt file? Is it sufficient?
For what it's worth, this comment seems to show a sort of solution, but it does not look like there is a direct and obvious way: https://github.com/caffe2/caffe2/issues/641#issuecomment-406715565
@gadcam thank you for your detailed reply.
Current Status:
1. Get a Caffe2 model
Could you provide the exact command you ran to convert the pkl?
My command is:
convert_pkl_to_pb.py --cfg /home/lee/detectron/configs/12_2017_baselines/e2e_keypoint_rcnn_R-50-FPN_s1x.yaml --out_dir . --device cpu
TEST.WEIGHTS in the yaml file references "/home/lee/detectron/model_final.pkl" (480 MB)
results: detectron_keypoint_net.pb (3.9 kB), detectron_keypoint_net_init.pb (29 bytes), detectron_net.pb (27.1 kB), detectron_net_init.pb (20 bytes)
Your PR looks good; maybe the mistake is mine. By the way, when I downloaded some demo example pb files here, those were also very small like that. Maybe the small size is a normal result?
2. Third-party converting tools
Can you provide sample code on how to do that? How do you keep the structure of the net? Is it the prototxt file? Is it sufficient?
I just meant that I can access each layer's weights and handle them as numpy arrays. You can see caffe_translator.py or utils.py in caffe2/python (NumpyArrayToCaffe2Tensor, Caffe2TensorToNumpyArray, CaffeBlobToNumpyArray).
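To be concrete, this is roughly what I mean; a minimal sketch only (the blob names like 'conv1_w' and the 'model.prototxt' file name are just illustrative assumptions, not my final script):

    from caffe2.python import workspace
    import caffe

    # after running the Caffe2 init net, every weight lives in the workspace
    w = workspace.FetchBlob('conv1_w')  # numpy array, e.g. shape (64, 3, 7, 7)
    b = workspace.FetchBlob('conv1_b')

    # copy the arrays into a Caffe net built from the hand-written prototxt;
    # layer names in the prototxt must match the names used for lookup here
    net = caffe.Net('model.prototxt', caffe.TEST)
    net.params['conv1'][0].data[...] = w  # weights
    net.params['conv1'][1].data[...] = b  # bias
    net.save('model.caffemodel')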
And yes, I made a new prototxt file manually from the generated pbtxt, but I will write a script later (a rough sketch of the idea follows the prototxt below).
name: "detectron_keypoint_net"
input: "data" input_dim: 1 input_dim: 3 input_dim: 224 input_dim: 224
layer {
bottom: "data"
top: "conv1"
name: "conv1"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 7
pad: 3
stride: 2#
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "conv1_relu"
type: "ReLU"
}
layer {
bottom: "conv1"
top: "pool1"
name: "pool1"
type: "Pooling"
pooling_param {
kernel_size: 3
stride: 2#
pool: MAX
}
}
layer {
bottom: "pool1"
top: "res2a_branch1_conv"
name: "res2a_branch1"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 256
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "pool1"
top: "res2a_branch2a"
name: "res2a_branch2a"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 64
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2a"
name: "res2a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2b"
name: "res2a_branch2b"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
bias_term: false
}
}
...
...
...
layer {
bottom: "conv_fcn8"
top: "kps_score_lowres"
name: "kps_score_lowres"
type: "Deconvolution"
convolution_param {
num_output: 512
kernel_size: 4
stride: 2
}
}
layer {
bottom: "kps_score_lowres"
top: "kps_score"
name: "kps_score"
type: "Deconvolution"
convolution_param {
num_output: 17
kernel_size: 4
stride: 2
}
}
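The script I have in mind would be something along these lines; a rough sketch that only handles the trivially mappable ops (the file name and the type map are illustrative, and real layers would of course also need their params emitted):

    from caffe2.proto import caffe2_pb2
    from google.protobuf import text_format

    # load the predict net dumped by convert_pkl_to_pb.py
    net = caffe2_pb2.NetDef()
    with open('detectron_keypoint_net.pbtxt') as f:
        text_format.Merge(f.read(), net)

    # map the Caffe2 op types that have direct Caffe equivalents
    TYPE_MAP = {'Conv': 'Convolution', 'Relu': 'ReLU',
                'MaxPool': 'Pooling', 'ConvTranspose': 'Deconvolution'}
    for op in net.op:
        print('layer { name: "%s" type: "%s" bottom: "%s" top: "%s" }'
              % (op.name or op.output[0],
                 TYPE_MAP.get(op.type, op.type),
                 op.input[0], op.output[0]))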
For what it's worth, this comment seems to show a sort of solution, but it does not look like there is a direct and obvious way: caffe2/caffe2#641 (comment)
Yes, ONNX is an awesome tool, but when I tried to convert a TensorFlow model to Caffe before, the result ran almost 10 times slower (both running on cuDNN layers). So, conceptually, it's possible to convert Caffe2 to Caffe; however, in practice too many general interface layers get inserted. That's why I want to convert directly using numpy, as "convert_pkl_to_pb.py" does.
3. Hyperparameters
num_output: After loading the model file, I checked each blob's size across the total net (keypoint_net, body net) and filled in each num_output value in the .prototxt manually, one by one. If there is a better way to read the value from the pbtxt or elsewhere, please tell me; I would like to write a script to do it efficiently (see the sketch below).
lr_mult, decay_mult: As I just want to run the inference step, I don't need those.
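For the num_output part, the script could be something like this minimal sketch (it assumes Detectron's usual naming convention where conv weight blobs end in "_w"; purely illustrative):

    from caffe2.python import workspace

    # assumes the init net has already been run, so all weights are in the workspace
    for name in workspace.Blobs():
        if name.endswith('_w'):
            shape = workspace.FetchBlob(name).shape
            # Convolution weights are (num_output, C_in, kH, kW), so num_output
            # is shape[0]; for Deconvolution (ConvTranspose) the layout is
            # (C_in, num_output, kH, kW), so it would be shape[1] instead
            print('%s: num_output = %d' % (name, shape[0]))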
4. Newly added layers
IMO, for running Detectron on Caffe, the newly added operators also need to be ported. That is, even though I converted the Detectron model to a Caffe model, I can't use it without the operators below! So I've ported the code below to Caffe, and now I'm debugging the results.
ResizeNearest (upsample_nearest_op.cu) -> Done!
GenerateProposals (generate_proposals_op.cc) -> there is already a Python/C++ proposal layer in py-rfcn-caffe.
CollectAndDistributeFpnRpnProposals (collect_and_distribute_fpn_rpn_proposals_op.cc) -> Done!
BatchPermutation (batch_permutation_op.cu) -> Done!
BBoxTransform (bbox_transform_op.cc) -> Done!
BoxWithNMSLimit (box_with_nms_limit_op.cc) -> Done!
ConvTranspose (conv_transpose_op_cudnn.cc) -> thankfully, someone already did it (https://github.com/BVLC/caffe/pull/5924).
@dedoogong
Regarding 1. Get a Caffe2 model
convert_pkl_to_pb.py --cfg /home/lee/detectron/configs/12_2017_baselines/e2e_keypoint_rcnn_R-50-FPN_s1x.yaml --out_dir . --device cpu
I just tried it and it worked for me. I used a fresh install of the current version of the repository + #449 (+ the #110 Python 3 compatibility fix).
The output looks like this for me:
89 214 992 detectron_keypoint_net_init.pb
3 829 detectron_keypoint_net.pb
9 298 detectron_keypoint_net.pbtxt
430 333 detectron_keypoint_net.png
206 638 747 detectron_net_init.pb
26 954 detectron_net.pb
62 998 detectron_net.pbtxt
3 082 557 detectron_net.png
OK ;) thank you for your help!! Yes, it was my foolish mistake; now it works. I've converted both of Caffe2's two models and the six operators into one merged Caffe model with the corresponding six layers, plus the manually made prototxt. I had to replace the existing CAFFE_ENFORCE_xxx logging calls with Caffe's LOG or CHECK macros to avoid some errors. Now I can run Keypoint R-CNN on Caffe ;)
@dedoogong Have you converted the Caffe2 model to a Caffe model? I am working on Faster R-CNN and want to convert it to a Caffe model. Any suggestions?
Thanks!
I want to run Keypoint R-CNN on an embedded board (not a TX1 or TX2, but another powerful DSP board that can run R-FCN or Faster R-CNN models in real time) after model quantization and some optimization. Unfortunately, it supports only Caffe 1.
After merging PRs #372 and #449, I'm struggling to convert the Keypoint R-CNN example model to Caffe. By default, Detectron divides the model into a "base" part and a "head" part, but I made a new Caffe prototxt including both the keypoint net and the base net by extracting detectron_net.pbtxt and detectron_keypoint_net.pbtxt together and manually changing each item's name.
Also, I found it's possible to convert all of Caffe2's weights/biases to numpy arrays and then those to Caffe using FetchBlob("weight name").
But now I feel confused: it seems like there would be no matching CUDA/cuDNN implementation in Caffe2 for Caffe's "Deconvolution" (GPU) code (does "ConvTranspose" work the same as "Deconvolution"?). Should I port each newly implemented GPU operator in Caffe2 to a Caffe layer? The BatchPermutation GPU operator in Caffe2 also doesn't exist in Caffe. Oh, I don't want to open the Hell Gate.
Plus, some of Caffe's hyperparameters in the prototxt are gone in Caffe2's pbtxt, such as "num_output" or "lr_mult" in convolution_param. How can I infer the num_output values? And how can I apply lr_mult/decay_mult? Can I just ignore those old-version parameters?
Not only that, Detectron's convert_pkl_to_pb.py converts the example weight file (.pkl, around 480 MB) into two very small .pb files (less than 200 KB in total) with the AffineTransform fusing option. If I apply this fusing option, it merges some blobs, so the original network structure is changed and reduced. As there is no AffineTransform layer in Caffe, I need to turn on this option.
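If I understand correctly, what the fusing does numerically is fold the per-channel affine op into the preceding convolution, so the affine blobs disappear; a minimal sketch of the arithmetic only, with hypothetical names, not Detectron's actual code:

    import numpy as np

    def fuse_affine_into_conv(W, b, s, t):
        # W: (num_output, C_in, kH, kW) conv weights, b: (num_output,) conv bias
        # s, t: (num_output,) per-channel scale and shift of the affine op
        # y = s * conv(x, W, b) + t  ==  conv(x, s*W, s*b + t)
        W_fused = W * s.reshape(-1, 1, 1, 1)  # scale each output channel's filter
        b_fused = b * s + t                   # scale and shift the bias
        return W_fused, b_fused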
So, I would like someone to tell me whether Caffe2 can be translated to Caffe or not. Has anybody succeeded in converting a Caffe2 model to Caffe? I'm also considering ONNX or MMdnn, but it seems those conversion tools insert too many interface layers, which makes the result run very slowly! So I want to convert directly from Detectron/Caffe2 to Caffe using numpy!
Thank you in advance!
System information
Operating system: ubuntu 16.04
Compiler version: gcc 5.4.0
CUDA version: 9.2
cuDNN version: 7.1.4
NVIDIA driver version: 396.45
GPU models (for all devices if they are not all the same): GTX 1080
PYTHONPATH environment variable: /home/lee/caffe2-pytorch/build:/usr/local:/usr/local/lib/python3.5/site-packages:/usr/local/lib/python3.5/dist-packages:/home/lee/Downloads/light_head_rcnn/lib:/home/lee/Downloads/pyRFCN/py-R-FCN/lib:/home/lee/Downloads/pyRFCN/py-R-FCN/caffe/python
python --version output: python3.5