@dedoogong First, I have to say I do not know a lot about this issue. To me, your problem should be decomposed into these two steps:
Not only that, Detectron's convert_pkl_to_pb.py converts the example weight file (.pkl, around 480 MB) into two very small .pb files (less than 200 KB in total) with the AffineTransform fusing option. If I apply this fusing option, it merges some blobs, so the original network structure is changed and reduced. As there is no AffineTransform layer in Caffe, I need to turn on this option.
Could you provide the exact command you ran to convert the pkl? This seems very suspicious to me, and if something is wrong I want to correct my PR.
Also, I found it's possible to convert all of Caffe2's weights/biases to numpy arrays and then those to Caffe using FetchBlob("weight name").
Can you provide sample code on how to do that? How do you keep the structure of the net? Is it the prototxt file? Is it sufficient?
For what it's worth, this comment seems to show a sort of solution, but it does not look like there is a direct and obvious way: https://github.com/caffe2/caffe2/issues/641#issuecomment-406715565
@gadcam thank you for your detailed reply.
Current Status:
1. Get a Caffe2 model
Could you provide the exact command you ran to convert the pkl?
My command is:
convert_pkl_to_pb.py --cfg /home/lee/detectron/configs/12_2017_baselines/e2e_keypoint_rcnn_R-50-FPN_s1x.yaml --out_dir . --device cpu
TEST.WEIGHTS in the yaml file references "/home/lee/detectron/model_final.pkl" (480 MB)
results: detectron_keypoint_net.pb (3.9 kB), detectron_keypoint_net_init.pb (29 bytes), detectron_net.pb (27.1 kB), detectron_net_init.pb (20 bytes)
Your PR looks good; maybe the mistake is mine. By the way, when I downloaded some demo example pb files here, those were also very small like that. Maybe the small size is a normal result?
2. Third-party converting tools
Can you provide sample code on how to do that? How do you keep the structure of the net? Is it the prototxt file? Is it sufficient?
I just meant that I can access each layer's weights and handle them as numpy arrays. You can see caffe_translator.py or utils.py in caffe2/python (NumpyArrayToCaffe2Tensor, Caffe2TensorToNumpyArray, CaffeBlobToNumpyArray).
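To be concrete, this is roughly what I mean; a minimal sketch only (the blob names like 'conv1_w' and the 'model.prototxt' file name are just illustrative assumptions, not my final script):

    from caffe2.python import workspace
    import caffe

    # after running the Caffe2 init net, every weight lives in the workspace
    w = workspace.FetchBlob('conv1_w')  # numpy array, e.g. shape (64, 3, 7, 7)
    b = workspace.FetchBlob('conv1_b')

    # copy the arrays into a Caffe net built from the hand-written prototxt;
    # layer names in the prototxt must match the names used for lookup here
    net = caffe.Net('model.prototxt', caffe.TEST)
    net.params['conv1'][0].data[...] = w  # weights
    net.params['conv1'][1].data[...] = b  # bias
    net.save('model.caffemodel')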
And yes, I made a new prototxt file manually from the generated pbtxt, but I will write a script later (a rough sketch of the idea follows the prototxt below).
name: "detectron_keypoint_net"
input: "data" input_dim: 1 input_dim: 3 input_dim: 224 input_dim: 224
layer {
bottom: "data"
top: "conv1"
name: "conv1"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 7
pad: 3
stride: 2#
}
}
layer {
bottom: "conv1"
top: "conv1"
name: "conv1_relu"
type: "ReLU"
}
layer {
bottom: "conv1"
top: "pool1"
name: "pool1"
type: "Pooling"
pooling_param {
kernel_size: 3
stride: 2#
pool: MAX
}
}
layer {
bottom: "pool1"
top: "res2a_branch1_conv"
name: "res2a_branch1"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 256
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "pool1"
top: "res2a_branch2a"
name: "res2a_branch2a"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 64
kernel_size: 1
pad: 0
stride: 1
bias_term: false
}
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2a"
name: "res2a_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2a_branch2a"
top: "res2a_branch2b"
name: "res2a_branch2b"
type: "Convolution"
param {
lr_mult: 0.0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
bias_term: false
}
}
...
...
...
layer {
bottom: "conv_fcn8"
top: "kps_score_lowres"
name: "kps_score_lowres"
type: "Deconvolution"
convolution_param {
num_output: 512
kernel_size: 4
stride: 2
}
}
layer {
bottom: "kps_score_lowres"
top: "kps_score"
name: "kps_score"
type: "Deconvolution"
convolution_param {
num_output: 17
kernel_size: 4
stride: 2
}
}
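The script I have in mind would be something along these lines; a rough sketch that only handles the trivially mappable ops (the file name and the type map are illustrative, and real layers would of course also need their params emitted):

    from caffe2.proto import caffe2_pb2
    from google.protobuf import text_format

    # load the predict net dumped by convert_pkl_to_pb.py
    net = caffe2_pb2.NetDef()
    with open('detectron_keypoint_net.pbtxt') as f:
        text_format.Merge(f.read(), net)

    # map the Caffe2 op types that have direct Caffe equivalents
    TYPE_MAP = {'Conv': 'Convolution', 'Relu': 'ReLU',
                'MaxPool': 'Pooling', 'ConvTranspose': 'Deconvolution'}
    for op in net.op:
        print('layer { name: "%s" type: "%s" bottom: "%s" top: "%s" }'
              % (op.name or op.output[0],
                 TYPE_MAP.get(op.type, op.type),
                 op.input[0], op.output[0]))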
For what it's worth, this comment seems to show a sort of solution, but it does not look like there is a direct and obvious way: caffe2/caffe2#641 (comment)
Yes, ONNX is an awesome tool, but when I tried to convert a TensorFlow model to Caffe before, the result ran almost 10 times slower (both running on cuDNN layers). So, conceptually, it's possible to convert Caffe2 to Caffe; however, in practice too many general interface layers get inserted. That's why I want to convert directly using numpy, as "convert_pkl_to_pb.py" does.
3. Hyperparameters
num_output: After loading the model file, I checked each blob's size across the total net (keypoint_net, body net) and filled in each num_output value in the .prototxt manually, one by one. If there is a better way to read the value from the pbtxt or elsewhere, please tell me; I would like to write a script to do it efficiently (see the sketch below).
lr_mult, decay_mult: As I just want to run the inference step, I don't need those.
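For the num_output part, the script could be something like this minimal sketch (it assumes Detectron's usual naming convention where conv weight blobs end in "_w"; purely illustrative):

    from caffe2.python import workspace

    # assumes the init net has already been run, so all weights are in the workspace
    for name in workspace.Blobs():
        if name.endswith('_w'):
            shape = workspace.FetchBlob(name).shape
            # Convolution weights are (num_output, C_in, kH, kW), so num_output
            # is shape[0]; for Deconvolution (ConvTranspose) the layout is
            # (C_in, num_output, kH, kW), so it would be shape[1] instead
            print('%s: num_output = %d' % (name, shape[0]))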
4. Newly added layers
IMO, for running Detectron on Caffe, the newly added operators also need to be ported. That is, even though I converted the Detectron model to a Caffe model, I can't use it without the operators below! So I've ported the code below to Caffe, and now I'm debugging the results.
ResizeNearest (upsample_nearest_op.cu) -> Done!
GenerateProposals (generate_proposals_op.cc) -> there is already a Python/C++ proposal layer in py-rfcn-caffe.
CollectAndDistributeFpnRpnProposals (collect_and_distribute_fpn_rpn_proposals_op.cc) -> Done!
BatchPermutation (batch_permutation_op.cu) -> Done!
BBoxTransform (bbox_transform_op.cc) -> Done!
BoxWithNMSLimit (box_with_nms_limit_op.cc) -> Done!
ConvTranspose (conv_transpose_op_cudnn.cc) -> thankfully, someone already did it (https://github.com/BVLC/caffe/pull/5924).
@dedoogong
Regarding 1. Get a Caffe2 model
convert_pkl_to_pb.py --cfg /home/lee/detectron/configs/12_2017_baselines/e2e_keypoint_rcnn_R-50-FPN_s1x.yaml --out_dir . --device cpu
I just tried it and it worked for me. I used a fresh install of the current version of the repository + #449 (+ the #110 Python 3 compatibility fix).
The output looks like this for me:
89 214 992 detectron_keypoint_net_init.pb
3 829 detectron_keypoint_net.pb
9 298 detectron_keypoint_net.pbtxt
430 333 detectron_keypoint_net.png
206 638 747 detectron_net_init.pb
26 954 detectron_net.pb
62 998 detectron_net.pbtxt
3 082 557 detectron_net.png
OK ;) thank you for your help!! Yes, it was my foolish mistake; now it works. I've converted both of Caffe2's two models and the six operators into one merged Caffe model with the corresponding six layers, plus the manually made prototxt. I had to replace the existing CAFFE_ENFORCE_xxx logging calls with Caffe's LOG or CHECK macros to avoid some errors. Now I can run Keypoint R-CNN on Caffe ;)
@dedoogong Have you converted the Caffe2 model to a Caffe model? I am working on Faster R-CNN and want to convert it to a Caffe model. Any suggestions?
Thanks!
I want to run Keypoint R-CNN on an embedded board (not a TX1 or TX2, but another powerful DSP board that can run R-FCN or Faster R-CNN models in real time) after model quantization and some optimization. Unfortunately, it supports only Caffe 1.
After merging PRs #372 and #449, I'm struggling to convert the Keypoint R-CNN example model to Caffe. By default, Detectron divides the model into a "base" part and a "head" part, but I made a new Caffe prototxt including both the keypoint net and the base net by extracting detectron_net.pbtxt and detectron_keypoint_net.pbtxt together and manually changing each item's name.
Also, I found it's possible to convert all of Caffe2's weights/biases to numpy arrays and then those to Caffe using FetchBlob("weight name").
But now I feel confused: it seems like there would be no matching CUDA/cuDNN implementation in Caffe2 for Caffe's "Deconvolution" (GPU) code (does "ConvTranspose" work the same as "Deconvolution"?). Should I port each newly implemented GPU operator in Caffe2 to a Caffe layer? The BatchPermutation GPU operator in Caffe2 also doesn't exist in Caffe. Oh, I don't want to open the Hell Gate.
Plus, some of Caffe's hyperparameters in the prototxt are gone in Caffe2's pbtxt, such as "num_output" or "lr_mult" in convolution_param. How can I infer the num_output values? And how can I apply lr_mult/decay_mult? Can I just ignore those old-version parameters?
Not only that, Detectron's convert_pkl_to_pb.py converts the example weight file (.pkl, around 480 MB) into two very small .pb files (less than 200 KB in total) with the AffineTransform fusing option. If I apply this fusing option, it merges some blobs, so the original network structure is changed and reduced. As there is no AffineTransform layer in Caffe, I need to turn on this option.
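If I understand correctly, what the fusing does numerically is fold the per-channel affine op into the preceding convolution, so the affine blobs disappear; a minimal sketch of the arithmetic only, with hypothetical names, not Detectron's actual code:

    import numpy as np

    def fuse_affine_into_conv(W, b, s, t):
        # W: (num_output, C_in, kH, kW) conv weights, b: (num_output,) conv bias
        # s, t: (num_output,) per-channel scale and shift of the affine op
        # y = s * conv(x, W, b) + t  ==  conv(x, s*W, s*b + t)
        W_fused = W * s.reshape(-1, 1, 1, 1)  # scale each output channel's filter
        b_fused = b * s + t                   # scale and shift the bias
        return W_fused, b_fused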
So, I would like someone to tell me whether Caffe2 can be translated to Caffe or not. Has anybody succeeded in converting a Caffe2 model to Caffe? I'm also considering ONNX or MMdnn, but it seems those conversion tools insert too many interface layers, which makes the result run very slowly! So I want to convert directly from Detectron/Caffe2 to Caffe using numpy!
Thank you in advance!
System information
Operating system: ubuntu 16.04
Compiler version: gcc 5.4.0
CUDA version: 9.2
cuDNN version: 7.1.4
NVIDIA driver version: 396.45
GPU models (for all devices if they are not all the same): GTX 1080
PYTHONPATH environment variable: /home/lee/caffe2-pytorch/build:/usr/local:/usr/local/lib/python3.5/site-packages:/usr/local/lib/python3.5/dist-packages:/home/lee/Downloads/light_head_rcnn/lib:/home/lee/Downloads/pyRFCN/py-R-FCN/lib:/home/lee/Downloads/pyRFCN/py-R-FCN/caffe/python
python --version output: python3.5