intel / caffe

This fork of BVLC/Caffe is dedicated to improving performance of this deep learning framework when running on CPU, in particular Intel® Xeon processors.

@: env.c:158:parse_server_affinity() strlen(server_affinity_to_parse) > 0 failed. #227

Open cheerss opened 6 years ago

cheerss commented 6 years ago

I want to test the accuracy loss from 8-bit quantization, so I followed the guide at https://github.com/intel/caffe/wiki/Introduction-of-Accuracy-Calibration-Tool-for-8-Bit-Inference to run my model.

I pulled the intel-caffe Docker image from Docker Hub and tested with my deploy model, so the command is:

python scripts/calibrator.py -r build -m /caffe_model/deploy.prototxt -w /caffe_model/deploy.caffemodel -i 80000 -n detection_out -l 0.01 -d 1

The model is an SSD detection model with a SqueezeNet backbone. I am sure the model itself is OK; however, when I run it, the following error occurs:

Sampling...
I0715 06:58:31.841138    66 upgrade_proto.cpp:109] Attempting to upgrade input file specified using deprecated input fields: /caffe_model/deploy.prototxt
I0715 06:58:31.841594    66 upgrade_proto.cpp:112] Successfully upgraded file specified using deprecated input fields.
W0715 06:58:31.841611    66 upgrade_proto.cpp:114] Note that future Caffe releases will only support input layers and not input fields.
I0715 06:58:31.860559    66 cpu_info.cpp:453] Processor speed [MHz]: 2200
I0715 06:58:31.860636    66 cpu_info.cpp:456] Total number of sockets: 2
I0715 06:58:31.860659    66 cpu_info.cpp:459] Total number of CPU cores: 24
I0715 06:58:31.860687    66 cpu_info.cpp:462] Total number of processors: 48
I0715 06:58:31.860710    66 cpu_info.cpp:465] GPU is used: no
I0715 06:58:31.860729    66 cpu_info.cpp:468] OpenMP environmental variables are specified: no
I0715 06:58:31.860757    66 cpu_info.cpp:471] OpenMP thread bind allowed: yes
I0715 06:58:31.860786    66 cpu_info.cpp:474] Number of OpenMP threads: 24
I0715 06:58:31.863605    66 net.cpp:743] Dropped layer: relu_conv1
I0715 06:58:31.863646    66 net.cpp:743] Dropped layer: fire2/relu_squeeze1x1
I0715 06:58:31.863672    66 net.cpp:743] Dropped layer: fire2/relu_expand1x1
I0715 06:58:31.863693    66 net.cpp:743] Dropped layer: fire2/relu_expand3x3
I0715 06:58:31.863714    66 net.cpp:743] Dropped layer: fire3/relu_squeeze1x1
I0715 06:58:31.863737    66 net.cpp:743] Dropped layer: fire3/relu_expand1x1
I0715 06:58:31.863757    66 net.cpp:743] Dropped layer: fire3/relu_expand3x3
I0715 06:58:31.863778    66 net.cpp:743] Dropped layer: fire4/relu_squeeze1x1
I0715 06:58:31.863801    66 net.cpp:743] Dropped layer: fire4/relu_expand1x1
I0715 06:58:31.863819    66 net.cpp:743] Dropped layer: fire4/relu_expand3x3
I0715 06:58:31.863840    66 net.cpp:743] Dropped layer: fire5/relu_squeeze1x1
I0715 06:58:31.863859    66 net.cpp:743] Dropped layer: fire5/relu_expand1x1
I0715 06:58:31.863881    66 net.cpp:743] Dropped layer: fire5/relu_expand3x3
I0715 06:58:31.863901    66 net.cpp:743] Dropped layer: fire6/relu_squeeze1x1
I0715 06:58:31.863921    66 net.cpp:743] Dropped layer: fire6/relu_expand1x1
I0715 06:58:31.863940    66 net.cpp:743] Dropped layer: fire6/relu_expand3x3
I0715 06:58:31.863961    66 net.cpp:743] Dropped layer: fire7/relu_squeeze1x1
I0715 06:58:31.863981    66 net.cpp:743] Dropped layer: fire7/relu_expand1x1
I0715 06:58:31.863999    66 net.cpp:743] Dropped layer: fire7/relu_expand3x3
I0715 06:58:31.864020    66 net.cpp:743] Dropped layer: fire8/relu_squeeze1x1
I0715 06:58:31.864040    66 net.cpp:743] Dropped layer: fire8/relu_expand1x1
I0715 06:58:31.864058    66 net.cpp:743] Dropped layer: fire8/relu_expand3x3
I0715 06:58:31.864078    66 net.cpp:743] Dropped layer: fire9/relu_squeeze1x1
I0715 06:58:31.864104    66 net.cpp:743] Dropped layer: fire9/relu_expand1x1
I0715 06:58:31.864154    66 net.cpp:743] Dropped layer: fire9/relu_expand3x3
I0715 06:58:31.864189    66 net.cpp:743] Dropped layer: fire10/relu_squeeze1x1
I0715 06:58:31.864215    66 net.cpp:743] Dropped layer: fire10/relu_expand1x1
I0715 06:58:31.864253    66 net.cpp:743] Dropped layer: fire10/relu_expand3x3
I0715 06:58:31.864279    66 net.cpp:743] Dropped layer: fire11/relu_squeeze1x1
I0715 06:58:31.864301    66 net.cpp:743] Dropped layer: fire11/relu_expand1x1
I0715 06:58:31.864325    66 net.cpp:743] Dropped layer: fire11/relu_expand3x3
I0715 06:58:31.864353    66 net.cpp:743] Dropped layer: conv12_1/relu
I0715 06:58:31.864374    66 net.cpp:743] Dropped layer: conv12_2/relu
I0715 06:58:31.864398    66 net.cpp:743] Dropped layer: conv13_1/relu
I0715 06:58:31.864424    66 net.cpp:743] Dropped layer: conv13_2/relu
Attempting to use an MPI routine before initializing MPI
Sampling done
Generating the FP32 accuracy...
@: env.c:158:parse_server_affinity() strlen(server_affinity_to_parse) > 0 failed.
Failed to get accuracy, please check the parameters and rerun the scripts.

I do not know what "parse_server_affinity" means, so I cannot locate the problem, and Google turns up no useful information about the error. By the way, I wonder why I need to provide the -i (iterations) parameter: is there any relation between iterations and accuracy loss once the caffe model has been given? And why is iterations = epoch / batch_size (as here says)? As far as I know, iterations = epochs * dataset_size / batch_size.
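To make my point concrete, here is a minimal sketch of the relationship I would expect, assuming one iteration consumes exactly one batch (the numbers below are made up for illustration):

```python
import math

def iterations_needed(dataset_size, batch_size, epochs=1):
    """Number of forward passes needed to see the whole dataset `epochs` times,
    assuming each iteration processes one batch of `batch_size` samples."""
    return epochs * math.ceil(dataset_size / batch_size)

# e.g. a 50000-image validation set with batch size 32:
print(iterations_needed(dataset_size=50000, batch_size=32))  # 1563
```

So for calibration over the full validation set, the dataset size should appear in the formula; iterations = epoch / batch_size does not seem dimensionally right to me.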

Thank you very much~

cheerss commented 6 years ago

Actually, I have found that intel-caffe should be built without MLSL, which is enabled by default, so the Docker image cannot be used directly for quantization. I modified Makefile.config, rebuilt intel/caffe, and it works! However, I still do not know why I should provide the -i (iterations) parameter, as I said above. Does anyone know?
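For anyone hitting the same error: assuming the stock intel/caffe Makefile.config (please verify the exact flag name against your checkout), the change I made amounts to disabling the MLSL build option before rebuilding:

```makefile
# Makefile.config -- disable Intel MLSL (multi-node/MPI support) before
# rebuilding; the MPI hooks are what trigger the parse_server_affinity
# failure when running single-node calibration.
# USE_MLSL := 1
USE_MLSL := 0
```

After editing, do a clean rebuild so the old MLSL-enabled objects are not reused.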

guomingz commented 6 years ago

hello @cheerss. The calibration tool needs to know the number of inference iterations because the appropriate value depends on the specific topology, so the tool asks the end user to provide it.

For the MLSL issue, it has been fixed in our upcoming release.