PaddlePaddle / Paddle

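PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)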
http://www.paddlepaddle.org/
Apache License 2.0

Memory leaks when doing CPU inference with MKLDNN #25506

Closed cryoco closed 4 years ago

cryoco commented 4 years ago

System information

  • Paddle version: 1.8.2
  • Paddle With CUDA: False
  • OS: Ubuntu 16.04
  • CPU: 16 Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
  • Python version: 3.5.2
  • CUDA version: 9.0.176
  • cuDNN version: None.None.None
  • Nvidia driver version: None

API information: inference configuration

config.disable_gpu()
config.enable_mkldnn()
config.set_cpu_math_library_num_threads(4)

To Reproduce

  1. Download the models and data.
  2. Perform inference with mkldnn. Code to reproduce: test_ocr_mkldnn_mem.txt (rename to test_ocr_mkldnn_mem.py)
  3. Perform inference without mkldnn.
    • detection
      python3 test_ocr_mkldnn_mem.py --model_file=./ch_det_mv3_db/model --params_file=./ch_det_mv3_db/params --mode=det 
    • recognition
      python3 test_ocr_mkldnn_mem.py --model_file=./ch_rec_mv3_crnn/model --params_file=./ch_rec_mv3_crnn/params --mode=rec

      Memory usage remains stable.

The input shapes we use in OCR are dynamic, which might be relevant to this issue.
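One way to check whether memory stays stable across repeated inference calls is to sample the process's peak resident set size at intervals. The sketch below is not taken from test_ocr_mkldnn_mem.py; `peak_rss_mib` and `track_memory` are hypothetical helper names, and the code assumes a Unix system where Python's standard `resource` module is available.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_memory(step, iterations, report_every=100):
    """Call step(i) repeatedly and sample peak RSS every report_every calls.

    `step` is any callable that performs one inference run (hypothetical).
    Peak RSS never decreases, so a stable workload shows a plateau while a
    leak shows samples that keep climbing.
    """
    samples = []
    for i in range(iterations):
        step(i)
        if (i + 1) % report_every == 0:
            samples.append(peak_rss_mib())
    return samples
```

With a predictor already configured and its inputs set, something like `track_memory(lambda i: predictor.zero_copy_run(), 1000)` would give ten samples to compare between the mkldnn and non-mkldnn runs.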

luotao1 commented 4 years ago

Please use SetMkldnnCacheCapacity interface. https://github.com/PaddlePaddle/Paddle/blob/126d3d693b0b5ebdbef5c6d315bad86b701ebcea/paddle/fluid/inference/api/paddle_analysis_config.h#L348-L355

cryoco commented 4 years ago

Please use SetMkldnnCacheCapacity interface. https://github.com/PaddlePaddle/Paddle/blob/126d3d693b0b5ebdbef5c6d315bad86b701ebcea/paddle/fluid/inference/api/paddle_analysis_config.h#L348-L355

  • It is used in paddle/fluid/inference/tests/api/analyzer_detect_tester.cc, which also uses dynamic shapes.
  • This interface only has a C++/C API; you may wrap a Python API for it.

Tried setting MKLDNN cache capacity to 10/20/100/1024, but still got the memory leak :(

cryoco commented 4 years ago

Added a Python API, set_mkldnn_cache_capacity, in PR #25524, which might be useful for debugging and solving this issue.

wojtuss commented 4 years ago

@cryoco I got the same error as in https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658228096. Is another model required here as well?

luotao1 commented 4 years ago

This issue uses the same model as #25507, and https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658636558 provides the updated model link.

wojtuss commented 4 years ago

@luotao1, @cryoco I took the model from https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658636558 and got the following error with the script test_ocr_mkldnn_mem.py (with or without the --mkldnn option; this error does not occur when running the script test_ocr_mkldnn_diff.py from issue https://github.com/PaddlePaddle/Paddle/issues/25507 on the same model):

$ python test_ocr_mkldnn_mem.py --model_file=ch_det_mv3_db/model --params_file=ch_det_mv3_db/params
...
I0715 09:04:47.246189 30616 naive_executor.cc:95] ---  skip [feed], feed -> image
I0715 09:04:47.247287 30616 naive_executor.cc:95] ---  skip [concat_1.tmp_0], fetch -> fetch
Traceback (most recent call last):
  File "test_ocr_mkldnn_mem.py", line 88, in <module>
    result = run(pred)
  File "test_ocr_mkldnn_mem.py", line 42, in run
    predictor.zero_copy_run()
paddle.fluid.core_avx.EnforceNotMet:

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   paddle::operators::GetBroadcastDimsArrays(paddle::framework::DDim const&, paddle::framework::DDim const&, int*, int*, int*, int, int)
3   paddle::operators::ElementwiseOp::InferShape(paddle::framework::InferShapeContext*) const
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
6   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
7   paddle::framework::NaiveExecutor::Run()
8   paddle::AnalysisPredictor::ZeroCopyRun()

------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
  File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
    attrs=kwargs.get("attrs", None))
  File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 10348, in _elementwise_op
    'use_mkldnn': use_mkldnn})
  File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 10521, in elementwise_add
    return _elementwise_op(LayerHelper('elementwise_add', **locals()))
  File "/paddle/PaddleOCR/github/PaddleOCR/ppocr/modeling/heads/det_db_head.py", line 159, in __call__
    input=out4, scale=2), y=in3)  # 1/8
  File "/paddle/PaddleOCR/github/PaddleOCR/ppocr/modeling/architectures/det_model.py", line 112, in __call__
    predicts = self.head(conv_feas)
  File "/paddle/PaddleOCR/github/PaddleOCR/tools/program.py", line 193, in build_export
    image, outputs = model(mode='export')
  File "tools/export_model.py", line 67, in main
    config, eval_program, startup_prog)
  File "tools/export_model.py", line 93, in <module>
    main()

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 96, 4, 56] and the shape of Y = [1, 96, 4, 55]. Received [56] in X is not equal to [55] in Y at i:3.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] at (/paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:157)
  [operator < elementwise_add > error]

Also, there is no --mode option available for the test_ocr_mkldnn_mem.py script.
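The condition quoted in the error hint can be restated in a few lines of Python to see why these two shapes are rejected. This is only a sketch of the per-axis rule as printed in the message, not Paddle's implementation; `first_broadcast_mismatch` and `stride2_then_upsample` are hypothetical names. The second function illustrates one possible cause, under the assumption (not confirmed in this thread) that a stride-2 convolution with kernel 3 and padding 1 precedes the scale-2 upsampling seen in the Python call stack.

```python
def first_broadcast_mismatch(x_dims, y_dims):
    """Return the first axis that violates the hint's rule, or None.

    An axis is compatible when the sizes are equal or either size is <= 1.
    Both shapes are assumed to have the same rank already.
    """
    if len(x_dims) != len(y_dims):
        raise ValueError("shapes must have the same rank")
    for i, (x, y) in enumerate(zip(x_dims, y_dims)):
        if not (x == y or x <= 1 or y <= 1):
            return i
    return None


def stride2_then_upsample(width, kernel=3, padding=1):
    """Width after a stride-2 conv followed by scale-2 upsampling.

    Assumed layer parameters; for odd widths the result is width + 1.
    """
    halved = (width + 2 * padding - kernel) // 2 + 1
    return halved * 2


print(first_broadcast_mismatch([1, 96, 4, 56], [1, 96, 4, 55]))  # 3
print(stride2_then_upsample(55))  # 56
```

Under that assumption, a feature map of width 55 comes back as width 56 after the downsample and upsample round trip, which would produce exactly the mismatch at axis 3 reported above.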

cryoco commented 4 years ago

@wojtuss My apologies. Here are the detection model, the recognition model, and the code: test_ocr_mkldnn_mem.txt

wojtuss commented 4 years ago

@cryoco Thank you! I reproduced the result and can confirm that the MKLDNN cache grows during test execution. The model inputs have variable sizes, so MKLDNN operators for each new input size keep being added to the cache.

The cache grows more slowly over time; however, limiting the cache size could also help. Can you confirm that PR https://github.com/PaddlePaddle/Paddle/pull/25524 helps here?
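The growth pattern described here can be illustrated with a toy shape-keyed cache. This is a simplified model, not Paddle's MKLDNN cache: the class name `ShapeKeyedCache`, the use of `None` for "no limit", and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class ShapeKeyedCache:
    """Toy cache that stores one entry per distinct input shape."""

    def __init__(self, capacity=None):
        self.capacity = capacity  # None means unbounded (toy convention)
        self._entries = OrderedDict()

    def get_or_create(self, shape):
        key = tuple(shape)
        if key in self._entries:
            # Seen before: reuse the entry and mark it most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        entry = {"shape": key}  # stand-in for a cached operator primitive
        self._entries[key] = entry
        if self.capacity is not None and len(self._entries) > self.capacity:
            # Evict the least recently used shape (assumed policy).
            self._entries.popitem(last=False)
        return entry

    def __len__(self):
        return len(self._entries)


def simulate(capacity, widths):
    """Feed inputs of varying width and return the final cache size."""
    cache = ShapeKeyedCache(capacity)
    for w in widths:
        cache.get_or_create((1, 3, 32, w))
    return len(cache)
```

In this model, `simulate(None, range(100, 200))` ends with 100 entries because every new width adds one, while `simulate(10, range(100, 200))` never holds more than 10. Repeated widths add nothing, which is consistent with growth slowing down once most shapes have been seen.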

luotao1 commented 4 years ago

I know the reason:

@cryoco