cryoco closed this issue 4 years ago
Please use the SetMkldnnCacheCapacity interface:
https://github.com/PaddlePaddle/Paddle/blob/126d3d693b0b5ebdbef5c6d315bad86b701ebcea/paddle/fluid/inference/api/paddle_analysis_config.h#L348-L355
- It is used in paddle/fluid/inference/tests/api/analyzer_detect_tester.cc, which handles dynamic shapes as well.
- This interface only has a C++/C API; you may wrap a Python API for it.
Tried setting the MKLDNN cache capacity to 10/20/100/1024, but still got the memory leak :(
Added the Python API set_mkldnn_cache_capacity in PR #25524, which might be useful in debugging and solving this issue.
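A minimal sketch of how that Python binding might be used, assuming the set_mkldnn_cache_capacity method from PR #25524 is exposed on AnalysisConfig; the module path and model file names are illustrative and may differ between Paddle versions:

```python
# Sketch only: assumes the set_mkldnn_cache_capacity binding from PR #25524.
from paddle.fluid.core import AnalysisConfig, create_paddle_predictor

config = AnalysisConfig("ch_det_mv3_db/model", "ch_det_mv3_db/params")
config.enable_mkldnn()
# Keep at most 10 shape-specific sets of MKL-DNN primitives in the cache.
config.set_mkldnn_cache_capacity(10)
config.switch_use_feed_fetch_ops(False)  # needed for zero-copy run

predictor = create_paddle_predictor(config)
```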
@cryoco I got the same error as in https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658228096. Is another model required here as well?
This issue uses the same model as #25507, and https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658636558 updates the model link.
@luotao1 , @cryoco
I took the model from https://github.com/PaddlePaddle/Paddle/issues/25507#issuecomment-658636558 and got the following error with the script test_ocr_mkldnn_mem.py (with or without the --mkldnn option; this error does not occur when running the script test_ocr_mkldnn_diff.py from issue https://github.com/PaddlePaddle/Paddle/issues/25507 on the same model):
$ python test_ocr_mkldnn_mem.py --model_file=ch_det_mv3_db/model --params_file=ch_det_mv3_db/params
...
I0715 09:04:47.246189 30616 naive_executor.cc:95] --- skip [feed], feed -> image
I0715 09:04:47.247287 30616 naive_executor.cc:95] --- skip [concat_1.tmp_0], fetch -> fetch
Traceback (most recent call last):
File "test_ocr_mkldnn_mem.py", line 88, in <module>
result = run(pred)
File "test_ocr_mkldnn_mem.py", line 42, in run
predictor.zero_copy_run()
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::GetBroadcastDimsArrays(paddle::framework::DDim const&, paddle::framework::DDim const&, int*, int*, int*, int, int)
3 paddle::operators::ElementwiseOp::InferShape(paddle::framework::InferShapeContext*) const
4 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
6 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
7 paddle::framework::NaiveExecutor::Run()
8 paddle::AnalysisPredictor::ZeroCopyRun()
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 10348, in _elementwise_op
'use_mkldnn': use_mkldnn})
File "/root/miniconda3/envs/ocrpy36/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 10521, in elementwise_add
return _elementwise_op(LayerHelper('elementwise_add', **locals()))
File "/paddle/PaddleOCR/github/PaddleOCR/ppocr/modeling/heads/det_db_head.py", line 159, in __call__
input=out4, scale=2), y=in3) # 1/8
File "/paddle/PaddleOCR/github/PaddleOCR/ppocr/modeling/architectures/det_model.py", line 112, in __call__
predicts = self.head(conv_feas)
File "/paddle/PaddleOCR/github/PaddleOCR/tools/program.py", line 193, in build_export
image, outputs = model(mode='export')
File "tools/export_model.py", line 67, in main
config, eval_program, startup_prog)
File "tools/export_model.py", line 93, in <module>
main()
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 96, 4, 56] and the shape of Y = [1, 96, 4, 55]. Received [56] in X is not equal to [55] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] at (/paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:157)
[operator < elementwise_add > error]
Also, there is no --mode option available for the test_ocr_mkldnn_mem.py script.
@wojtuss My apologies. Here are the detection model, the recognition model, and the code: test_ocr_mkldnn_mem.txt
@cryoco Thank you! I reproduced the result and can confirm that the MKLDNN cache grows during test execution. That is because the models' inputs have variable sizes, so MKLDNN operators for different input sizes are added to the cache over and over.
The cache grows less and less over time; however, setting a limit on the cache size could also be helpful. Can you confirm that the PR https://github.com/PaddlePaddle/Paddle/pull/25524 helps here?
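The growth pattern described above can be illustrated with a toy shape-keyed cache: without a capacity limit, every new input shape adds an entry that is never freed, while a cap (analogous to SetMkldnnCacheCapacity) keeps the entry count bounded. This is a conceptual sketch, not Paddle's actual cache implementation:

```python
from collections import OrderedDict

class ShapeKeyedCache:
    """Toy model of a primitive cache keyed by input shape (FIFO eviction)."""
    def __init__(self, capacity=None):
        self.capacity = capacity          # None means unbounded
        self.entries = OrderedDict()

    def get_or_create(self, shape):
        if shape not in self.entries:
            if self.capacity is not None and len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)   # evict the oldest shape
            self.entries[shape] = object()         # stand-in for cached primitives
        return self.entries[shape]

# Variable-width inputs, as in the OCR detection model:
shapes = [(1, 3, 640, w) for w in range(600, 700)]

unbounded = ShapeKeyedCache()
bounded = ShapeKeyedCache(capacity=10)
for s in shapes:
    unbounded.get_or_create(s)
    bounded.get_or_create(s)

print(len(unbounded.entries))  # 100 -- one entry per distinct shape, ever growing
print(len(bounded.entries))    # 10  -- bounded by the capacity limit
```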
I know the reason: AnalysisPredictor::MkldnnPreSet() is used only in AnalysisPredictor::Run(inputs, outputs, batch_size), since it needs to know the input's size beforehand. So the cache-size limit takes effect for predictor.run(inputs, outputs, batch_size), but not for predictor.zero_copy_run().
@cryoco Does predictor.zero_copy_run() support limits to the cache size as well, like predictor.run(inputs, outputs, batch_size) does? In this scenario, maybe we need to wrap some new interface.
System information
- Paddle version: 1.8.2
- Paddle With CUDA: False
- OS: Ubuntu 16.04
- CPU: 16x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
- Python version: 3.5.2
- CUDA version: 9.0.176
- cuDNN version: None.None.None
- Nvidia driver version: None
- API information: inference configuration
To Reproduce
The inputs were saved with np.save and can be loaded with np.load. The increasing memory usage can be observed with the top command, while it is expected to remain stable.
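The reproduction loop implied above can be sketched as follows, tracking resident memory via the stdlib resource module instead of eyeballing top; run_one_input is a hypothetical stand-in for the actual predictor call, and the synthetic arrays stand in for inputs that would be reloaded with np.load:

```python
import resource
import numpy as np

def run_one_input(arr):
    # Placeholder for the real predictor.zero_copy_run() on `arr`.
    return arr.sum()

def rss_kb():
    # Peak resident set size so far (kB on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

# Inputs saved earlier with np.save(...) would be reloaded with np.load(...);
# here a synthetic variable-shape batch stands in for them.
inputs = [np.zeros((1, 3, 640, 600 + i), dtype=np.float32) for i in range(5)]

for i, arr in enumerate(inputs):
    run_one_input(arr)
    print("iter %d: max RSS %d kB" % (i, rss_kb()))
```

A steadily growing RSS across iterations would confirm the leak; a stable value indicates the cache limit is working.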
The input shapes we use in OCR are dynamic, which might be relevant to this issue.