PaddlePaddle / Paddle

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)
http://www.paddlepaddle.org/
Apache License 2.0

large memory used when infer #11185

Closed tensor-tang closed 6 years ago

tensor-tang commented 6 years ago

This is an issue of NLP online service.

When running inference, the memory usage stays at about 6 GB, which is definitely larger than actually needed.


ChinaLiuHao commented 6 years ago

I am hitting this too. In addition, when I run inference with multiple threads and "export OPENBLAS_NUM_THREADS=1", the program may end with an "Aborted" error!

tensor-tang commented 6 years ago

@ChinaLiuHao One more note: the "Aborted" error occurs randomly, not on every run.

luotao1 commented 6 years ago

The OCR CRNN_CTC service also shows unusually large memory usage.

tensor-tang commented 6 years ago

https://github.com/PaddlePaddle/Paddle/blob/666c94e3be10c2290eb143fdff208684e9ee34fe/paddle/fluid/memory/detail/buddy_allocator.cc#L188-L192

This should be the cause: Paddle allocates the maximum chunk size on the very first allocation.

tensor-tang commented 6 years ago

After debugging, we found there is a flag that controls how much memory is reserved up front. By default it uses about 3.2% (1/32) of your total memory.

usage:

your_app --fraction_of_cpu_memory_to_use=0.1 # reserves 3.2% * 0.1 of total memory

The call trace is as follows:

https://github.com/PaddlePaddle/Paddle/blob/666c94e3be10c2290eb143fdff208684e9ee34fe/paddle/fluid/platform/cpu_info.cc#L26-L28

https://github.com/PaddlePaddle/Paddle/blob/666c94e3be10c2290eb143fdff208684e9ee34fe/paddle/fluid/platform/cpu_info.cc#L54-L58

https://github.com/PaddlePaddle/Paddle/blob/666c94e3be10c2290eb143fdff208684e9ee34fe/paddle/fluid/platform/cpu_info.cc#L65-L69

https://github.com/PaddlePaddle/Paddle/blob/666c94e3be10c2290eb143fdff208684e9ee34fe/paddle/fluid/memory/malloc.cc#L32-L36

tensor-tang commented 6 years ago

@ChinaLiuHao As for the "Aborted" error, let's open a separate issue to discuss it. Thanks.