openvinotoolkit / openvino

[Bug] Model compilation takes too much memory in cpp #13601

Closed yszhou2019 closed 1 year ago

yszhou2019 commented 1 year ago
System information
Detailed description

I usually use the procedure below to load an ONNX model and run inference on a fixed-size tensor. Today, while running a benchmark (Google Benchmark) to measure our model's performance, I found that the compile_model call with shape {1, 500, 80} uses so much memory that the process runs out of memory (OOM) and the whole program is killed.

// load and compile the model
    core_.set_property("AUTO", ov::log::level(ov::log::Level::WARNING));
    model_ = core_.read_model(model_path);
    // fix the input to a static shape: {1, mel length, 80}
    ov::Shape static_shape = {1, static_cast<unsigned long>(mellen), 80};
    model_->reshape(static_shape);
    compiledModel_ = core_.compile_model(model_); // leads to OOM
    inferRequest_ = compiledModel_.create_infer_request();
// infer
    // wrap the existing input buffer in a tensor of the same static shape
    ov::Tensor input_tensor(ov::element::f32, static_shape, input.data());
    inferRequest_.set_input_tensor(input_tensor);
    inferRequest_.infer();
    auto wav_out = inferRequest_.get_output_tensor(0);

Actually, I don't know how to deal with this problem. Maybe I shouldn't load the ONNX model and compile it to IR in C++?

The mel length usually varies from 100 to 1300. OOM happens whenever the mel length is >= 500.
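To pin down how memory use at compile time depends on the mel length, one option is to record the process's peak resident set size around the compile step. The following is a minimal sketch, assuming a POSIX system; `peak_rss` and `peak_rss_growth` are hypothetical helper names, not part of OpenVINO.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)

#include <functional>

// Peak resident set size of this process so far, or -1 on failure.
// Assumption: Linux reports ru_maxrss in kilobytes (macOS reports bytes).
long peak_rss() {
    struct rusage usage{};
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_maxrss;
}

// Run `step` and return how much the process-wide peak grew while it ran.
// A result of 0 means the step stayed below an earlier high-water mark.
long peak_rss_growth(const std::function<void()>& step) {
    const long before = peak_rss();
    step();
    const long after = peak_rss();
    return (before < 0 || after < 0) ? -1 : after - before;
}
```

Wrapping the reshape and compile_model calls in a lambda passed to `peak_rss_growth`, once per mel length, would give a rough picture of where memory starts to blow up. Because the value is a high-water mark for the whole process, each mel length is best measured in a fresh process.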

peterchen-intel commented 1 year ago

@ilyachur A TBB fix was pushed to the tbb_2020 branch: https://github.com/oneapi-src/oneTBB/commits/tbb_2020 @yszhou2019 You can try building the TBB binaries from that branch and replacing the ones inside the OpenVINO folder.

yszhou2019 commented 1 year ago

Thank you! @peterchen-intel I will try it later.

yszhou2019 commented 1 year ago

@peterchen-intel I just built the tbb_2020 branch you mentioned, and it seems there is still a memory leak. [image]

yszhou2019 commented 1 year ago

I wonder how this memory problem can be fixed in a later version.

riverlijunjie commented 1 year ago

@yszhou2019 We did find a memory leak in TBB, but with a different call trace from the one you provided earlier. [image]

As I understand it, the code path in the call trace above is not executed repeatedly when inference is called in a loop, so it should not leak much memory. In any case, we will keep debugging this TBB memory leak to find a solution.

To work around the TBB memory leak, if possible you can rebuild OpenVINO with TBB disabled (-DTHREADING=SEQ) and run the same test sample to check whether memory consumption still keeps increasing.
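One way to check whether memory consumption keeps increasing is to sample the process's resident set size after each inference iteration. Below is a minimal sketch, assuming Linux (it reads /proc/self/status); the helper names `parse_vm_rss_kb`, `current_rss_kb` and `keeps_growing` are hypothetical and not part of OpenVINO or TBB.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Parse the "VmRSS:" field (resident set size, in kB) from text in the
// format of Linux's /proc/self/status. Returns -1 if the field is missing.
long parse_vm_rss_kb(std::istream& status) {
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmRSS:", 0) == 0) {  // line starts with "VmRSS:"
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Current resident set size of this process in kB, or -1 where
// /proc/self/status is unavailable (e.g. non-Linux systems).
long current_rss_kb() {
    std::ifstream status("/proc/self/status");
    return status ? parse_vm_rss_kb(status) : -1;
}

// True if every sample is larger than the previous one, i.e. memory grew
// on every iteration instead of levelling off. This is a crude criterion.
bool keeps_growing(const std::vector<long>& rss_samples_kb) {
    if (rss_samples_kb.size() < 2) return false;
    for (std::size_t i = 1; i < rss_samples_kb.size(); ++i) {
        if (rss_samples_kb[i] <= rss_samples_kb[i - 1]) return false;
    }
    return true;
}
```

In the test sample, `current_rss_kb()` could be called after each infer() call and the values collected in a vector. Comparing the samples from a TBB build with those from a -DTHREADING=SEQ build would show whether the growth follows the threading backend. A leak that grows only on some iterations will not satisfy `keeps_growing`, so the raw samples are still worth inspecting.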

yszhou2019 commented 1 year ago

@riverlijunjie Thank you for your reply! I will try it and reply later.

avitial commented 1 year ago

Closing this, I hope previous responses were sufficient to help you proceed. Feel free to reopen and ask any questions related to this topic.