serizba / cppflow

Run TensorFlow models in C++ without installation and without Bazel
https://serizba.github.io/cppflow/
MIT License
787 stars, 179 forks

Tensor memory release #154

Closed GlockPL closed 2 years ago

GlockPL commented 3 years ago

Hi, I'm writing a DLL that is loaded into a piece of software that creates drag-and-drop blocks, which can then be used as additional functionality in that software. I wrote a block that loads a model and a block that performs prediction. Everything works fine up to a point: when I try to close the software, it crashes, which tells me that memory is not being released properly. The problem already appears after I load a vector into a tensor and then perform a transpose. Is there a way to force the tensor to reset or free its memory?

ljn917 commented 3 years ago

Is it something related to #131 ?

GlockPL commented 3 years ago

OK, so I tried wrapping the tensor in a std::unique_ptr, but the moment I want to perform some action on it I get this compile error: Error C2280 'std::unique_ptr<cppflow::tensor,std::default_delete<cppflow::tensor>>::unique_ptr(const std::unique_ptr<cppflow::tensor,std::default_delete<cppflow::tensor>> &)': attempting to reference a deleted function (compiling source file TFHSISegment.cpp) CppFlow D:\libs\cppflow\include\cppflow\tensor.h 149

I also tried using cppflow::tensor* input = new cppflow::tensor; and then performing operations like *input = cppflow::transpose(*input, { 1,2,0 }); ending with delete input; but there is still some memory that is not released.
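For reference, C2280 on the std::unique_ptr copy constructor usually means the pointer itself is being copied somewhere, for example by passing it by value. Below is a minimal sketch of the non-copying patterns. FakeTensor and reversed are hypothetical stand-ins, not cppflow types, so the snippet compiles without TensorFlow. Whether this matches the exact call site in the DLL is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for cppflow::tensor (a copyable value type).
struct FakeTensor {
    std::vector<int> data;
};

// Hypothetical stand-in for an op such as cppflow::transpose:
// takes its input by const reference and returns a new value.
FakeTensor reversed(const FakeTensor& in) {
    return FakeTensor{std::vector<int>(in.data.rbegin(), in.data.rend())};
}

// Takes the owning pointer by reference: no unique_ptr copy is made.
void apply_in_place(std::unique_ptr<FakeTensor>& t) {
    assert(t != nullptr);
    *t = reversed(*t);  // dereference and operate on the pointee
}

// Takes ownership: the caller has to std::move the pointer in.
std::unique_ptr<FakeTensor> consume(std::unique_ptr<FakeTensor> t) {
    assert(t != nullptr);
    *t = reversed(*t);
    return t;
}

// Round trip through both helpers.
std::vector<int> demo() {
    auto t = std::make_unique<FakeTensor>(FakeTensor{{1, 2, 3}});
    apply_in_place(t);          // data is now {3, 2, 1}
    t = consume(std::move(t));  // data is {1, 2, 3} again
    // auto copy = t;           // would not compile: the copy constructor is deleted
    return t->data;
}
```

Since the ops take the tensor itself rather than a pointer to it, dereferencing at the call site (as in apply_in_place) is probably the smallest change.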

ljn917 commented 3 years ago

I cannot give further suggestions without the actual code. Please refer to the code in the examples directory.

GlockPL commented 3 years ago

OK, so here is my code, part of the DLL that is loaded into the software. It works fine; it just blocks the software from shutting down, as if something is stuck.

    atl::Array<avl::Image> inImageArray;
    atl::Array<avl::Pixel> inColors;
    ReadInput(L"inImageArray", inImageArray);
    ReadInput(L"inColors", inColors);

    int width = inImageArray[0].Width();
    int height = inImageArray[0].Height();
    int pitch = inImageArray[0].Pitch();        
    int channels = inImageArray.Size();
    int bitwidth = inImageArray[0].PixelSize();
    int no_pixels = width * height * channels;
    int temp_shift = static_cast<int>(pitch / bitwidth);

    std::vector<uint16_t> v;
    std::vector<int64_t> shape{ channels , height, width };
    //This copies image to 1d vector
    for (int c = 0; c < channels; c++) {
        const uint16_t* tmp_data = inImageArray[c].Ptr<uint16_t>(avl::Location(0, 0));
        for (int y = 0; y < height; y++) {
            std::copy_n(tmp_data, width, std::back_inserter(v));                
            tmp_data += temp_shift;
        }
    }

    //I'm creating input tensor with the vector and the shape of the data
    cppflow::tensor input = cppflow::tensor(v, shape);
    //I have to perform few operations in order to get proper input data for model
    input = cppflow::transpose(input, { 1,2,0 });
    input = cppflow::expand_dims(input, 0);
    input = cppflow::expand_dims(input, 4);
    input = cppflow::cast(input, TF_UINT16, TF_FLOAT);

    //Loading singleton model
    cppflow::model model = CppFlow::ModelState::getInstance().getModel();

    cppflow::tensor output = model({ {"serving_default_hsi:0", input} }, { "StatefulPartitionedCall:0" })[0];        
    output = cppflow::arg_max(output, 3);
    output = cppflow::squeeze(output, {0});

    outImage = avl::Image(width, height, avl::PlainType::UInt8, 3, atl::NIL);

    auto values = output.get_data<int64_t>();
    //This creates color image from data from the network
    for (int w = 0; w < width; ++w) {
        for (int h = 0; h < height; ++h) {
            int data = values[width * h + w];
            uint8_t* pixel = outImage.Ptr<uint8_t>(w, h);
            *pixel = (uint8_t)inColors[data].X();
            *(pixel + 1) = (uint8_t)inColors[data].Y();
            *(pixel + 2) = (uint8_t)inColors[data].Z();
        }
    }       

    v.clear();
    WriteOutput(L"outImage", outImage);
    return avs::INVOKE_NORMAL;

ljn917 commented 3 years ago

As #131 mentioned, loading models in a singleton in a dynamic library causes initialization order issues. The problem is that the model destructor uses the global status (context::get_status()), and the status is released before the model, leading to a null pointer dereference.

At the moment, there are two possible solutions (untested): 1) call get_global_context() and get_status() before loading any model, or 2) change the model class to allocate a per-model status in its constructor.
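A minimal sketch of why option 1 might help, assuming the status and the model both end up as objects that are torn down in reverse order of construction (the rule C++ applies to static-storage objects at exit and to locals at scope exit). The sketch uses block-scoped objects so the order can be checked without exiting the process. FakeStatus and FakeModel are hypothetical stand-ins, not cppflow classes, and teardown across DLL boundaries may behave differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared status object.
struct FakeStatus {
    explicit FakeStatus(std::vector<std::string>* log) : log_(log) {}
    ~FakeStatus() {
        alive = false;
        log_->push_back("status destroyed");
    }
    bool alive = true;
private:
    std::vector<std::string>* log_;
};

// Hypothetical stand-in for a model whose destructor needs the status.
struct FakeModel {
    FakeModel(FakeStatus* status, std::vector<std::string>* log)
        : status_(status), log_(log) {}
    ~FakeModel() {
        // Safe only because the status was constructed before the model.
        assert(status_->alive);
        log_->push_back("model destroyed");
    }
private:
    FakeStatus* status_;
    std::vector<std::string>* log_;
};

// Builds the status before the model and reports the teardown order.
std::vector<std::string> teardown_order() {
    std::vector<std::string> log;
    {
        FakeStatus status(&log);         // constructed first, destroyed last
        FakeModel model(&status, &log);  // constructed last, destroyed first
    }
    return log;
}
```

Touching the status before the first model is created is what puts it first in the construction order, and therefore last in the destruction order.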

ljn917 commented 3 years ago

#132 was an attempt to fix this, but memory allocation in a destructor is not good practice in general.
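Option 2 above could look roughly like the sketch below, with the allocation done in the constructor so the destructor only releases what the model already owns. All names here are hypothetical stand-ins. new_status and delete_status mimic a C-style create/destroy pair such as TF_NewStatus / TF_DeleteStatus from the TensorFlow C API, and this is not how cppflow::model is actually written.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for a C-style status API.
struct FakeStatus { int code = 0; };
int live_statuses = 0;  // number of statuses currently allocated
FakeStatus* new_status() { ++live_statuses; return new FakeStatus(); }
void delete_status(FakeStatus* s) { --live_statuses; delete s; }

// Hypothetical model that owns its own status.
class FakeModel {
public:
    // The status is allocated once, when the model is constructed.
    FakeModel() : status_(new_status(), &delete_status) {}

    int status_code() const { return status_->code; }

    // No user-written destructor: the implicit one just runs the
    // unique_ptr deleter, so teardown allocates nothing and does not
    // depend on any global object still being alive.
private:
    std::unique_ptr<FakeStatus, decltype(&delete_status)> status_;
};

// Returns how many statuses are live while one model exists.
int live_while_model_exists() {
    FakeModel m;
    assert(m.status_code() == 0);
    return live_statuses;
}
```

The trade-off is one extra status object per model, in exchange for a destructor that has no ordering dependency on global state.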

GlockPL commented 3 years ago

I don't think this is a model problem, yet. If you reduce the code just to this:

    atl::Array<avl::Image> inImageArray;
    atl::Array<avl::Pixel> inColors;
    ReadInput(L"inImageArray", inImageArray);
    ReadInput(L"inColors", inColors);

    int width = inImageArray[0].Width();
    int height = inImageArray[0].Height();
    int pitch = inImageArray[0].Pitch();
    int channels = inImageArray.Size();
    int bitwidth = inImageArray[0].PixelSize();
    int no_pixels = width * height * channels;
    int temp_shift = static_cast<int>(pitch / bitwidth);

    std::vector<uint16_t> v;
    std::vector<int64_t> shape{ channels , height, width };
    //This copies image to 1d vector
    for (int c = 0; c < channels; c++) {
        const uint16_t* tmp_data = inImageArray[c].Ptr<uint16_t>(avl::Location(0, 0));
        for (int y = 0; y < height; y++) {
            std::copy_n(tmp_data, width, std::back_inserter(v));
            tmp_data += temp_shift;
        }
    }

    //I'm creating input tensor with the vector and the shape of the data
    cppflow::tensor input = cppflow::tensor(v, shape);
    //I have to perform few operations in order to get proper input data for model
    input = cppflow::transpose(input, { 1,2,0 });

    WriteOutput(L"outImage", outImage);
    return avs::INVOKE_NORMAL;

I already get the software freezing, so there may be a similar problem somewhere in tensor.

ljn917 commented 3 years ago

I did not see any issue in the code you posted. I suggest isolating a minimal runnable example that reproduces the problem. As mentioned before, I also recommend testing the examples in this repo.

GlockPL commented 3 years ago

I tested the load model example, and it also freezes when the software exits.

ljn917 commented 3 years ago

I cannot reproduce the issue. Are you running the example produced by https://github.com/serizba/cppflow/blob/master/examples/load_model/main.cpp?

GlockPL commented 3 years ago

I copied the example into the minimal code needed to run it as a DLL inside Adaptive Vision Software.

ljn917 commented 3 years ago

I think you have the exact problem reported in #131. Try #132.

GlockPL commented 3 years ago

I don't think this is the issue, because the software already freezes after the first operation on a tensor. Besides that, I changed the session deleter according to #131 and it still doesn't work.

ljn917 commented 3 years ago

I am afraid I cannot provide more help unless I see the full code. You could also attach a debugger and see where it freezes.

serizba commented 2 years ago

@GlockPL

Can you check whether the latest changes fixed your error?

serizba commented 2 years ago

The error should be fixed by #201.

Closing due to inactivity.