carlren / gSLICr

gSLICr: Real-time super-pixel segmentation

GPU Memory Leak #17

Open meder411 opened 7 years ago

meder411 commented 7 years ago

There appears to be a GPU memory leak in the code you've provided. After processing about 2000 images, my 12GB Titan X ran out of memory. I was able to watch it fill up with sequential nvidia-smi calls too. It looks like the image memory is both duplicated and never destroyed on the GPU, but I'm not quite sure where in the code the cleanup ought to occur.

My C++ segmenting function is below. As it's the only GPU code, your seg_engine ought to be what's causing the leak, but I'm not sure why that's happening. As you can see at the end of my function, I am explicitly calling the destructor on the core_engine which destroys the seg_engine as well.


void ImageSLICSegmenter::segmentImage(const std::string &input_path, const std::string &output_path)
{
    cv::Mat old_frame = cv::imread(input_path);

    if (!old_frame.data){ EXCEPTION_THROWER(Util::Exception::IOException, "Error loading image") }

    // Set image size from loaded image (scaling appropriately)
    _settings.img_size.x = (int)(_scale * old_frame.cols);
    _settings.img_size.y = (int)(_scale * old_frame.rows);

    // Instantiate a core_engine
    gSLICr::engines::core_engine* gSLICr_engine = new gSLICr::engines::core_engine(_settings);

    // gSLICr takes gSLICr::UChar4Image as input and output
    gSLICr::UChar4Image* in_img = new gSLICr::UChar4Image(_settings.img_size, true, true);
    gSLICr::UChar4Image* out_img = new gSLICr::UChar4Image(_settings.img_size, true, true);

    cv::Size s(_settings.img_size.x, _settings.img_size.y);

    cv::Mat frame;
    cv::resize(old_frame, frame, s);

    load_image(frame, in_img);

    cv::Mat boundry_draw_frame;
    boundry_draw_frame.create(s, CV_8UC3);

    StopWatchInterface *my_timer;
    sdkCreateTimer(&my_timer);
    sdkResetTimer(&my_timer);
    sdkStartTimer(&my_timer);

    gSLICr_engine->Process_Frame(in_img);

    sdkStopTimer(&my_timer);
    std::cout<<"\rsegmentation in:["<<sdkGetTimerValue(&my_timer)<<"]ms" << std::flush;

    gSLICr_engine->Draw_Segmentation_Result(out_img);

    load_image(out_img, boundry_draw_frame);

    std::string fname = Util::Files::getFilenameFromPath(input_path);
    std::string full_out_path = Util::Files::joinPathAndFile(output_path, fname);
    std::string pgm_out_name = full_out_path + ".slic.pgm";
    std::string viz_out_name = full_out_path + ".viz.png";

    gSLICr_engine->Write_Seg_Res_To_PGM(pgm_out_name.c_str());
    bool success = cv::imwrite(viz_out_name, boundry_draw_frame);
    if (!success) { EXCEPTION_THROWER(Util::Exception::IOException, "Error writing image") }

    out_img->~Image();
    in_img->~Image();
    gSLICr_engine->~core_engine();
}

meder411 commented 7 years ago

Adding tmp_idx_img->Free() after line 87 in gSLICr_seg_engine_GPU.cu resolves the issue.

It's also worth noting that this only arises if you are allocating new input and output images on the GPU in successive calls. That is, these lines in my code:

// Instantiate a core_engine
gSLICr::engines::core_engine* gSLICr_engine = new gSLICr::engines::core_engine(_settings);

// gSLICr takes gSLICr::UChar4Image as input and output
gSLICr::UChar4Image* in_img = new gSLICr::UChar4Image(_settings.img_size, true, true);
gSLICr::UChar4Image* out_img = new gSLICr::UChar4Image(_settings.img_size, true, true);

For example, when I wrote a video SLIC-segmenting script (where the frames were all the same resolution), this was a non-issue as I only had to allocate once. However, if I recurse through a directory of images that may have different resolutions, I need to reallocate whenever the image size changes. In those circumstances, the memory leak appears.
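
The allocate-only-when-the-size-changes pattern described above can be sketched roughly as follows. This is a minimal illustration and not gSLICr code: `Size2`, `FakeEngine`, and `EngineCache` are hypothetical stand-ins so that the example compiles without CUDA. In real code the cached object would presumably be the `gSLICr::engines::core_engine` together with its `UChar4Image` buffers.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an image size (gSLICr has its own vector type).
struct Size2 {
    int x;
    int y;
    bool operator==(const Size2& o) const { return x == o.x && y == o.y; }
};

// Hypothetical stand-in for an engine that owns per-resolution buffers.
// It only counts live instances so the behaviour can be checked anywhere.
struct FakeEngine {
    static int live;
    Size2 size;
    explicit FakeEngine(Size2 s) : size(s) { ++live; }
    ~FakeEngine() { --live; }
    FakeEngine(const FakeEngine&) = delete;
    FakeEngine& operator=(const FakeEngine&) = delete;
};
int FakeEngine::live = 0;

// Keeps one engine alive and rebuilds it only when the resolution changes.
class EngineCache {
public:
    FakeEngine& get(Size2 s) {
        if (!engine_ || !(engine_->size == s)) {
            engine_.reset();  // destroy the old engine before building the new one
            engine_ = std::make_unique<FakeEngine>(s);
            ++rebuilds_;
        }
        return *engine_;
    }
    int rebuilds() const { return rebuilds_; }

private:
    std::unique_ptr<FakeEngine> engine_;
    int rebuilds_ = 0;
};

// Requests an engine for four "images" that use two distinct resolutions
// and returns how many times the engine had to be rebuilt.
int count_rebuilds_for_demo() {
    EngineCache cache;
    const Size2 sizes[] = {{640, 480}, {640, 480}, {1280, 720}, {1280, 720}};
    for (const Size2& s : sizes) {
        cache.get(s);
    }
    return cache.rebuilds();
}
```

Note that caching only reduces how often the engine is recreated. Each recreation would still leak unless the `tmp_idx_img` fix mentioned earlier in this thread is applied.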

aleozlx commented 4 years ago

On a related note, there is a memory leak when the seg_engine is being repeatedly created and destroyed. I was wrong about the location, will update if/when I find out.

--Edit I was gonna post a quick PR about tmp_idx_img, just so people are aware, only to find out there is one already. https://github.com/carlren/gSLICr/pull/18

These raw pointers need to be wrapped in smart pointers for more extensive usage.
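
A minimal sketch of what the smart-pointer suggestion could look like on the calling side. `Buffer` is a hypothetical stand-in for a heap-allocated object such as the engine or a `UChar4Image`; it is not a gSLICr type. Two points motivate it. First, an explicit destructor call on an object created with `new` (as in `gSLICr_engine->~core_engine()` in the original function) runs the destructor but does not release the object's own storage, whereas `delete` or `std::unique_ptr` does both. Second, if `EXCEPTION_THROWER` throws (which its name suggests, though the macro is not shown), the manual cleanup at the end of the function is skipped entirely.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for an object that owns a resource.
// The counter tracks how many instances are currently alive.
struct Buffer {
    static int live;
    Buffer() { ++live; }
    ~Buffer() { --live; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};
int Buffer::live = 0;

// Mimics the shape of the segmenting function: allocate, work, maybe throw.
void process(bool fail) {
    auto in_img = std::make_unique<Buffer>();
    auto out_img = std::make_unique<Buffer>();

    if (fail) {
        // Stack unwinding still destroys and frees in_img and out_img.
        throw std::runtime_error("Error writing image");
    }
    // No manual cleanup: both buffers are released when the function returns.
}

// Runs process() and reports how many Buffer objects are still alive.
int live_after_processing(bool fail) {
    try {
        process(fail);
    } catch (const std::runtime_error&) {
        // Swallowed here only so the demo can inspect the counter.
    }
    return Buffer::live;
}
```

With raw `new` and cleanup at the end of the function, the failing path would leave both objects alive. With `std::unique_ptr`, both paths end with zero live objects.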