osai-ai / tensor-stream

A library for real-time video stream decoding to CUDA memory
GNU Lesser General Public License v2.1

Infinite increase of memory during decoding hls-stream (memleak?) #25

Open Yusupov28 opened 2 years ago

Yusupov28 commented 2 years ago

Hello. I tried to decode an HLS stream and saw that memory usage keeps growing without bound, in steps of 264 kB.

Could you explain why it happens?

To reproduce:

I ran Sample.cpp (modified to loop forever) and added the following line for HLS support after https://github.com/osai-ai/tensor-stream/blob/master/src/Parser.cpp#L309:

av_dict_set(&opts, "protocol_whitelist", "file,udp,rtp,tcp,http,https,tls,crypto", 0);

I also used a Python script to keep appending new .ts segments to the playlist:

import time

if __name__ == "__main__":
    while True:
        # TODO: change path
        playlist = "your_path/test_playlist.m3u8"

        # Read the current playlist contents.
        with open(playlist, 'r') as hls_file:
            data = hls_file.readlines()

        # Append one more 4-second segment entry.
        data.append("#EXTINF:4,\n")
        data.append("https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_9.ts\n")

        # Write the extended playlist back to disk.
        with open(playlist, 'w') as hls_file:
            hls_file.writelines(data)

        del data

        time.sleep(0.1)

Initial test_playlist.m3u8:

#EXTM3U

#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-TARGETDURATION:4

#EXTINF:4.0
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_0.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_1.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_2.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_3.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_4.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_5.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_6.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_7.ts
#EXTINF:4.
https://bitdash-a.akamaihd.net/content/MI201109210084_1/video/1080_4800000/hls/segment_8.ts

I also tried not calling getFrame from WrapperPython.cpp (I ran tensor-stream from Python): with only processingLoop running, the memory still increased.

BykadorovR commented 2 years ago

Hi @Yusupov28, can you reproduce the memory leak with RTMP or a local video? BTW, you don't need to change anything for HLS support.

Yusupov28 commented 2 years ago

I checked with RTSP. It seems to leak as well. At startup the memory was 162160 kB. After ~27000 frames it was 162424 kB, and after ~50000 frames it was 162688 kB.

With a local video the memory increased by 264 kB only at the beginning; after that there was no leak. The video was 32000 frames long.
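
Readings like the ones above could be collected along these lines. This is only a minimal sketch, assuming Linux, where /proc/<pid>/status exposes a VmRSS line in kB; the helper names (parse_vm_rss_kb, read_rss_kb, growth_per_frame_kb) are hypothetical and not part of tensor-stream.

```python
import re
from typing import List, Optional, Tuple


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def growth_per_frame_kb(samples: List[Tuple[int, int]]) -> float:
    """Average RSS growth in kB per frame between the first and last sample.

    Each sample is a (frame_count, rss_kb) pair.
    """
    (first_frame, first_rss), (last_frame, last_rss) = samples[0], samples[-1]
    return (last_rss - first_rss) / (last_frame - first_frame)


if __name__ == "__main__":
    # The three RTSP readings reported above; frame counts are approximate.
    rtsp_samples = [(0, 162160), (27000, 162424), (50000, 162688)]
    print(growth_per_frame_kb(rtsp_samples))
```

Both RTSP steps are 264 kB (162424 - 162160 and 162688 - 162424), the same increment seen once at startup with the local video.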

BykadorovR commented 2 years ago

Is that a peak value, with memory going up and down constantly, or do you see that memory only grows? I mean it could be OS memory management or FFmpeg's internal buffer management rather than an actual memory leak.
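
One way to tell the two cases apart is to log memory periodically and check whether the series ever drops. A minimal sketch, assuming a list of RSS samples in kB taken at regular intervals; summarize_rss is a hypothetical helper, not a tensor-stream or FFmpeg API.

```python
from typing import Dict, List, Union


def summarize_rss(samples_kb: List[int]) -> Dict[str, Union[int, bool]]:
    """Summarize a series of RSS samples (kB) taken at regular intervals."""
    # Count how many times a sample is lower than the one before it.
    drops = sum(
        1 for prev, cur in zip(samples_kb, samples_kb[1:]) if cur < prev
    )
    return {
        "net_growth_kb": samples_kb[-1] - samples_kb[0],
        "peak_kb": max(samples_kb),
        "drops": drops,
        # True when memory never decreased between consecutive samples.
        "only_grows": drops == 0,
    }


if __name__ == "__main__":
    print(summarize_rss([162160, 162424, 162688]))
```

A series with no drops over a long run points toward a leak or an unbounded buffer; a series that rises and falls around a stable level is more consistent with allocator or buffer-pool behaviour.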

Yusupov28 commented 2 years ago

In my case it only grows. For example, I decoded an HLS stream for more than 24 hours and the memory only grew.

BykadorovR commented 2 years ago

Which branch do you use? If master, could you please try the dev one?

Yusupov28 commented 2 years ago

I tried the dev branch. Same situation.

BykadorovR commented 2 years ago

In that case, please provide some additional information that can help investigate the issue: OS, HW (CPU/GPU), versions of the libraries used (PyTorch/FFmpeg/any others), and the HLS stream (if possible).

Yusupov28 commented 2 years ago

Ubuntu 16.04, Intel Core i5-4440 / NVIDIA GeForce GTX 1060 6GB, PyTorch 1.9.1, FFmpeg 4.2, CUDA 10.1

Yusupov28 commented 2 years ago

I also tried restarting the reader several times, and the memory (RAM and VRAM) still grows. I changed Sample.cpp a bit:

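#include "WrapperC.h"
#include <experimental/filesystem> //C++17
#include <cstdlib> // std::atoi
#include <map>
#include <string>
#include <thread>
#include <tuple>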

void get_cycle(FrameParameters frameParameters, std::map<std::string, std::string> executionParameters, TensorStream *reader) {
    try {
        int frames = std::atoi(executionParameters["frames"].c_str());
        if (!frames)
            return;

        int count = 0;

        for (;;) {
            if (frameParameters.color.normalization) {
                auto result = reader->getFrame<float>(executionParameters["name"], std::atoi(executionParameters["delay"].c_str()), frameParameters);
                cudaFree(std::get<0>(result));
            }
            else {
                auto result = reader->getFrame<unsigned char>(executionParameters["name"], std::atoi(executionParameters["delay"].c_str()), frameParameters);
                cudaFree(std::get<0>(result));
            }

            count++;
            if(count==100){
                return;
            }
        }
    }
    catch (...) {
        return;
    }
}

int main()
{
    for (;;) {
        TensorStream* reader = new TensorStream();
        reader->enableLogs(-MEDIUM);
        reader->enableNVTX();
        int sts = VREADER_OK;
        int initNumber = 10;

        while (initNumber--) {
            sts = reader->initPipeline("rtmp://37.228.119.44:1935/vod/big_buck_bunny.mp4", 5, 0, 5);
            if (sts != VREADER_OK)
                reader->endProcessing();
            else
                break;
        }

        reader->skipAnalyzeStage();
        CHECK_STATUS(sts);
        std::thread pipeline([reader] { reader->startProcessing(); });
        int dstWidth = 720;
        int dstHeight = 480;
        std::tuple<int, int> cropTopLeft  { 0, 0 };
        std::tuple<int, int> cropBotRight  { 0, 0 };
        ColorOptions colorOptions = { FourCC::NV12 };
        colorOptions.planesPos = Planes::PLANAR;
        colorOptions.normalization = false;
        ResizeOptions resizeOptions = { dstWidth, dstHeight };
        CropOptions cropOptions = { cropTopLeft, cropBotRight };
        FrameParameters frameParameters = {resizeOptions, colorOptions, cropOptions};

        std::map<std::string, std::string> executionParameters = { {"name", "first"}, {"delay", "0"}, {"frames", "50"}, 
                                                                {"dumpName", std::to_string(std::get<0>(cropBotRight) - std::get<0>(cropTopLeft)) + "x" + std::to_string(std::get<1>(cropBotRight) - std::get<1>(cropTopLeft)) + ".yuv"} };
        std::thread get(get_cycle, frameParameters, executionParameters, reader);
        get.join();
        reader->endProcessing();
        pipeline.join();

        delete reader;
    }
    return 0;
}