rockchip-linux / mpp

Media Process Platform (MPP) module

Memory leak in mpp #542

Closed xb985547608 closed 3 months ago

xb985547608 commented 3 months ago

The platform is an RK3568. The program has a feature that transcodes H.265 to H.264 in real time. Early on the program runs stably and memory stays within a steady range, roughly under 200 MB. As time goes on, an unpredictable OOM error occurs: sometimes after an hour or two, sometimes only after a whole night. Below is the system OOM output I captured:

    [ 3464.958740] mali fde60000.gpu: OOM notifier: dev mali0 10132 kB
    [ 3464.958789] mali fde60000.gpu: OOM notifier: tsk weston tgid (524) pid (524) 10132 kB
    [ 3464.958941] mpp_h264e_1896 invoked oom-killer: gfp_mask=0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null), order=2, oom_score_adj=0
    [ 3464.958949] COMPACTION is disabled!!!
    [ 3464.958956] mpp_h264e_1896 cpuset=/ mems_allowed=0
    [ 3464.958972] CPU: 1 PID: 2458 Comm: mpp_h264e_1896 Not tainted 4.19.232 #9
    [ 3464.958977] Hardware name: Rockchip RK3568 EVB2 LP4X V10 Board (DT)
    [ 3464.958984] Call trace:
    [ 3464.958998] dump_backtrace+0x0/0x188
    [ 3464.959007] show_stack+0x28/0x34
    [ 3464.959016] dump_stack+0x90/0xb8
    [ 3464.959026] dump_header.constprop.0+0x74/0x22c
    [ 3464.959035] oom_kill_process+0xc0/0x3f8
    [ 3464.959042] out_of_memory+0x340/0x370
    [ 3464.959049] __alloc_pages_nodemask+0x964/0xa88
    [ 3464.959056] kmalloc_order+0x34/0x54
    [ 3464.959062] kmalloc_order_trace+0x40/0xe0
    [ 3464.959071] rkvenc_alloc_task+0x70/0x3b4
    [ 3464.959079] mpp_process_task_default+0xa8/0x1f4
    [ 3464.959085] mpp_dev_ioctl+0x240/0x3e0
    [ 3464.959095] vfs_ioctl+0x5c/0x6c
    [ 3464.959101] do_vfs_ioctl+0xc0/0x9f8
    [ 3464.959107] ksys_ioctl+0x54/0x84
    [ 3464.959112] __arm64_sys_ioctl+0x2c/0x3c
    [ 3464.959121] el0_svc_common.constprop.0+0xf0/0x170
    [ 3464.959127] el0_svc_handler+0x54/0x90
    [ 3464.959134] el0_svc+0x8/0xc
    [ 3464.959140] Mem-Info:
    [ 3464.959154] active_anon:113334 inactive_anon:2419 isolated_anon:0
    [ 3464.959154] active_file:79057 inactive_file:213324 isolated_file:32
    [ 3464.959154] unevictable:21320 dirty:28113 writeback:0 unstable:0
    [ 3464.959154] slab_reclaimable:55806 slab_unreclaimable:6633
    [ 3464.959154] mapped:35042 shmem:23854 pagetables:942 bounce:0
    [ 3464.959154] free:2182 free_pcp:116 free_cma:0
    [ 3464.959170] Node 0 active_anon:453336kB inactive_anon:9676kB active_file:316228kB inactive_file:853296kB unevictable:85280kB isolated(anon):0kB isolated(file):128kB mapped:140168kB dirty:112452kB writeback:0kB shmem:95416kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
    [ 3464.959182] DMA32 free:8728kB min:5676kB low:7692kB high:9708kB active_anon:453336kB inactive_anon:9676kB active_file:316228kB inactive_file:853940kB unevictable:84924kB writepending:112228kB present:2078720kB managed:2018700kB mlocked:0kB kernel_stack:5664kB pagetables:3768kB bounce:0kB free_pcp:464kB local_pcp:144kB free_cma:0kB
    [ 3464.959188] lowmem_reserve[]: 0 0 0
    [ 3464.959195] DMA32: 742*4kB (UMH) 351*8kB (UMEH) 34*16kB (H) 23*32kB (H) 17*64kB (H) 4*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 8912kB
    [ 3464.959219] 316275 total pagecache pages
    [ 3464.959226] 0 pages in swap cache
    [ 3464.959231] Swap cache stats: add 0, delete 0, find 0/0
    [ 3464.959236] Free swap = 0kB
    [ 3464.959240] Total swap = 0kB
    [ 3464.959244] 519680 pages RAM
    [ 3464.959249] 0 pages HighMem/MovableOnly
    [ 3464.959254] 15005 pages reserved
    [ 3464.959258] 4096 pages cma reserved

    ...

    [ 3464.959781] Out of memory: Kill process 1896 (fs_mediaserver) score 194 or sacrifice child
    [ 3464.960318] Killed process 1896 (fs_mediaserver) total-vm:1540680kB, anon-rss:313908kB, file-rss:70868kB, shmem-rss:6344kB
    [ 3465.009180] oom_reaper: reaped process 1896 (fs_mediaserver), now anon-rss:0kB, file-rss:0kB, shmem-rss:9800kB

HermanChen commented 3 months ago

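The leak is inside the fs_mediaserver process. Try turning the encode/decode data-flow operations into no-ops first, then dry-run the pipeline and see whether memory still leaks.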
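When dry-running the pipeline with stages stubbed out, it helps to log the process's resident memory at fixed points so runs can be compared. The following is a minimal sketch, assuming a Linux target where `/proc/self/status` is available; `read_rss_kb` and `log_rss` are hypothetical helper names, not part of MPP. Memory held by kernel drivers on behalf of the process may not show up in RSS, so this only covers user-space growth.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Returns the resident set size of the calling process in kB, or -1 if
// /proc/self/status cannot be read or has no VmRSS line (Linux-specific).
// read_rss_kb is a hypothetical helper name, not part of MPP.
long read_rss_kb()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            if (fields >> kb)
                return kb;
            return -1;
        }
    }
    return -1;
}

// Logs the current RSS with a label so that runs with different pipeline
// stages turned into no-ops can be compared against each other.
long log_rss(const std::string &label)
{
    long kb = read_rss_kb();
    std::cerr << "[rss] " << label << ": " << kb << " kB\n";
    return kb;
}
```

Calling `log_rss("decode only")` and `log_rss("decode + encode")` every few thousand frames in each configuration would show which stage makes the number climb.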

xb985547608 commented 3 months ago

I eventually traced it to Mpp::put_packet in mpp.cpp, which seems to have a memory leak: the newly copied pkt_in does not appear to be released. Could you help me see how to handle this kind of leak?

    ...
    if (NULL == mpp_packet_get_buffer(packet)) {
        /* packet copy path */
        MppPacket pkt_in = NULL;

        mpp_packet_copy_init(&pkt_in, packet);
        mpp_packet_set_length(packet, 0);
        pkt_copy = 1;
        packet = pkt_in;
        ret = MPP_OK;
    } else {
        /* packet zero copy path */
        mpp_log_f("not support zero copy path\n");
        timeout = MPP_POLL_BLOCK;
    }
    ...

xb985547608 commented 3 months ago

[Screenshot: Snipaste_2024-03-14_14-05-33]

Below is a convenience function excerpted from ffmpeg:

    static int rkmpp_write_data(AVDecoderMPPPrivate *d, uint8_t *buffer, int size, int64_t pts)
    {
        int ret;
        MppPacket packet;

        // create the MPP packet
        ret = mpp_packet_init(&packet, buffer, size);
        if (ret != MPP_OK) {
            LOGMW("mpp_packet_init failed: %d", ret);
            return ret;
        }

        mpp_packet_set_pts(packet, pts);

        if (!buffer)
            mpp_packet_set_eos(packet);

        ret = d->mpi->decode_put_packet(d->ctx, packet);
        if (ret != MPP_OK && ret != MPP_ERR_BUFFER_FULL)
            LOGMW("decode_put_packet failed: %d", ret);

        mpp_packet_deinit(&packet);

        return ret;
    }

HermanChen commented 3 months ago

An MppPacket created by copy_init is always deinit'ed internally.

xb985547608 commented 3 months ago

Could you help me see why the rkmpp_write_data convenience function leaks memory? My init and deinit calls are paired.

HermanChen commented 3 months ago

Your handling should be OK too. The only thing is that when MPP_ERR_BUFFER_FULL is returned, you can wait a moment and send the packet again.
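The wait-and-resend advice can be sketched as a small retry helper. This is only an illustration under stated assumptions: `PutStatus` and `put_with_retry` are hypothetical names that stand in for the real return codes and for a call such as `decode_put_packet`, and the retry count and wait time are arbitrary.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Stand-in status for this sketch only. Real code would compare against the
// return values from the MPP headers (MPP_OK, MPP_ERR_BUFFER_FULL).
enum class PutStatus { Ok, BufferFull, Error };

// Calls `put` once, then keeps waiting and resending while it reports
// BufferFull, for at most `max_retries` additional attempts.
// put_with_retry is a hypothetical helper, not an MPP function.
PutStatus put_with_retry(const std::function<PutStatus()> &put,
                         int max_retries,
                         std::chrono::milliseconds wait)
{
    PutStatus st = put();
    while (st == PutStatus::BufferFull && max_retries-- > 0) {
        std::this_thread::sleep_for(wait);
        st = put();
    }
    return st;
}
```

In a function like rkmpp_write_data, the callable would wrap the put call and map its return code to `PutStatus`, and the packet would be deinit'ed only after the helper returns, so the same packet stays valid across retries.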

xb985547608 commented 3 months ago

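Right now this leak makes the program's memory climb continuously. Is there any way to fix it? This should be a very common operation, so there is no reason I would be the only one hitting it. Also, my program pulls in tcmalloc; would that affect how MPP manages its memory pools?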

xb985547608 commented 3 months ago

Using MPP for decoding alone works fine, but once MPP encoding is added, the leak described above appears. I want to re-encode with zero copy, so I pass the decoded frame directly to encode_put_frame:

    bool AVEncoderMPP::sendFrame(const AVFrame *frame)
    {
        MppFrame frm = NULL;
        if (frame) {
            frm = reinterpret_cast<MppFrame>(frame->opaque);
        } else {
            d->frm_eof = true;
        }
        d->error = d->mpi->encode_put_frame(d->ctx, frm);

        return d->error == MPP_OK;
    }

    bool AVEncoderMPP::receivePacket()
    {
        RK_S32 get_pkt = 0;

        if (d->mpp_pkt)
            mpp_packet_deinit(&d->mpp_pkt);

        mpp_packet_init_with_buffer(&d->mpp_pkt, d->pkt_buf);
        mpp_packet_set_length(d->mpp_pkt, 0);

        do {

            d->error = d->mpi->encode_get_packet(d->ctx, &d->mpp_pkt);
            if (MPP_ERR_TIMEOUT == d->error) {
                msleep(1);
                continue;
            }

            if (d->error) {
                LOGMW("encode_get_packet failed: %d", d->error);
                break;
            }

            if (d->mpp_pkt) {
                uint8_t *data = (uint8_t*)mpp_packet_get_pos(d->mpp_pkt);
                size_t size = mpp_packet_get_length(d->mpp_pkt);
                if (data == NULL || size <= 0)
                    break;

                av_packet_unref(d->ff_pkt);
                d->error = av_new_packet(d->ff_pkt, size);
                if (d->error == MPP_OK) {
                    memcpy(d->ff_pkt->data, data, size);
                    d->ff_pkt->size = size;
                    d->ff_pkt->pts = mpp_packet_get_pts(d->mpp_pkt);
                    d->ff_pkt->dts = mpp_packet_get_dts(d->mpp_pkt);
                    if (d->ff_pkt->pts <= 0)
                        d->ff_pkt->pts = d->ff_pkt->dts;
                    if (d->ff_pkt->dts <= 0)
                        d->ff_pkt->dts = d->ff_pkt->pts;

                    get_pkt = 1;
                }
            }

            break;
        } while(true);

        return get_pkt;
    }

HermanChen commented 3 months ago

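Try removing tcmalloc? After encode_put_frame you need to release the MppFrame. encode_get_packet does not need a packet passed in from outside; it allocates one internally. If you want to specify the encoder's output buffer, you have to attach the packet to the Meta of the input MppFrame.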
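One plausible reading of the leak in receivePacket, assuming encode_get_packet overwrites the handle it is given with an internally allocated one, is that the packet created just before the call becomes unreachable and is never deinit'ed. The toy model below illustrates only that ownership pattern; `ToyPool`, `leaky_rounds`, and `clean_rounds` are hypothetical names and nothing here is the MPP API.

```cpp
// Toy stand-in, NOT the MPP API: counts handles that are currently alive.
struct ToyPool {
    int live = 0;
    int next_id = 1;

    int acquire() { ++live; return next_id++; }
    void release(int &h) { if (h != 0) { --live; h = 0; } }

    // Models a getter that allocates its own handle and overwrites *out
    // (assumed behaviour for this sketch).
    void get(int *out) { *out = acquire(); }
};

// Pattern from the post: pre-create a handle, then let the getter overwrite
// it. The pre-created handle is lost, so one handle stays alive per round.
int leaky_rounds(int n)
{
    ToyPool pool;
    int h = 0;
    for (int i = 0; i < n; ++i) {
        pool.release(h);     // frees the handle returned by the previous get
        h = pool.acquire();  // pre-created handle ...
        pool.get(&h);        // ... is overwritten here and never released
    }
    pool.release(h);
    return pool.live;        // handles still alive after cleanup
}

// Getter allocates, caller releases: nothing stays alive.
int clean_rounds(int n)
{
    ToyPool pool;
    int h = 0;
    for (int i = 0; i < n; ++i) {
        pool.get(&h);
        pool.release(h);
    }
    return pool.live;
}
```

In this model `leaky_rounds(n)` leaves `n` handles alive while `clean_rounds(n)` leaves none. For the case where the caller does want to supply the output buffer, the MPP encoder demo appears to attach the packet to the input frame's meta via `mpp_frame_get_meta` and `mpp_meta_set_packet` with `KEY_OUTPUT_PACKET` before calling encode_put_frame; the demo source is the place to confirm the exact sequence.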

xb985547608 commented 3 months ago

I had not attached the packet to the MppFrame's meta; I copied the demo code but not all of it. Thanks a lot, the problem is solved.

HermanChen commented 3 months ago

OK. Since the problem is solved, please close the issue.