nyanmisaka / ffmpeg-rockchip

FFmpeg with async and zero-copy Rockchip MPP & RGA support

leaking file descriptors #29

Closed mcerveny closed 3 months ago

mcerveny commented 4 months ago

Hello. I am running the code in an alloc/free loop.

When I use the "scale_rkrga" filter in a cycle (avfilter_graph_alloc() ... avfilter_graph_free()), it usually leaks one FD, "anon_inode:sync_file", visible in /proc/PID/fd (maybe some sort of sync primitive leak). When I use the "h264_rkmpp" encoder in a cycle (avcodec_alloc_context3() ... avcodec_free_context()), it usually leaks one FD, "/dmabuf:276910-main", in /proc/PID/fd. (The "hevc_rkmpp" decoder does not need to be restarted, because it supports avcodec_flush_buffers().)

I cannot determine whether the problem is in the RK libraries or in the FFmpeg integration code. Does anyone have a hint or a solution?

Thanks, Martin
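
One way to confirm a per-cycle leak like the one described above is to snapshot /proc/PID/fd before and after a cycle and compare the counts per link target. The following is only a minimal sketch, assuming Linux with /proc mounted; the helper names fd_targets and fd_growth are made up for this illustration and are not part of FFmpeg, MPP, or RGA.

```python
import os
from collections import Counter


def fd_targets(pid="self"):
    """Count the open descriptors of a process, grouped by symlink target."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            counts[os.readlink(os.path.join(fd_dir, name))] += 1
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return counts


def fd_growth(before, after):
    """Return only the targets whose count increased between two snapshots."""
    return {
        target: count - before.get(target, 0)
        for target, count in after.items()
        if count > before.get(target, 0)
    }
```

Taking one snapshot before the alloc/free loop and one after N cycles, then passing both to fd_growth, would show whether entries such as anon_inode:sync_file grow with the cycle count.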

nyanmisaka commented 4 months ago

@mcerveny Can you reproduce the same issue when using FFmpeg via CLI?

https://github.com/nyanmisaka/ffmpeg-rockchip/wiki/Video-Transcode#mpp-decode--mpp-encode-fastest

mcerveny commented 4 months ago

It is probably visible after all cleanup but before exit, so it should be reproducible with a modified ffmpeg that calls sleep(100000) before exiting. For now I must finish the application with a workaround (restarting the application after ~500 cycles). I will try next week to prepare some minimal code that demonstrates this behavior. For reference, this is the usual error once the process runs out of file descriptors:

 RgaBlit(1485) RGA_BLIT fail: Too many open files
 RgaBlit(1486) RGA_BLIT fail: Too many open files
handl-fd-vir-phy-hnd-format[0, 53, (nil), (nil), 0, 2560]
rect[0, 0, 2304, 1296, 2304, 1296, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
handl-fd-vir-phy-hnd-format[0, 1023, (nil), (nil), 0, 2560]
rect[0, 0, 720, 406, 768, 406, 2560, 0]
f-blend-size-rotation-col-log-mmu[2560, 0, 0, 0, 0, 0, 1]
This output the user parameters when rga call blit fail
[hwscale @ 0x55822b8080] RGA blit failed: -24

And the output from /proc/PID/fd just before the failure:

/proc/.../fd# ls -l | awk '{ print $NF; }' | sort | uniq -c
      1 0
    698 anon_inode:sync_file
      4 /dev/dma_heap/cma
      4 /dev/dma_heap/system
      4 /dev/dri/card0
      1 /dev/mpp_service
      3 /dev/pts/0
      1 /dev/rga
    207 /dmabuf:350251-main
      1 /share/cam/0A210205
      1 /share/cam/0A210205/186abd710ca.ts
      1 socket:[780242]

nyanmisaka commented 4 months ago

 RgaBlit(1485) RGA_BLIT fail: Too many open files
 RgaBlit(1486) RGA_BLIT fail: Too many open files

This error only occurs when async RGA is used but the returned out_fence_fd is not invalidated by the user, so the FD resources are exhausted quickly. I encountered it during earlier debugging and development, but it shouldn't be present in scale_rkrga now.

BTW what's your SoC model, Linux kernel version and MPP/RGA libs commit date?
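
The ownership rule described above (every fence FD returned by an async submit has to be released by the caller) can be illustrated with a small stand-in. This is only a sketch under stated assumptions: submit_async is a hypothetical placeholder that uses a pipe in place of a real sync_file, and nothing here is the librga API or the scale_rkrga implementation.

```python
import os
import select


def submit_async():
    """Hypothetical stand-in for an async job: returns a 'fence' fd
    that becomes readable once the job has completed."""
    read_end, write_end = os.pipe()
    os.write(write_end, b"\x01")  # pretend the job finished immediately
    os.close(write_end)
    return read_end


def wait_and_release(fence_fd, timeout_ms=1000):
    """Wait for the fence to signal, then always close it.

    Returns True if the fence signaled before the timeout."""
    try:
        poller = select.poll()
        poller.register(fence_fd, select.POLLIN)
        return bool(poller.poll(timeout_ms))
    finally:
        # Skipping this close leaks one descriptor per submitted job.
        os.close(fence_fd)
```

Closing in a finally block means the descriptor is released even when the wait times out or raises, which is the property that keeps the FD count flat across many jobs.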

mcerveny commented 4 months ago

nyanmisaka commented 4 months ago

Then it should all work fine. I may need to wait for a demo from you to see what's going on.

mcerveny commented 4 months ago

https://github.com/nyanmisaka/rk-mirrors/blob/jellyfin-rga/core/NormalRga.cpp#L1485

nyanmisaka commented 4 months ago

@mcerveny Hopefully the commit 377fa2c will fix this issue. Let me know if it helps.

mcerveny commented 3 months ago

Yes, it partially works: "anon_inode:sync_file" is gone, but dmabuf:*-main remains (encoder leak).

nyanmisaka commented 3 months ago

@mcerveny From my understanding, MPP async encoding requires rkmppenc to retain references to some input frames; MPP callbacks signal when each one is no longer needed, and it is then released dynamically. When codec->close() is called, however, any frames still retained are released unconditionally.

Can you help me trace in which rkmppenc function the leak occurred? https://github.com/nyanmisaka/ffmpeg-rockchip/blob/d43f4f54e6c732cd47d5e1ab69b600afd8966897/libavcodec/rkmppenc.c#L800
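
The retain/release behavior described above can be modeled with a toy tracker. This is a simplified sketch of the idea only, assuming one release callback per retained frame; the class and method names are invented for illustration and do not correspond to the actual rkmppenc.c code.

```python
class RetainedFrames:
    """Toy model: input frames are held until a callback reports them
    consumed, and everything still held is released on close."""

    def __init__(self):
        self._held = {}  # frame id -> function that releases the frame

    def retain(self, frame_id, release):
        """Keep a reference to a frame submitted to the encoder."""
        self._held[frame_id] = release

    def on_consumed(self, frame_id):
        """Callback path: release a single frame if it is still held."""
        release = self._held.pop(frame_id, None)
        if release is not None:
            release()

    def close(self):
        """Close path: release every remaining frame unconditionally."""
        while self._held:
            _, release = self._held.popitem()
            release()

    def __len__(self):
        return len(self._held)
```

In this model a leak can only happen if a frame's buffer is referenced somewhere outside the tracker, which is why tracing which function still holds the dmabuf is the useful next step.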

mcerveny commented 3 months ago

It seems to be related to #35. The problem is gone with my quick patch (tested in a loop: 100 input frames per cycle * 2000 cycles, using different video segments).

nyanmisaka commented 3 months ago

Closed by 7a0200b