kjkent closed this issue 4 months ago
This has been dramatically improved with #9452.
Amazing, thank you
@mertalev I tested the latest containers from main (as of ~12 hrs ago). With HW decoding off, there's a small but (subjectively) noticeable improvement in GPU utilisation. With hardware decoding on, transcoding fails and falls back to full CPU usage:
built with gcc 12 (Debian 12.2.0-14)
configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-static --disable-libxcb --disable-sdl2 --disable-xlib --enable-lto --enable-gpl --enable-version3 --enable-shared --enable-gmp --enable-gnutls --enable-chromaprint --enable-opencl --enable-libdrm --enable-libass --enable-libfreetype --enable-libfribidi --enable-libfontconfig --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libopenmpt --enable-libdav1d --enable-libsvtav1 --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libfdk-aac --arch=amd64 --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-vaapi --enable-amf --enable-libvpl --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
[h264 @ 0x327ac2e0580] Reinit context to 1088x1920, pix_fmt: yuv420p
Selecting decoder 'h264' because of requested hwaccel method cuda
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'upload/library/admin/2023/12/video.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
title : 856115136307470
encoder : Lavf59.27.100
Duration: 00:00:28.21, start: 0.000000, bitrate: 4018 kb/s
Stream #0:0[0x1](und): Video: h264 (High), 1 reference frame (avc1 / 0x31637661), yuv420p(tv, bt709, progressive, left), 1080x1920 (1088x1920), 3974 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc59.37.100 h264_fbv
Stream #0:1[0x2](und): Audio: aac (HE-AAC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 48 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> hevc (hevc_nvenc))
Stream #0:1 -> #0:1 (aac (native) -> opus (libopus))
Press [q] to stop, [?] for help
[h264 @ 0x327ac2e1980] NVDEC capabilities:
[h264 @ 0x327ac2e1980] format supported: yes, max_mb_count: 65536
[h264 @ 0x327ac2e1980] min_width: 48, max_width: 4096
[h264 @ 0x327ac2e1980] min_height: 16, max_height: 4096
[h264 @ 0x327ac2e1980] Reinit context to 1088x1920, pix_fmt: cuda
[graph 0 input from stream 0:0 @ 0x327ac1a2280] w:1080 h:1920 pixfmt:cuda tb:1/15360 fr:30/1 sar:0/1
[auto_scale_0 @ 0x327ac1a2640] w:iw h:ih flags:'' interl:0
[Parsed_format_0 @ 0x327ac1a21c0] auto-inserting filter 'auto_scale_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0'
Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scale_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x327ac1d0680] Statistics: 0 bytes written, 0 seeks, 0 writeouts
Terminating demuxer thread 0
[AVIOContext @ 0x327ac1d02c0] Statistics: 179128 bytes read, 0 seeks
Conversion failed!
I see in the updated docs that HW decoding may not work for every video, but the above error repeated for all six videos in the job queue with 0% GPU usage and 100% CPU -- so I thought I'd mention here in case it's unexpected.
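For anyone triaging a batch of failed jobs like the six above, here is a rough Python sketch that checks whether a captured ffmpeg log shows this same failure pattern (CUDA frames reaching a filter that cannot accept them). The function name and the returned labels are hypothetical, and the marker strings are simply copied from the log above. This is not part of Immich or ffmpeg, and other ffmpeg versions may word these messages differently.

```python
import re

# Marker strings copied from the ffmpeg log in this thread. Treating them as
# failure signatures is an assumption of this sketch, not an ffmpeg contract.
HW_FRAME_MARKER = re.compile(r"pix_fmt:\s*cuda|pixfmt:cuda")
FILTER_ERROR_MARKERS = (
    "Impossible to convert between the formats supported by the filter",
    "Error reinitializing filters!",
)
FAILURE_MARKER = "Conversion failed!"


def classify_ffmpeg_log(log_text: str) -> str:
    """Return a coarse, hypothetical label for an ffmpeg stderr capture."""
    lines = log_text.splitlines()
    # Did the decoder hand frames to the filter graph in CUDA memory?
    saw_hw_frames = any(HW_FRAME_MARKER.search(line) for line in lines)
    # Did the filter graph report that it could not convert formats?
    saw_filter_error = any(
        marker in line for line in lines for marker in FILTER_ERROR_MARKERS
    )
    failed = any(FAILURE_MARKER in line for line in lines)

    if failed and saw_hw_frames and saw_filter_error:
        return "hw-frame-filter-mismatch"
    if failed:
        return "other-failure"
    return "ok"
```

Running it over each job's captured stderr would show whether all of the failures share this signature or whether some videos fail for a different reason.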
I was initially going to say that I just tested on main and don't have this issue, but there was a video that had the same error. The fix was just merged into main, but it'll take a bit for the new image to be built. With your current image, you can try setting a different target resolution (like 720p) to make it work.
The bug
If this is expected behavior, I apologize for the noise. It appears that ffmpeg may not be fully utilizing hardware acceleration on my machine, with an RTX 3060 Ti, as transcoding looks like it's hitting the CPU (Ryzen 3600X) far harder than the GPU. Running 6 concurrent transcoding jobs (to 1080p/HEVC/Opus), I'm seeing ~40-60% CPU use across 12 cores but only ~9% GPU utilization with 3.3G of 8G GPU memory used:
btop:
nvidia-smi:
The OS that Immich Server is running on
Arch (6.9.2-arch1-1)
Version of Immich Server
v1.105.1
Version of Immich Mobile App
N/A
Platform with the issue
Your docker-compose.yml content
Your .env content
Relevant log output
Additional information
I'm hesitant to report this as it may just be my misaligned settings. I do have another bug to report where I think the server is responding with an inappropriate status code when videos are requested from Chrome, when Immich is behind a reverse proxy with gzip encoding enabled. I'm going to verify and get more info before filing that.
I noticed that https://github.com/immich-app/immich/pull/9452/commits/3dd34280a265044a4cd9863f22ed06a12627bfc7 changes the settings for nvenc and introduces nvdec, so I'll see whether bumping my container images to main makes any difference.
Thank you for developing Immich -- it's an incredible undertaking, implemented incredibly well! I hope one day to contribute.