5rahim / seanime

Open-source media server for anime and manga.
https://seanime.rahim.app
MIT License
402 stars, 31 forks

feature request: add amd_amf support for hardware accelerated transcoding #158

Closed Faralha closed 3 days ago

Faralha commented 5 days ago

Problem Description / Use Case

Currently there are only 3 options available: NVIDIA, Intel, and VAAPI. While I know VAAPI supports AMD GPUs, it is known to be less efficient than AMD's proprietary AMF library. I tried adding the command flags directly through the custom tab, but it doesn't work. Jellyfin works well with AMF on my machine, so I thought seanime could do the same.

Proposed Solution

Add AMD_AMF native support for hardware accelerated transcoding

5rahim commented 5 days ago

Adding a new option requires testing and I don't have the hardware to personally do that.

{
    "name": "amf",
    "decodeFlags": [
        "-hwaccel", "d3d11va",
        "-hwaccel_output_format", "d3d11"
    ],
    "encodeFlags": [
        "-c:v", "h264_amf",
        "-qp_p", "0",
        "-qp_i", "0",
        "-rc", "cqp"
    ],
    "scaleFilter": "hwupload,format=nv12,scale=%d:%d"
}

Here's a custom input some users tested (for Windows). The issue is that even when it works, there seem to be a lot of playback issues, and I'm not familiar enough with the specifics to troubleshoot that.

Faralha commented 5 days ago

It still fails to transcode. Can you show me the part of the seanime code where this argument is handled? I want to experiment with this myself.

5rahim commented 5 days ago

> It still fails to transcode. Can you show me the part of the seanime code where this argument is handled? I want to experiment with this myself.

https://github.com/5rahim/seanime/blob/main/internal/mediastream/transcoder/hwaccel.go
https://github.com/5rahim/seanime/blob/main/internal/mediastream/transcoder/videostream.go

Faralha commented 5 days ago

Quick test + analysis (PS: I'm not an ffmpeg expert)

  1. AMD AMF doesn't seem to work with d3d11 for decode. Use dxva2 instead (it will still use the CPU, but only a little).
  2. h264_amf is still supported for encode, however.
  3. I don't know why, but any scale filter that includes hwupload (such as format=nv12,hwupload) always fails.

So the final workaround for amd_amf is this:

{
    "name": "amf",
    "decodeFlags": [
        "-hwaccel", "dxva2",
        "-hwaccel_output_format", "dxva2"
    ],
    "encodeFlags": [
        "-c:v", "h264_amf",
        "-qp_p", "0",
        "-qp_i", "0",
        "-rc", "cqp"
    ],
    "scaleFilter": "format=nv12,scale=%d:%d"
}

After comparing Jellyfin with amd_amf against seanime with this config, both show similar hardware usage (low CPU usage, with the GPU still used for encoding). This suggests that Jellyfin (probably) uses a similar config.

It is probably not the most efficient method, but at least it worked.

image

5rahim commented 5 days ago

Not sure if the screenshot is related but it shows that you're using direct play and not transcoding anything.

Faralha commented 3 days ago

Sorry for the confusion. It seems my app was bugged.

After looking at the FFmpeg documentation, it seems AMF doesn't support fully hardware-accelerated transcoding when a -vf filter is enabled. This rules out the nv12 format and scaling options, which could affect performance. A workaround is to simply drop the -hwaccel_output_format d3d11 flag. This means the transcode will still partially use the CPU for video filters such as scaling, which (kinda) defeats the purpose of hwaccel.

Nonetheless, transcoding is very choppy and will lag the website (and for some reason it always falls back to software). Just stick to either the video player or Jellyfin.

I don't know if this is relevant or not, but transcoding with Jellyfin's custom ffmpeg boosts performance by up to 20% with the same configuration.
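Since ffmpeg builds differ in which encoders they include, one quick check when comparing builds is whether the binary on the PATH lists `h264_amf` at all. The Go sketch below parses the output of `ffmpeg -hide_banner -encoders`; `HasEncoder` is a hypothetical helper, and this check is only a possible first diagnostic, not a confirmed explanation for the software fallback described above.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// HasEncoder reports whether the text printed by `ffmpeg -encoders`
// lists the named encoder.
func HasEncoder(encodersOutput, name string) bool {
	sc := bufio.NewScanner(strings.NewReader(encodersOutput))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Encoder rows have the capability flags first and the name second,
		// e.g. " V....D h264_amf   AMD AMF H.264 Encoder (codec h264)".
		if len(fields) >= 2 && fields[1] == name {
			return true
		}
	}
	return false
}

func main() {
	out, err := exec.Command("ffmpeg", "-hide_banner", "-encoders").Output()
	if err != nil {
		fmt.Println("could not run ffmpeg:", err)
		return
	}
	fmt.Println("h264_amf available:", HasEncoder(string(out), "h264_amf"))
}
```

An encoder being listed only means it was compiled in; it can still fail at runtime if the driver or GPU does not support it.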

5rahim commented 3 days ago

> Jellyfin's custom ffmpeg boosts performance by up to 20% with the same configuration

Good to know

> Just stick to either the video player

Yeah, there's really no benefit to using media streaming if you're not watching on another device. The media player integration offers more features.

Closing since this won't be worked on