time4tea / gopro-dashboard-overlay

Programs to process GoPro MP4 & Generic GPX/FIT files and create video dashboards & maps
GNU General Public License v3.0

Support AMD GPU encoding #45

Open tve opened 2 years ago

tve commented 2 years ago

I just got a new box with an integrated AMD GPU. Of course, getting hardware encode/decode to work cost a bunch of hair off my head... I'm using Linux; it's possible that the 'AMF' drivers on Windows make things easier, dunno.

The simplest profile settings AFAIK are:

  "vaapi": {
    "input": ["-hwaccel", "vaapi"],
    "output": ["-vcodec", "h264_vaapi"]
  },

The result is:

Impossible to convert between the formats supported by the filter 'Parsed_overlay_0' and the filter 'auto_scale_1'

which is ffmpeg's way of saying that the output of the overlay filter can't be piped like that into the h264_vaapi encoder, because the latter expects the frames to be in hardware/GPU memory. What's needed is a 'hwupload' filter that uploads them to GPU memory. E.g., in FFMPEGOverlay, instead of

            "-filter_complex", f"[0:v][1:v]overlay{filter_extra}",

it needs

            "-filter_complex", f"[0:v][1:v]overlay{filter_extra},hwupload",

nice, eh?

BTW, I noticed that ffmpeg has an overlay_vaapi filter, which would mean the decoded video frames could stay on the GPU, get overlaid there, and then be encoded. Sadly the AMD vaapi driver doesn't support that... I believe the Intel one might.
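
To make the change described above concrete, here is a minimal sketch. It is not the project's actual code: the helper name `overlay_filter_args` and the `hw_upload` flag are hypothetical, and `filter_extra` only stands in for whatever string FFMPEGOverlay already appends to the overlay filter.

```python
def overlay_filter_args(filter_extra: str = "", hw_upload: bool = False) -> list[str]:
    """Build the -filter_complex arguments for overlaying input 1 on input 0.

    With hw_upload=True, ffmpeg's `hwupload` filter is appended so that the
    software frames produced by `overlay` are uploaded to GPU memory before
    they reach a hardware encoder such as h264_vaapi.
    """
    graph = f"[0:v][1:v]overlay{filter_extra}"
    if hw_upload:
        graph += ",hwupload"
    return ["-filter_complex", graph]


# Software encoder: plain overlay.
print(overlay_filter_args())
# Hardware encoder: overlay followed by an upload to the GPU.
print(overlay_filter_args(hw_upload=True))
```

Keeping the upload behind a flag means the software-encoding path stays exactly as it was, and only hardware-encoder profiles get the extra filter.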

time4tea commented 2 years ago

This is great info, thanks for the detective work. I'll see if I can find a way to introduce this to the profile concept. Ffmpeg is super powerful but it doesn't seem to abstract the complexity away sometimes...

tve commented 2 years ago

Every time I need to do something different with ffmpeg I have to spend time looking up docs, blogs, and stackexchange...

After digging into it, I'm not sure whether it's worth pursuing the vaapi encoding. The quality I'm getting is crap. The only really useful way to use it (for me) is with constant-quality mode (-qp flag). The default quality (-qp 20) is good, but the file size is ~2.5x the original. I find -qp 23 at the limit of what I'd accept (stuff just starts to get soft) and the file size is still ~2x the original. -qp 24 is noticeably soft and the file size is still 2x.

Compared to libx264... the veryfast preset you use by default is 20% slower than vaapi encoding (vaapi takes the same time regardless of setting), produces a file a bit over half the size of the original, and the quality is decent, perhaps similar to -qp 23 above. The superfast preset is faster than vaapi for me and produces a file sized between the veryfast output and the original. Then there's also ultrafast, which is good quality and very fast, but produces a file a tad larger than the original.

The one big caveat is that I'm using an AMD Ryzen 9 5900HX with integrated GPU. I don't know and have not found any info on how the video encoding block on that iGPU compares to those on higher-end discrete AMD GPUs. I also don't know what limitations the Linux VAAPI driver has vs. the actual HW capabilities that may be accessible on Windows. I do find complaints that the AMD HW doesn't produce B-frames, which I can verify looking at the files produced.

Anyway, I'm planning to use VAAPI decoding (no quality harm there) with the libx264 ultrafast preset to get an initial rendering, and then redo it using the medium preset. (Medium produces great quality and files ~60% the size of the original, but takes 2x as long as the veryfast preset.)
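
The two-stage plan above could be written as two profiles. This is only a sketch, assuming the `input`/`output` profile layout shown earlier in the thread; the profile names `draft` and `final` are made up for illustration, and the quality and size trade-offs are the ones reported above, not something the code measures.

```python
import json

# VAAPI is used for decoding only; encoding stays on the CPU with libx264.
VAAPI_DECODE = ["-hwaccel", "vaapi"]

profiles = {
    # Quick first rendering: fastest preset, larger file.
    "draft": {
        "input": VAAPI_DECODE,
        "output": ["-vcodec", "libx264", "-preset", "ultrafast"],
    },
    # Final rendering: slower preset, smaller file.
    "final": {
        "input": VAAPI_DECODE,
        "output": ["-vcodec", "libx264", "-preset", "medium"],
    },
}

# Print the profiles as JSON, ready to paste into a profiles file.
print(json.dumps(profiles, indent=2))
```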

time4tea commented 2 years ago

Based on your comment, "I'm not sure whether it's worth pursuing the vaapi encoding" - I wasn't planning to do anything with AMD GPU support. If that's not what you meant, please let me know! In any case, I don't have an AMD GPU, so I'd be relying on you completely for implementation information... :-)

DemiMarie commented 1 year ago

@tve what quality can one get with libx264 with the same file size as -qp 20?

time4tea commented 1 year ago

I don't know if anyone has experimented much with AMD GPU settings, but if there are recommendations, I'd be happy to include them in the documentation. I don't have an AMD GPU so can't offer much, I'm afraid...

time4tea commented 1 year ago

@tve I revisited the vaapi config a little bit, and I think that I can make the config possible, by adding an optional "filter" parameter to the profile. Did you have any success with getting vaapi to work? What parameters did you use? Thanks!!

tve commented 1 year ago

I did not pursue vaapi further after my last comment above. I'm using libx264 and the veryfast setting.

time4tea commented 1 year ago

Since 0.93.0, with support for the input/filter/output settings in the "profiles" configuration, this should be possible.

I don't know the settings, but check the PERFORMANCE_GUIDE doc, and the same sort of thing should work for vaapi...

paxunix commented 9 months ago

Just wanted to confirm that the "filter" property in profiles does indeed let you use vaapi. I'm currently using:

{
  "vaapi": {
    "input": [
      "-hwaccel", "vaapi",
      "-hwaccel_device", "/dev/dri/renderD128",
      "-hwaccel_output_format", "vaapi"
    ],
    "filter": "[1:v]format=rgba,hwupload[overlay];[0:v][overlay]overlay_vaapi",
    "output": [
      "-vcodec", "h264_vaapi",
      "-movflags", "faststart"
    ]
  }
}

time4tea commented 9 months ago

This is fantastic info. Thank you for sharing it.

igutidze commented 3 months ago

Unfortunately, the AMD driver on Linux does not support the overlay_vaapi filter, so this is what I'm using as an alternative to a full vaapi pipeline:

{
  "vaapi": {
    "input": ["-hwaccel", "vaapi", "-hwaccel_output_format", "vaapi"],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-c:v", "hevc_vaapi", "-b:v", "25M"]
  }
}

It should work on all platforms. Shall I create a PR?

time4tea commented 3 months ago

Hi. This is great info. Maybe we can call it vaapi-linux? Then it can be added to the built-in profiles. Don't worry about a PR, I can add it.

Thanks!

igutidze commented 3 months ago

@time4tea as you see fit!

igutidze commented 3 months ago

Btw, you may use either h264_vaapi or hevc_vaapi; I just preferred HEVC in my particular case.

RaveGun commented 2 months ago

Hello, I've read all the above and I am still not sure if it is possible to use the HW acceleration just for generating the overlay.

I tried all the above overlay configurations and still get the: Impossible to convert between the formats supported by the filter 'Parsed_overlay_0' and the filter 'auto_scale_1' error.

I am on Ubuntu 24.04 with a Radeon 6650 XT.

Can generating the overlay be accelerated?

Thanks.

igutidze commented 2 months ago

@RaveGun what command line are you using and where do you add the ffmpeg configuration?

RaveGun commented 2 months ago

@igutidze I have a file created at this location: ~/.gopro-graphics/ffmpeg-profiles.json

And the command line is: venv/bin/gopro-dashboard.py --units-speed kph --use-gpx-only --gpx=../GPX/25.08.2024.gpx --layout-xml=layout.xml --profile=overlayhw --overlay-size=1920x1080 25.08.2024_lres.mp4

The overlayhw profile has changed many times. Currently it looks like this:

  "overlayhw": {
    "input": [],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-vcodec", "h264_vaapi", "-q:v", "65"]
  }, 

In any case, I am not happy with this system's performance when editing 4K videos. I am not a content creator, so this will be a one-time event where I have to use the computer for 4K video editing.

igutidze commented 2 months ago

@RaveGun please add the --show-ffmpeg command-line switch and paste the full ffmpeg execution options here. It should look something like this:

Executing [PosixPath('bin/ffmpeg'), '-y', '-hide_banner', '-loglevel', 'info', '-hwaccel', 'vaapi', '-hwaccel_output_format', 'vaapi', '-i', '/home/irakli/Videos/GH010098.MP4', '-f', 'rawvideo', '-framerate', '10.0', '-s', '2704x1520', '-pix_fmt', 'rgba', '-i', '-', '-filter_complex', '[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload', '-c:v', 'h264_vaapi', '-b:v', '25M', '-movflags', 'faststart', '/home/irakli/Videos/champ-dashboard.mp4']
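
For orientation, the sketch below shows how a profile's three parts appear to land in a command line like the one above. The ordering is inferred only from that `Executing` line, so the real program may well assemble it differently; `build_command` is a hypothetical helper, and the fixed `-framerate 10.0` and the plain-overlay fallback filter are taken from the examples in this thread rather than from the project's source.

```python
def build_command(profile: dict, video: str, size: str, output: str) -> list[str]:
    """Assemble an ffmpeg argv in the order seen in the Executing line."""
    return [
        "ffmpeg", "-y", "-hide_banner", "-loglevel", "info",
        # "input" options go before the first -i (the source video).
        *profile.get("input", []),
        "-i", video,
        # Second input: raw RGBA overlay frames piped in on stdin.
        "-f", "rawvideo", "-framerate", "10.0", "-s", size, "-pix_fmt", "rgba",
        "-i", "-",
        # "filter" replaces the default overlay filter graph.
        "-filter_complex", profile.get("filter", "[0:v][1:v]overlay"),
        # "output" options go just before the output file name.
        *profile.get("output", []),
        output,
    ]


vaapi = {
    "input": ["-hwaccel", "vaapi", "-hwaccel_output_format", "vaapi"],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-c:v", "h264_vaapi", "-b:v", "25M", "-movflags", "faststart"],
}
print(build_command(vaapi, "GH010098.MP4", "2704x1520", "champ-dashboard.mp4"))
```

Comparing the printed list against the real `--show-ffmpeg` output is a quick way to see which parts of a profile actually made it into the command.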

RaveGun commented 2 months ago

I did the test:

Executing ['ffmpeg', '-hide_banner', '-y', '-hide_banner', '-loglevel', 'info', '-f', 'rawvideo', '-framerate', '10.0', '-s', '1920x1080', '-pix_fmt', 'rgba', '-i', '-', '-r', '30', '-vcodec', 'h264_vaapi', '-q:v', '65', '25.08.2024_lres.mp4']
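
One way to read that output is to count what is actually in it. The helper below is a hypothetical diagnostic, not part of the tool; it only inspects the argument list pasted above and makes no claim about why the command was built that way.

```python
def summarize(argv: list[str]) -> dict:
    """Report which profile-related pieces are present in an ffmpeg argv."""
    return {
        "inputs": argv.count("-i"),
        "has_filter_complex": "-filter_complex" in argv,
        "has_hwaccel": "-hwaccel" in argv,
    }


# The argument list from the run above, copied verbatim.
argv = ['ffmpeg', '-hide_banner', '-y', '-hide_banner', '-loglevel', 'info',
        '-f', 'rawvideo', '-framerate', '10.0', '-s', '1920x1080',
        '-pix_fmt', 'rgba', '-i', '-', '-r', '30',
        '-vcodec', 'h264_vaapi', '-q:v', '65', '25.08.2024_lres.mp4']

print(summarize(argv))
# → {'inputs': 1, 'has_filter_complex': False, 'has_hwaccel': False}
```

So this particular run has a single input (the piped overlay frames) and no `-filter_complex` at all: the profile's "output" options are present, but its "filter" string does not appear in the command.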