GPUOpen-LibrariesAndSDKs / AMF

The Advanced Media Framework (AMF) SDK provides developers with optimal access to AMD devices for multimedia processing

[Request]: Provide AMF the capability to capture not only GPU-rendered frames, but also GPU-generated frames #484

Closed SaltyBet closed 3 months ago

SaltyBet commented 3 months ago

Is your feature request related to a problem? Please describe.

N/A

Describe the solution you'd like

Describe alternatives you've considered

N/A

Additional context

See above.

MikhailAMD commented 3 months ago

By design, AMF Direct Capture is supposed to capture any presented frame. It doesn't matter how the frame was rendered: by DWM, by a full-screen app, by interpolation inside a game such as FSR, or by driver-level interpolation (AFMF). If I understand correctly how LSFG captures and presents frames, it should be compatible with AMF Direct Capture as well.
AMF Direct Capture returns the visible frame. If your solution needs to capture every presented frame, you can use AMF_DISPLAYCAPTURE_MODE_WAIT_FOR_PRESENT, but in this mode the capture framerate is driven by presents to the screen (flips), after any interpolation. If you want a constant capture framerate, use AMF_DISPLAYCAPTURE_MODE_KEEP_FRAMERATE. It returns the latest visible frame at the given framerate, likewise after any interpolation. If something doesn't work, please report it with a short zipped GPUView log (5-10 sec) and an AMF log. Note that your solution must be fast enough to process every presented frame.

SaltyBet commented 3 months ago

Thanks for replying. Good to know. Just a quick question.

Per the current documentation, the AMF Display Capture API requires root or super user privileges when running on Linux systems.

Is that requirement going to be relaxed eventually?

MikhailAMD commented 3 months ago

At this point root is required. It may be possible to avoid this in the future, but only with Wayland. We will provide an update when this option is available.

SaltyBet commented 3 months ago

Thanks for the info.

I'll close this issue.

Cheers.

radugrecu97 commented 3 weeks ago

I've been trying to get AMF integrated into the latest Sunshine using https://github.com/LizardByte/Sunshine/compare/master...cgutman:LB_Sunshine:amd_directcapture so that AMD Fluid Motion Frames (AFMF) can be captured.

It almost works perfectly. However, in the AMF_DISPLAYCAPTURE_MODE_KEEP_FRAMERATE and AMF_DISPLAYCAPTURE_MODE_GET_CURRENT_SURFACE modes, there are regular intervals where the generated frames are captured out of order, so it looks as if I'm seeing double or a ghosted image, and the AMD overlay seems to flash on and off. Then it reverts to perfect smoothness, and back and forth. Everything is captured in order when AFMF is not in use.

In my setup, I was testing Red Dead Redemption 2 with AFMF. I was in fullscreen and capped the framerate to 60 fps with RivaTuner Statistics Server (RTSS). I specifically used RTSS because I noticed much less ghosting than when using vsync/half-rate vsync and Radeon Chill with both sliders set to 60. I have a 7800X3D CPU and a 7900 XT GPU.

I then verified my findings with the DVR sample (with the framerate parameter changed to 120): same ghosting and flashing AMD overlay. I played back the footage frame by frame in VLC, and some of the frames were completely out of order!

Then I think I found the solution: I set the display capture mode parameter in the DVR sample to AMF_DISPLAYCAPTURE_MODE_WAIT_FOR_PRESENT and voilà! Perfectly smooth, with no ghosting and no overlay flashing in the playback.

The only problem is that if I use it in the Sunshine implementation, QueryOutput fails with error code 1. Does AMF_DISPLAYCAPTURE_MODE_WAIT_FOR_PRESENT need some special setup, perhaps regarding the DX11 device?

radugrecu97 commented 3 weeks ago

I'll add that Sunshine relies on amfrt64.dll.

Is DisplayCapture not supported in the same way in the run-time?

MikhailAMD commented 3 weeks ago

For failure in Sunshine, please enable detailed AMF logging and provide the log here.

Regarding capturing AFMF output: PRESENT mode returns from QueryOutput exactly at the present/flip event from a full-screen game/app or DWM, or at the internal present event from AFMF. FRAMERATE mode captures the currently visible frame. It is designed to give a smooth framerate regardless of the game framerate, but it can be out of sync with AFMF.
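To make the difference between the two modes concrete, here is a toy timeline model. It is not AMF code and none of its names come from the SDK; it only assumes that fixed-rate capture samples whichever frame is visible at each tick, while present-driven capture yields one frame per present. It illustrates skipped and repeated frames when presents and ticks drift apart, not the out-of-order frames reported above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: none of these names come from the AMF SDK.
struct PresentEvent {
    std::int64_t timeMs;  // when the frame became visible
    int frameId;          // sequence number of the presented frame
};

// Present-driven capture: one captured frame per present event.
std::vector<int> captureOnPresent(const std::vector<PresentEvent>& presents) {
    std::vector<int> captured;
    for (const PresentEvent& p : presents) {
        captured.push_back(p.frameId);
    }
    return captured;
}

// Fixed-rate capture: at each tick, take whichever frame is currently visible.
// `presents` must be sorted by time.
std::vector<int> captureAtFixedRate(const std::vector<PresentEvent>& presents,
                                    std::int64_t periodMs, std::int64_t endMs) {
    std::vector<int> captured;
    std::size_t next = 0;
    int visible = -1;  // -1 means nothing has been presented yet
    for (std::int64_t t = 0; t <= endMs; t += periodMs) {
        // Advance to the most recent present at or before this tick.
        while (next < presents.size() && presents[next].timeMs <= t) {
            visible = presents[next].frameId;
            ++next;
        }
        if (visible >= 0) {
            captured.push_back(visible);
        }
    }
    return captured;
}
```

With presents at 0, 5, 9 and 25 ms and a 10 ms capture period, the fixed-rate path never sees frame 1 and returns frame 2 twice, whereas the present-driven path returns each frame exactly once.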

Another option: instead of using AFMF from Radeon software, Sunshine and similar apps can use AMF FRC component that will do the same job as AFMF. It can be inserted between DirectCapture and AMF encoder.

Not sure I understand the question about amfrt64.dll: This DLL is always installed into System32 folder and is the front/stub for AMF runtime. All AMF functionality works via this DLL including DisplayCapture.

radugrecu97 commented 3 weeks ago

amf::AMFTrace* traceAMF;
amf_factory->GetTrace(&traceAMF);
traceAMF->SetGlobalLevel(AMF_TRACE_DEBUG);

Is that enough to enable it? I'm not seeing any output.

FRC component relies on DX12 though, and Sunshine utilizes DX11

MikhailAMD commented 3 weeks ago

In the latest public driver FRC supports DX11 natively.

To be sure tracing is enabled, turn on the writers explicitly.

Debug output:

    traceAMF->EnableWriter(AMF_TRACE_WRITER_DEBUG_OUTPUT, true);
    traceAMF->SetWriterLevel(AMF_TRACE_WRITER_DEBUG_OUTPUT, AMF_TRACE_DEBUG);

File output:

    traceAMF->EnableWriter(AMF_TRACE_WRITER_FILE, true);
    traceAMF->SetWriterLevel(AMF_TRACE_WRITER_FILE, AMF_TRACE_DEBUG);
    traceAMF->SetPath(L"c:\\log\\");

radugrecu97 commented 3 weeks ago

Here's the error.

2024-08-22 18:12:11.013     13F0 [AMFScreenCaptureImpl]   Debug: AMFScreenCaptureImpl::Init()
2024-08-22 18:12:11.013     13F0 [AMFScreenCaptureImpl]   Debug: AMFScreenCaptureImpl::Terminate()
2024-08-22 18:12:11.013     13F0 [AMFScreenCaptureEngineImplDX]    Info: ReloadTextures() new format=DXGI_FORMAT_B8G8R8A8_UNORM(87) (1920x1080) monitor=0 m_ScreenID=0 FrameCount = 0
2024-08-22 18:12:11.013     13F0 [AMFScreenCaptureEngineImplDX11]   Error: c:\constructicon\builds\gfx\six\24.20\drivers\amf\stable\runtime\src\components\ScreenCapture\DX11\ScreenCaptureEngineDX11.cpp(684):COM failed, HR = 0x80004001:SetDccSupport(disable) failed

Good to know about the latest release. It made me realize that Sunshine isn't using the latest headers, so I'm in the process of getting the new ones.

MikhailAMD commented 3 weeks ago

The last error trace is ignored by AMF. It also appears in the DVR sample, which works. Are there no traces after it? Try disabling asserts:

    amf::AMFDebug* debugAMF;
    amf_factory->GetDebug(&debugAMF);
    debugAMF->AssertsEnable(false);

radugrecu97 commented 3 weeks ago

Ah, I managed to figure it out. It turned out that with AMF_DISPLAYCAPTURE_MODE_WAIT_FOR_PRESENT, the initial calls to QueryOutput return failure, and the error was handled by reinitializing the whole component, causing a constant failure loop.

In the DVR sample, however, QueryOutput never returned failure on the first try in the same mode. I'm not sure what is being done differently there, or whether it can affect something else in the backend.

Anyway, I now have a pretty smooth experience with AFMF being captured, and I will experiment further to figure out the occasional minor frame drops on the stream. They occur even though QueryOutput reports no errors and I'm running at the desired framerate with plenty of performance headroom. Perhaps integrating Smart Access Video into FFmpeg would help.

I'll try interpolating to a different framerate to check whether it's a performance issue.
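The failure loop described above can be avoided by giving the capture loop a small failure budget before it reinitializes. The following is only a sketch with hypothetical stand-in types (`PollResult`, `runCaptureLoop` and the callbacks are not AMF or Sunshine names), and it assumes the first few failed calls after startup are transient, which the thread does not confirm.

```cpp
#include <functional>

// Hypothetical stand-ins; these are not AMF or Sunshine types.
enum class PollResult { Ok, Failed };

struct CaptureLoopStats {
    int framesCaptured = 0;
    int reinitCount = 0;
};

// Polls `queryOutput` `maxPolls` times. `reinit` is only called after
// `failureBudget` consecutive failures, so a few failed calls right after
// startup do not restart the component over and over.
CaptureLoopStats runCaptureLoop(const std::function<PollResult()>& queryOutput,
                                const std::function<void()>& reinit,
                                int maxPolls, int failureBudget) {
    CaptureLoopStats stats;
    int consecutiveFailures = 0;
    for (int i = 0; i < maxPolls; ++i) {
        if (queryOutput() == PollResult::Ok) {
            ++stats.framesCaptured;
            consecutiveFailures = 0;
            continue;
        }
        if (++consecutiveFailures >= failureBudget) {
            reinit();
            ++stats.reinitCount;
            consecutiveFailures = 0;
        }
    }
    return stats;
}
```

With a source that fails its first three calls after every initialization, a budget of 1 reinitializes on every poll and never captures a frame, while a budget of 5 rides out the startup failures without reinitializing at all.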

radugrecu97 commented 3 weeks ago

I'll get the logs for the other issue (out-of-order frames in the two modes) and upload a video sample as well.

I also briefly tried the FRC component but was getting error 4 during init. I can get logs later today. I'm on the latest AFMF 2 Preview driver.

    // Create the FRC component
    amf::AMFComponentPtr frcComp;
    result = amf_factory->CreateComponent(context, AMFFRC, &(frcComp));
    if (result != AMF_OK) {
      BOOST_LOG(error) << "CreateComponent(AMFFRC) failed: "sv << result;
      return -1;
    }

    frcComp->SetProperty(AMF_FRC_ENGINE_TYPE, FRC_ENGINE_DX11);
    frcComp->SetProperty(AMF_FRC_MODE, FRC_ON);
    frcComp->SetProperty(AMF_FRC_INDICATOR, true);
    frcComp->SetProperty(AMF_FRC_PROFILE, FRC_PROFILE_HIGH);
    frcComp->SetProperty(AMF_FRC_MV_SEARCH_MODE, FRC_MV_SEARCH_NATIVE);

    // Initialize the FRC component
    result = frcComp->Init(amf::AMF_SURFACE_P010, 0, 0);
    if (result != AMF_OK) {
      BOOST_LOG(error) << "FRCComp::Init() failed: "sv << result;
      return -1;
    }

Also, even with the DVR sample, the cursor is not visible unless it is set to a very large size. When I was streaming onto a device with two active monitors enabled, the cursor could be recorded at a smaller size, but still not the smallest. As soon as I deactivated one of the monitors, it disappeared from the stream and again needed the very large size to be recorded.

Of course, I can open separate issues about this; I'm first trying to clarify whether it's user error.

MikhailAMD commented 3 weeks ago

FRC is working with 8-bit surfaces only. P010 is not supported.

AMF DirectCapture does not include cursor. The assumption is that it can be captured using Win32 APIs and streamed separately. If you see cursor in the captured frame, it is a bug in the driver or very unusual display setup where cursor is not HW.

radugrecu97 commented 3 weeks ago

Ah, shame about 8-bit. I'm a bit confused about the Video Converter: can it do DisplayCapture HDR-to-HDR encoding?

Sunshine does have a cursor overlay component. I'll look into why it only works with certain sizes.

MikhailAMD commented 3 weeks ago

Converter can do all sorts of conversions, HDR to SDR and opposite but the best thing is that you can submit captured RGBA surface directly to encoder bypassing conversion. It supports HDR and SDR RGBA: RGBA8, BGRA8, RGBA_F16, R10G10B10A2.

cgutman commented 3 weeks ago

Sunshine does have a cursor overlay component. I'll look into why it only works with certain sizes.

Sunshine display capture implementations are expected to be able to provide frames that (optionally) have the cursor included. For WGC capture, that is done by simply asking the OS to include the cursor in the captured frames. For DXGI DDA capture, that requires rendering the hardware cursor image into the frames ourselves. An AMF DirectCapture implementation would require something like DDA does where it renders the cursor into the frame before handing it off for encoding.

For cases with unusual cursor sizes, you are seeing the cursor probably because it falls outside the size limits for hardware cursor rendering. You will also see the cursor when dragging windows around since Windows switches to software cursor rendering for that case to keep the cursor drawing in sync with the window being moved.

ns6089 commented 3 weeks ago

AMF DirectCapture does not include cursor. The assumption is that it can be captured using Win32 APIs and streamed separately.

@MikhailAMD But doesn't this make perfect cursor-to-frame synchronization basically unattainable for screen recording software? The capture backend is the exclusive holder of this synchronization data; if you don't expose it, it gets lost.

MikhailAMD commented 3 weeks ago

This is a question of performance and of interaction with the game on the GPU. Modern display engines handle the cursor bitmap in a separate plane, and any effort to merge it into the captured frame requires GPU processing. Note that you cannot draw the cursor on the captured frame if the frame represents the visible screen, as in AMF DirectCapture, so one would be forced to make a surface copy. The DD API makes a copy anyway. In any case, the copy and the blit of the cursor would be done by a GPU job, whether by the OS or by the streaming app, and it would interfere with the game. So passing the cursor bitmap separately is more efficient. That said, sometimes it is convenient to embed the cursor into the captured frame. We are debating whether we should introduce such a feature in DirectCapture.
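As an illustration of the copy-then-blit step discussed above, here is a minimal CPU sketch. It assumes tightly packed 8-bit BGRA buffers and a cursor bitmap with straight (non-premultiplied) alpha; `BgraImage` and `compositeCursor` are hypothetical names, and a real pipeline would do this work on the GPU rather than in a loop like this.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side image type for illustration: BGRA, 8 bits per channel,
// tightly packed rows.
struct BgraImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // width * height * 4 bytes
};

// Returns a copy of `frame` with `cursor` alpha-blended at (x, y).
// The input frame is left untouched, mirroring the need to copy the captured
// surface before drawing on it. Cursor pixels outside the frame are clipped.
BgraImage compositeCursor(const BgraImage& frame, const BgraImage& cursor, int x, int y) {
    BgraImage out = frame;  // the surface copy
    for (int cy = 0; cy < cursor.height; ++cy) {
        for (int cx = 0; cx < cursor.width; ++cx) {
            const int fx = x + cx;
            const int fy = y + cy;
            if (fx < 0 || fy < 0 || fx >= frame.width || fy >= frame.height) {
                continue;
            }
            const std::size_t src = (static_cast<std::size_t>(cy) * cursor.width + cx) * 4;
            const std::size_t dst = (static_cast<std::size_t>(fy) * frame.width + fx) * 4;
            const unsigned alpha = cursor.pixels[src + 3];
            // Blend the three color channels; the frame's alpha byte is kept.
            for (int c = 0; c < 3; ++c) {
                const unsigned blended =
                    (cursor.pixels[src + c] * alpha +
                     out.pixels[dst + c] * (255u - alpha) + 127u) / 255u;
                out.pixels[dst + c] = static_cast<std::uint8_t>(blended);
            }
        }
    }
    return out;
}
```

The extra copy and the per-pixel blend are the cost the reply refers to; passing the cursor bitmap and position to the client separately avoids both.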

radugrecu97 commented 2 weeks ago

Converter can do all sorts of conversions, HDR to SDR and opposite but the best thing is that you can submit captured RGBA surface directly to encoder bypassing conversion. It supports HDR and SDR RGBA: RGBA8, BGRA8, RGBA_F16, R10G10B10A2.

I'm having several issues trying to integrate this with Sunshine. There are far too many mismatched pixel formats between FFmpeg, AMF and Sunshine.

In SDR, DisplayCapture reports my monitor as AMF_SURFACE_BGRA.

Sunshine uses AVFrames, and amfenc.c in FFmpeg maps this as { AV_PIX_FMT_BGR0, AMF_SURFACE_BGRA }.

The problem is that hwcontext_d3d11va.c doesn't support AV_PIX_FMT_BGR0, only AV_PIX_FMT_BGRA, so I'm stuck at the moment.

I'll probably try some patches like this next, so that the Video Converter supports 10-bit.

MikhailAMD commented 2 weeks ago

You can add AV_PIX_FMT_BGRA to AMF_SURFACE_BGRA mapping. Under the hood it is the same DXGI format.
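A minimal sketch of what the suggested extra mapping could look like, using local stand-in enums so it compiles without FFmpeg or AMF headers. Only the BGR0 and BGRA rows reflect what the thread states; the other rows, the `YUV420P` enumerator and the `toAmfSurface` helper are illustrative assumptions, not the contents of FFmpeg's actual table.

```cpp
// Local stand-ins; the enumerator names mirror identifiers from the thread,
// but these are not the FFmpeg or AMF enums.
enum class PixFmt { NV12, P010, BGR0, BGRA, RGBA, YUV420P };
enum class AmfSurface { Unknown, NV12, P010, BGRA, RGBA };

struct FormatMapEntry {
    PixFmt av;
    AmfSurface amf;
};

// Lookup table with two source formats sharing one AMF surface format.
const FormatMapEntry kFormatMap[] = {
    {PixFmt::NV12, AmfSurface::NV12},  // illustrative row
    {PixFmt::P010, AmfSurface::P010},  // illustrative row
    {PixFmt::BGR0, AmfSurface::BGRA},  // mapping quoted in the thread
    {PixFmt::BGRA, AmfSurface::BGRA},  // added row suggested in the reply
    {PixFmt::RGBA, AmfSurface::RGBA},  // illustrative row
};

// Returns the mapped surface format, or Unknown when no row matches.
AmfSurface toAmfSurface(PixFmt fmt) {
    for (const FormatMapEntry& entry : kFormatMap) {
        if (entry.av == fmt) {
            return entry.amf;
        }
    }
    return AmfSurface::Unknown;
}
```

Because the lookup scans by source format, adding the BGRA row does not disturb the existing BGR0 row; both resolve to the same surface format.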

radugrecu97 commented 1 week ago

Converter can do all sorts of conversions, HDR to SDR and opposite but the best thing is that you can submit captured RGBA surface directly to encoder bypassing conversion. It supports HDR and SDR RGBA: RGBA8, BGRA8, RGBA_F16, R10G10B10A2.

But Sunshine uses specific pixel formats (NV12 and P010). I don't see how I can do without the Video Converter.

MikhailAMD commented 1 week ago

That's the point: to take advantage of this feature, one would need to change the pipeline and remove the color converter. We may need to update the DVR sample to showcase this, but you can check the EncoderLatency sample: it has an option to generate or read RGBA surfaces and submit them to the encoder directly, without the color converter, as long as scaling is not needed. One can do the same with captured surfaces, whether from AMF capture or DD.
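The pipeline decision described above could be sketched as a small predicate. The format list follows the one given earlier in the thread (RGBA8, BGRA8, RGBA_F16, R10G10B10A2) and the "no scaling" condition follows this reply; the type and function names are hypothetical, and the sketch does not call into AMF.

```cpp
// Hypothetical names; this is a planning helper, not an AMF API.
enum class CapturedFormat { RGBA8, BGRA8, RGBA_F16, R10G10B10A2, Other };

struct FrameSize {
    int width;
    int height;
};

// True for the RGBA-family formats the thread lists as accepted by the encoder.
bool isDirectEncoderInput(CapturedFormat format) {
    switch (format) {
        case CapturedFormat::RGBA8:
        case CapturedFormat::BGRA8:
        case CapturedFormat::RGBA_F16:
        case CapturedFormat::R10G10B10A2:
            return true;
        case CapturedFormat::Other:
            return false;
    }
    return false;
}

// The converter can be skipped only when no scaling is needed and the captured
// format is one the encoder takes directly.
bool canBypassConverter(CapturedFormat format, FrameSize captured, FrameSize encoded) {
    const bool sameSize =
        captured.width == encoded.width && captured.height == encoded.height;
    return sameSize && isDirectEncoderInput(format);
}
```

A streaming app that needs a lower stream resolution than the desktop would still fall back to the converter path under this rule.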