GPUOpen-LibrariesAndSDKs / AMF

The Advanced Media Framework (AMF) SDK provides developers with optimal access to AMD devices for multimedia processing

[Question] How 'low-latency-mode' affect encoding latency? #500

Closed: MemeTao closed this issue 1 month ago

MemeTao commented 1 month ago

In the following two cases, the encoding latency is totally different (both have low-latency mode set):

case 1:

// Continuously submit frames
setProperty(low-latency-mode, true);
setProperty(query-timeout, 50);
while (true) {
    auto t1 = cur_time();
    submitFrame();
    queryOutput();
    auto t2 = cur_time();
    auto took_ms = t2 - t1;   // 1080p takes about 4 ms on my PC
}

case 2:

// Submit a frame every 16 ms
setProperty(low-latency-mode, true);
setProperty(query-timeout, 50);
while (true) {
    auto t1 = cur_time();
    submitFrame();
    queryOutput();
    auto t2 = cur_time();
    auto took_ms = t2 - t1;   // 1080p takes about 12 ms on my PC
    sleep(16 - took_ms);
}

Both of them use low-latency mode, so why is the encoding latency so different?

It can be reproduced with the 'EncoderLatency' example.

MikhailAMD commented 1 month ago

Hard to tell without the actual code, but a few random thoughts:

* Setting the timeout property is static, meaning it must be done before the Init() call or it has no effect (see the sketch after this list).

* The precision of the Windows Sleep function, or of similar waits on events, is very poor; it can be increased, see the AMF samples.

* The sleep time may be negative.

* What is your method of getting the time, from an accuracy perspective?

* Why don't you measure the time you actually spend in sleep the same way you measure the encode time?

* What do you submit? Do you have any GPU operation other than the encoder? If so, adding the sleep may change clocks caused by power management on the GFX or Compute queue.

* If you have PDBs and record a GPUView ETL log, you can see the stacks and why you get the difference.
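A minimal sketch that pulls these points together, assuming an already-created encoder and input surface (as in the EncoderLatency sample) and the low-latency / query-timeout properties referenced in this thread; error checking and the rest of the sample's setup are omitted:

```cpp
// Sketch only. Assumes `encoder` and `surface` were created beforehand
// (factory, context, surface allocation) as in the EncoderLatency sample.
#include <chrono>
#include <thread>
#include <windows.h>
#include <timeapi.h>                                    // timeBeginPeriod/timeEndPeriod, link winmm.lib

#include "public/common/AMFFactory.h"                   // AMF headers as used in the samples
#include "public/include/components/VideoEncoderVCE.h"  // AMF_VIDEO_ENCODER_* properties

void EncodeLoop(amf::AMFComponentPtr encoder, amf::AMFSurfacePtr surface)
{
    using Clock = std::chrono::steady_clock;

    // Static properties must be set BEFORE Init(), otherwise they have no effect.
    encoder->SetProperty(AMF_VIDEO_ENCODER_LOWLATENCY_MODE, true);
    encoder->SetProperty(AMF_VIDEO_ENCODER_QUERY_TIMEOUT, 50); // ms, makes QueryOutput() wait
    encoder->Init(amf::AMF_SURFACE_NV12, 1920, 1080);

    timeBeginPeriod(1); // raise the Windows timer resolution so Sleep() is roughly 1 ms accurate

    for (int i = 0; i < 300; ++i)
    {
        auto t1 = Clock::now();
        encoder->SubmitInput(surface);
        amf::AMFDataPtr out;
        encoder->QueryOutput(&out);
        auto t2 = Clock::now();
        long long encodeMs =
            std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();

        // Clamp so the sleep time can never go negative, and measure the time
        // actually spent sleeping with the same clock as the encode time.
        long long sleepMs = 16 - encodeMs;
        if (sleepMs < 0) sleepMs = 0;
        auto s1 = Clock::now();
        std::this_thread::sleep_for(std::chrono::milliseconds(sleepMs));
        auto s2 = Clock::now();
        long long sleptMs =
            std::chrono::duration_cast<std::chrono::milliseconds>(s2 - s1).count();
        (void)sleptMs; // compare sleptMs against sleepMs to see the Sleep inaccuracy
    }

    timeEndPeriod(1);
}
```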

MemeTao commented 1 month ago

> Hard to tell without the actual code, but a few random thoughts:

Thanks for such a detailed answer. I was wrong: the real problem is that low-latency mode is not set.

case 1 (no low-latency mode, no sleep):

res = encoder->SetProperty(AMF_VIDEO_ENCODER_LOWLATENCY_MODE, false);

(screenshot)

case 2 (no low-latency mode, with sleep):

res = encoder->SetProperty(AMF_VIDEO_ENCODER_LOWLATENCY_MODE, false);
//..
amf_sleep(16);

(screenshot)

This problem can be reproduced from here: https://github.com/GPUOpen-LibrariesAndSDKs/AMF/pull/501

I will look at GPUView for more details later (studying it now).

MikhailAMD commented 1 month ago

But in this case the situation seems clear: if the low-latency parameter is false and you insert a sleep after encoding, GPU power management sees that job submissions are spaced out in time and reduces the VCN clocks, so encoding takes longer and latency increases. The main reason for adding the low-latency parameter is to force the clocks to stay high and keep latency low.

MemeTao commented 1 month ago

> But in this case the situation seems clear: if the low-latency parameter is false and you insert a sleep after encoding, GPU power management sees that job submissions are spaced out in time and reduces the VCN clocks, so encoding takes longer and latency increases. The main reason for adding the low-latency parameter is to force the clocks to stay high and keep latency low.

Thanks for the answer. "GPU power management" and "VCN clocks" are really new things for me; I'll learn about them right away. By the way, I found that the encoding latency gets very large when a 3D game is running in the foreground. Is there any D3D API or something to increase the GPU priority of my app (which just captures the desktop and encodes it) while a 3D game is running and taking almost 100% of the GPU's 3D resources?

MikhailAMD commented 1 month ago

Yes, a game can interfere with some graphics jobs, but not with the encoder. To avoid this in the sample, use the "-prerender" parameter, since in this sample graphics is used only to prepare the inputs. Though, if your GPU/APU cannot accept RGBA directly for encoding, the encoder will use a shader-based color converter. Also note that in a real game-streaming use case there is pacing of the streamed frames to a certain framerate; use the "-framerate" or "-hevcframerate" parameter in the sample to emulate it. All of this can be investigated in GPUView.
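On the pacing point, here is an illustrative sketch (not the sample's actual implementation) of pacing submissions to a fixed target framerate; it reuses the assumed encoder/surface objects and includes from the earlier sketch, and sleeping until an absolute deadline also sidesteps the negative-sleep issue from case 2:

```cpp
#include <chrono>
#include <thread>

// Illustrative only: pace frame submission to ~60 fps regardless of how long
// each encode takes. `encoder` and `surface` are the same assumed objects as
// in the earlier sketch; error checks are omitted.
void PacedEncodeLoop(amf::AMFComponentPtr encoder, amf::AMFSurfacePtr surface)
{
    using Clock = std::chrono::steady_clock;
    const auto frameInterval = std::chrono::microseconds(1000000 / 60);
    auto nextDeadline = Clock::now() + frameInterval;

    for (int i = 0; i < 600; ++i)
    {
        encoder->SubmitInput(surface);
        amf::AMFDataPtr out;
        encoder->QueryOutput(&out);

        // Sleeping until an absolute deadline cannot go negative: if encoding
        // overran the frame budget, sleep_until returns immediately.
        std::this_thread::sleep_until(nextDeadline);
        nextDeadline += frameInterval;
    }
}
```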