raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.
dispmanx updates on RPi 4 take double the time #1154

Open kapetanos opened 5 years ago

kapetanos commented 5 years ago

When running dispmanx updates on a dispmanx layer on an RPi 4 (driving a 1920x1080 monitor at 60Hz), I am getting only 30 frames per second.

Using the same code on previous models (e.g. RPi 3B+) it works as expected, running at 60 frames per second.

I am using the latest Raspbian Buster. In both cases the FKMS driver is enabled.

You can reproduce the issue with the following gist. It is just hello_dispmanx with a few changes to count the frames per second.

Any ideas?

popcornmix commented 5 years ago

Just to confirm, can you report the output of tvservice -s?

kapetanos commented 5 years ago

state 0xa [HDMI CUSTOM RGB lim 16:9], 1920x1080 @ 60.00Hz, progressive

6by9 commented 5 years ago

This needs review internally, and some more digging into how the hardware is behaving.

Due to a quirk in the way the hardware was behaving, the loading of the display list into the live HVS registers moved from an interrupt approx 32 lines before the end of the frame, to the start of frame. This means that your dispmanx_update_submit_sync has almost certainly missed the slot for the current frame as you only get the vblank interval. Your data therefore isn't on the screen until the end of the next frame, and dispmanx_update_submit_sync is doing the correct thing by waiting until it is on screen. This is now later than you are expecting.

We need to get a basic test case up and running and use simulations to try to convince the old interrupt to work reliably.

kapetanos commented 5 years ago

Thank you for the answer. Any rough estimation on when this will be checked / fixed?

6by9 commented 5 years ago

The issue observed only occurred if you had more than one channel of the HVS running simultaneously, e.g. dual displays, or one display and the transposer. We've been discussing whether there is the potential to add a config.txt flag to drop back to the old EOLn interrupt on the basis that you don't use 2 channels, but how you enforce that isn't clear. Building a full understanding of what is going on in the hardware is likely to take at least a couple of weeks.

squidrpi commented 5 years ago

Exact same issue here. Some more sample code that shows this issue is here and here, if you want it. I haven't had a chance to try my emulators (in the same git repo) mame4all-pi, pifba and pisnes yet, but they use a similar method with dispmanx.

6by9 commented 5 years ago

We're testing a potential fix at the moment. It certainly fixes the original test case, but there are a few more tests required before release.

squidrpi commented 5 years ago

Much better in 4.19.56, and the 60fps sync is now OK, but there are some occasional random stutters that I don't see on previous RPis. Run pibounce and watch a raspberry for a few minutes; you should see a missed frame occasionally.

Brunnis commented 5 years ago

I wonder if this issue is related to a peculiar one I've been observing: I'm testing RetroArch using OpenGL on Buster Lite. With RetroArch's stock settings and before running rpi-update, I got severe tearing and some stuttering. The tearing was of a strange kind with at least two tearlines, seemingly showing one frame, then another, then the other frame again.

Yesterday, I ran rpi-update and got 4.19.57. After this update, everything worked fine with default RetroArch settings. I then changed RetroArch's video_max_swapchain_images setting from default 3 to 2. This is a latency reducing setting that instructs RetroArch to render a frame, submit it to the GPU and wait for page flip before starting to render the next frame. With this setting active, the original problem came back, i.e. severe tearing.

It should also be mentioned that with video_max_swapchain_images=2, the frame rate hovers just below 60 FPS, at around 57 FPS. With video_max_swapchain_images=3, the hardware is capable of ~150 FPS rendering the same scene.

It should be mentioned that this setting (video_max_swapchain_images) is well tested and works fine on both older Pis and x86 hardware.

I thought I'd ask if this seems related to this issue before I report it as a new issue.

popcornmix commented 5 years ago

Tearing is a known issue (e.g. it affects kodi) and is related to this. Your issue is probably the same. It is being worked on.

6by9 commented 5 years ago

Your comment about tearing with 2 buffers instead of 3 did give me a thought as to another place where issues may be occurring. I'm trying something out...

Brunnis commented 5 years ago

@6by9 What's the current status on this issue? Is it being worked on?

6by9 commented 5 years ago

Nothing is actively being done on this at present. The original issue of dispmanx updates completing at half the expected speed is resolved.

As far as I'm aware there are no issues between when the update callback is made vs when things are on/off screen, but relying on vsync may be problematic. Relying on vsync is a dubious practice anyway. There is an issue when submitting frames via DRM/KMS and when it completes them. Due to resourcing issues that is on the back burner at present though.

JeanValjean2 commented 5 years ago

Sorry for hijacking the thread, but I'm doing some bare-metal stuff and I jumped when I read the phrase "Relying on vsync is a dubious practice anyway".

Am I missing something, or am I misunderstanding you? In short: is there any other way to display something at 60Hz without using vsync?

I'm genuinely interested since I love programming asm/C stuff on the Pis :)

6by9 commented 5 years ago

The display list for the next vsync is compiled around 32 lines before the end of the frame. If you submit an update after that point then it will be delayed until the following vsync. Therefore relying solely on vsync is dubious unless you are also sure you are early enough in the frame.

DispmanX provides an update-complete callback which only triggers when the update has actually completed, and is therefore significantly safer than relying solely on vsync callbacks.

squidrpi commented 5 years ago

Just a tip for time-critical vsync waits: don't use usleep in the wait loop, as it gets affected by interrupts. I experienced this causing random stutter on Dispmanx vsync-based drawing as reported above; it appears the Pi 4 is more affected by this than the Pi 2/3. Removing the usleep eliminated the stutter on the Pi 4. The other option is to use GLES and let it handle the vsync drawing automatically.

Brunnis commented 5 years ago

"There is an issue when submitting frames via DRM/KMS and when it completes them. Due to resourcing issues that is on the back burner at present though."

@6by9 So is that issue the cause of the problem I described with DRM/KMS and double buffering in RetroArch (i.e. tearing)? Is there another Github issue that tracks this?

fluyup commented 5 years ago

Occasional random video stutters are still here (using a "Gert vga666 Display" on a Pi4 / fkms). @6by9: On 22 Oct, you explained a possible reason for this behavior. Is it possible to tune the "32 lines before the end of the frame" setting? Maybe a parameter for config.txt would be useful? Thank you.

6by9 commented 5 years ago

Tweaking that 32 lines parameter is playing chicken with not getting the update done in time - it's not a solution.

Having reviewed the full KMS kernel driver, I can see it takes an alternative approach. It generates the entire dlist on every update request, and once generated it pokes the start pointer into the hardware. That removes any timing constraint from the dlist generation, but adds a fair amount more memory management into the system. It's a much more elegant solution, and I am investigating whether that can be adopted within the firmware. Being such a fundamental change, it's not something undertaken lightly or without a large amount of testing though.

fluyup commented 5 years ago

OK, this doesn't sound like a solution is achievable in the near future. Until then, rpi3b(+) seems to be the solution for RetroArch. Thank you very much for your efforts.