blaind / xrbevy

Proof-of-concept of getting OpenXR rendering support for Bevy game engine using gfx-rs abstractions
64 stars 4 forks source link

Poor performance - FPS halved from what's possible #5

Closed blaind closed 3 years ago

blaind commented 3 years ago

It seems that currently the reported FPS numbers are exactly 50% of what the devices are capable. E.g. on oculus quest 2 with 72fps the example runs at 36fps, and with quest 2 set to 90fps, example reports 45 fps.

Also detectable with monado:

 WARN [log_frame_time_diff] Frame late by 16.67ms!
 WARN [log_frame_time_diff] Frame late by 16.67ms!
 WARN [log_frame_time_diff] Frame late by 33.33ms!
 WARN [log_frame_time_diff] Frame late by 16.67ms!
 WARN [log_frame_time_diff] Frame late by 33.33ms!

Possibly every 2nd frame is waited for, but not rendered?

blaind commented 3 years ago

Current loop

OpenXR call XR Duration Explanation
frame_waiter.wait() xrWaitFrame 6-20ms? XR: xrWaitFrame throttles the application frame loop in order to synchronize application frame submissions with the display. xrWaitFrame returns a predicted display time for the next time that the runtime predicts a composited frame will be displayed. The runtime may affect this computation by changing the return values and throttling of xrWaitFrame in response to feedback from frame submission and completion times in xrEndFrame.
frame_stream.begin() xrBeginFrame <0.1ms XR: xrBeginFrame is called prior to the start of frame rendering
[openxr_event_system] <3ms? Getting XR view positions, etc. (in openxr_event_system)
sc_handle.wait_image() xrWaitSwapchainImage <0.1ms XR: Before an application can begin writing to a swapchain image, it must first wait on the image to avoid writing to it before the compositor has finished reading from it.
wgpu.update() 20ms Bevy wgpu render graph execution
sc_handle.release_image() xrReleaseSwapchainImage <0.5ms XR: Once an application is done submitting commands that reference the swapchain image, the application must release the swapchain image.
frame_stream.end() xrEndFrame <0.1ms XR: xrEndFrame may return immediately to the application. XrFrameEndInfo::displayTime should be computed using values returned by xrWaitFrame. The runtime should be robust against variations in the timing of calls to xrWaitFrame, since a pipelined system may call xrWaitFrame on a separate thread from xrBeginFrame and xrEndFrame without any synchronization guarantees.

Based on debug info:

===============================================================
frame_waiter.wait() at 417.012108ms
    display_time=709282124, duration=16.666666ms at 429.275358ms
frame_stream.begin() at 429.283293ms
    done at 429.305484ms
wait_image at 431.493454ms
   done 431.509183ms
Running wgpu graph
release_image at 450.310183ms
frame_stream.end at 450.401544ms display_time=709282124 duration=16.666666ms
    done at 450.451127ms
===============================================================
frame_waiter.wait() at 451.382558ms
    display_time=742615456, duration=16.666666ms at 461.253827ms
frame_stream.begin() at 461.259017ms
    done at 461.274676ms
wait_image at 463.857143ms
   done 463.875607ms
Running wgpu graph
release_image at 484.933424ms
frame_stream.end at 485.025717ms display_time=742615456 duration=16.666666ms
    done at 485.076452ms
===============================================================
frame_waiter.wait() at 486.031056ms
    display_time=775948788, duration=33.333332ms at 492.975004ms
frame_stream.begin() at 492.979001ms
    done at 492.991354ms
wait_image at 495.794023ms
   done 495.809842ms
Running wgpu graph
release_image at 514.039453ms
frame_stream.end at 514.151412ms display_time=775948788 duration=33.333332ms
    done at 514.211164ms
===============================================================
frame_waiter.wait() at 515.135221ms
    display_time=809282120, duration=33.333332ms at 525.558232ms
frame_stream.begin() at 525.563171ms
    done at 525.57868ms
wait_image at 528.510129ms
   done 528.531779ms
Running wgpu graph
release_image at 550.730609ms
frame_stream.end at 550.837138ms display_time=809282120 duration=33.333332ms
    done at 550.887682ms
===============================================================

It seems that display_time difference between frames is ~33ms. Maybe what happens is that runtime detects skipped frame (took too long to render?), and returns the next time after skipped frame? Since this happens multiple times, eventually runtime reports the expected duration as 33ms instead of 16ms. Seems to occur ~5 frames after startup.

blaind commented 3 years ago

Crucial part in OpenXR spec:

blaind commented 3 years ago

This is no longer as big of a problem - there was some changes to lifecycle earlier, and bevy performance may also have improved.