Minimizing input lag - Githubissues

HalfVoxel commented 4 years ago

Hi

I've been trying to figure out how to minimize input lag while using wgpu and winit (which seems to be the standard). This has turned out to be tricky due to how the borrowing of swapchain images works.

My current render loop looks something like

event_loop.run(move |events, _, control_flow| {
     // Process events
     let frame = swap_chain.get_next_texture();
     // Render
}

The problem is that with vsync get_next_texture may block. Usually it blocks for 16ms (60 Hz). This results in that even if there would be input events during those 16 ms they will not be detected by the program until the next frame.

Trying to acquire the frame at the end of the event loop

let mut next_frame = ...;
event_loop.run(move |events, _, control_flow| {
     // Process events
     // Render
     next_frame = swap_chain.get_next_texture();
}

does not seem to work very well. Maybe my limited rust skills is preventing me from seeing an obvious solution, but I can not get that to work without the compiler complaining.

Using the mailbox present mode seems like it would help. However wgpu does not seem to support that (is there any reason for this?).

So the different solutions that I could see for this are

Add a method to the swap chain that blocks until an image is ready, but don't actually acquire it.
Support the mailbox present mode
Don't use vsync. This does indeed work, however it will possibly introduce tearing, which I'd like to avoid.

Does anyone have any other suggestions for how to minimize the input lag?

kvark commented 4 years ago

That's a great issue, thanks for filing! I agree that blocking for this period and not receiving any inputs is sad.

Ideally, we'd shape the API in such a way that you get an "next frame" event in the same event loop. However, this would require tying the library to winit and not entirely clear how to implement this.

Similarly, perhaps we could have a try_get_next_texture method that doesn't block. Not clear how that would map to the web target, or how you'd know when to call it.

Another idea is to go full async. We already have async for buffer mapping, we'll need it for adapter/device creation. There is no reason to leave out any blocking methods therefore. Of course, it doesn't answer the question of how the reactor would know when the frame is ready, or how it would integrate with winit's event loop, but at least it's consistent with the rest of the API.

As for the mailbox mode, it's just not universally supported on all the platforms :/

kvark commented 4 years ago

Adding to the mentioned points (not discarding them!).

Question: is your concern that within, say, 10ms of block time, you wouldn't know when exactly an input event was triggered? Or something else? Asking because technically you should be able to process events after you acquire a frame, so no inputs are lost.

Finally, as a crazy idea, you could have a render loop entirely running on a separate thread: just getting the frame and rendering it. The main event loop would then always be open for events and update some shared state with the renderer.

HalfVoxel commented 4 years ago

Thank you for responding so quickly.

I also noticed this issue which seems relevant https://github.com/gfx-rs/wgpu-rs/issues/139

Question: is your concern that within, say, 10ms of block time, you wouldn't know when exactly an input event was triggered? Or something else? Asking because technically you should be able to process events after you acquire a frame, so no inputs are lost.

No. My concern is that with winit and a blocking get_next_texture it seems impossible to structure the code so that when my code starts rendering, it will have up to date input information. All events will be about 16 ms old since the get_next_texture blocked for that amount. Technically I should be able to process events after I acquire a frame. However in rust this seems very tricky due to the borrow checker. At least I could not figure out how to do it.

I noticed this as I have some dragging functionality in my application and I noticed the dragged item was lagging behind the mouse cursor a bit. Making the application feel a bit sluggish.

Finally, as a crazy idea, you could have a render loop entirely running on a separate thread: just getting the frame and rendering it. The main event loop would then always be open for events and update some shared state with the renderer.

That is true. That would indeed solve the problem. It's not a very clean solution though :p.

grovesNL commented 4 years ago

@kvark regarding winit/others event loops + async options: recently there have been some API designs similar to this, which I thought were really interesting

The event loop/future integration are only a few lines of code in both of these crates. Both crates also work natively and on the web (within the browser's event loop).

kvark commented 4 years ago

@HalfVoxel I just quickly tried to process events after waiting, and indeed we hit a problem with how borrowing works. It seems difficult to explain to Rust how the lifetimes of a frame and the swapchain, with regards to the event loop... Something like wait_for_next_frame() function would help, as one of the ways. I believe going full async is a better solution nonetheless.

HalfVoxel commented 4 years ago

@grovesNL That blinds repo does look very interesting. That would allow me to acquire a frame before I process all the input (or to make better use of the time: 1. process inputs, 2. acquire a frame, 3. process additional inputs if any new ones had been added, 4. render). It does actually seem to do pretty much what @kvark suggested above. I.e. run the event loop on another thread and feed all events to the user controlled thread using a threadsafe queue.

@kvark Going async sounds interesting. I'm not so experienced with Rust async yet though, so I cannot tell what the consequences for the API will be. This is something that will definitely require an API change in wgpu(-rs?) though? It cannot be done right now?

rukai commented 4 years ago

The event loop has to be running on the main thread. Mac os makes this restriction but winit enforces it on all platforms.

You can also no longer send winit events across threads as of 0.21.0

HalfVoxel commented 4 years ago

The event loop has to be running on the main thread. Mac os makes this restriction but winit enforces it on all platforms.

Okay. So my wording was bad. the blinds repo runs the event loop in the main thread, but it starts another thread that it sends all the events to. It does use winit 0.20.0, so it might be incompatible with 0.21.0... That would be sad.

aloucks commented 4 years ago

As for the mailbox mode, it's just not universally supported on all the platforms :/

Mailbox seems to be the preferred mode when available. Would you be open to a PR that added Mailbox but falls back to FIFO when it isn't supported?

aloucks commented 4 years ago

I'm only seeing input lag with the DX12 backend. The vulkan backend on windows seems to be fine with or without vsync (although I'd still like to have Mailbox present mode).

kvark commented 4 years ago

If it fixes dx12, sure. It's not clear though why dx12 is presenting worse than vulkan. Looks like some investigation is needed.

On Mar 5, 2020, at 01:58, aloucks notifications@github.com wrote:

I'm only seeing input lag with the DX12 backend. The vulkan backend on windows seems to be fine with or without vsync (although I'd still like to have Mailbox present mode).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

Systemcluster commented 4 years ago

I'm only seeing input lag with the DX12 backend. The vulkan backend on windows seems to be fine with or without vsync (although I'd still like to have Mailbox present mode).

If it's only the DX12 backend, it might be related to gfx-rs/wgpu#488?

aloucks commented 4 years ago

This memory leak is still present as well: https://github.com/gfx-rs/wgpu/issues/105

If I force the cube example to use DX12, memory increases rapidly.

kvark commented 4 years ago

@aloucks you've been contributing a lot of good changes lately. Do you want to help looking at this problem? I'm planning on looking at it as well soon.

aloucks commented 4 years ago

@kvark

I don't think there's any backpressure on frame acquisition.

https://github.com/gfx-rs/gfx/blob/a9e2c7ed5974844f4dc8c1f0d59b151a8260ba5c/src/backend/dx12/src/window.rs#L208-L229

If I add std::thread::sleep_ms(16) in the render loop of the examples, the memory leak dries up (or at least becomes so small that it's not obvious in task manager).

I'm not sure if this is the entirety of the problem but its certainly contributing to it.

aloucks commented 4 years ago

It looks like presentation is requesting vsync though, so I'm not too sure:

https://github.com/gfx-rs/gfx/blob/a9e2c7ed5974844f4dc8c1f0d59b151a8260ba5c/src/backend/dx12/src/window.rs#L54

aloucks commented 4 years ago

I'm really not sure how this maps to vulkan, but I think SwapEffect: dxgi::DXGI_SWAP_EFFECT_FLIP_DISCARD, probably needs to be D3DSWAPEFFECT_FLIP (at least for Fifo)

https://github.com/gfx-rs/gfx/blob/a9e2c7ed5974844f4dc8c1f0d59b151a8260ba5c/src/backend/dx12/src/device.rs#L1043

kvark commented 4 years ago

Notably, this problem can totally become irrelevant depending on how we resolve https://github.com/gfx-rs/gfx/issues/3184 . It would still be great to understand what's going on here, of course. @aloucks there is no such thing as D3DSWAPEFFECT_FLIP AFAIS.

aloucks commented 4 years ago

@kvark Sorry, that was a copy/paste error. I meant DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL (D3DSWAPEFFECT_FLIP is for D3D9).

https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/d3d10-graphics-programming-guide-dxgi#create-a-swap-chain

The docs also suggest using Present1 instead of Present.

For the record, D3D is not something I'm familiar with, I'm just posting what I've found so far :)

kvark commented 4 years ago

It's worth a try! Do you have windows to test? I also filed https://github.com/gfx-rs/gfx/issues/3193 closer to the root of the problem.

aloucks commented 4 years ago

DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL didn't seem to help, although I'm not sure if there was anything else that also needed to change.

kvark commented 4 years ago

I believe this is fixed by https://github.com/gfx-rs/gfx/pull/3239, which is now published as a patch. You can pick it up by cargo update -p gfx-backend-dx12. Please feel free to re-open if the issue persists!

gfx-rs / wgpu-rs

Minimizing input lag #172