Closed singalen closed 3 years ago
I've used this code from BTerm examples:
let mut input = INPUT.lock();
if let Some(ev) = input.pop() ...
but thing is, I don't want to consume the event. I want to let BTerm
keep track of the keys, like shift
and other state like mouse_pos
.
Currently, there is no way to check if an event is available. pop()
and for_each_message()
consume events.
Can we please have peek() -> Option<BEvent>
that would return the clone of the current event?
Or for_each_message_ref(&BEvent)
?
Thank you!
Which is probably even more important – I don't see a way to put the process to sleep while waiting for the input event in input_hander.rs
.
Looks like it's bound to consume as much CPU as it can?
Hey @singalen, while we wait for @thebracket to take a look at this, I tried fix which, while not polished, isn't that awful. https://github.com/BiosElement/bracket-lib/commit/e8cd34112ade911bd6d1d460a209301b8e660e67 Currently the fps cap is hard-coded and the exit event doesn't fire correctly, but I think is just a matter of handling it properly in my game.
Basically just using winit's waituntil instead of polling, which means the sleep which adds input latency is removed. CPU usage and input latency both seem reasonable now.
The approach in BiosElement's file is really good. I didn't know winit
had a WaitUntil
function! (I wonder if that was added after I wrote the initial loop?). I'm currently testing a fix based on this approach. I'll push a branch for this issue when I've finished testing locally; I don't have easy access to a Mac to check it with.
On Windows, I'm getting some odd results - WaitUntil
waits properly, unless an input event fires - and then the program zooms at full speed. So you can chug along at a limited FPS, and speed up by wiggling the mouse. Will keep debugging, but it may be a bit longer before the branch arrives. Seems to be related to https://github.com/rust-windowing/winit/issues/1610
There's a potential fix in the branch https://github.com/amethyst/bracket-lib/tree/issue-228-high-cpu I ended up combining a few approaches:
control_flow
to WaitUntil
with a pause (unless it's going slow). On Windows, this only does something useful if you aren't doing anything that causes other events to fire - that seems to be a known issue with winit
. On my test Linux setup, it made the loop pause until the time had elapsed.I could use some testing on this, if you don't mind.
I'll be looking at this later today, but in response to the 'zooming' @thebracket , this is by design I believe. WaitUntil 'sleeps' until a certain time OR until an event triggers. So this could be considered 'good' in terms of responsiveness since I'd want the game to be responsive, but within that event trigger there should probably be tickrate limiting.
I did notice it capped at my Desktop's vsync, so this isn't the worst behavior, however things like animations would suffer from this rapidly changing fps.
That's why I went with capping the render/update FPS and WaitUntil - it should still respond to messages rapidly, but without suddenly accelerating your game. It seems to work ok on a few tests I ran, but I imagine it'll need some tweaking.
There's a potential fix in the branch https://github.com/amethyst/bracket-lib/tree/issue-228-high-cpu I ended up combining a few approaches:
I could use some testing on this, if you don't mind.
I think your solution is a bit more elegant than what I came up with, however I am getting the following: thread 'main' panicked at 'attempt to subtract with overflow', C:\Users\William\Documents\Projects\bracket-lib-228\bracket-terminal\src\hal\native\mainloop.rs:136:129
Adding some debug prints, it's trying to subtract wait_time: 33 from next_tick: 83 and failing in spectacular fashion. Since there is already a check for next_tick >= wait_time, swapping the order resolves this panic.
From a few minutes of testing, the FPS is remarkably stable, however it misses almost every keyboard input using the keyboard example and my own project.
That's why I went with capping the render/update FPS and WaitUntil - it should still respond to messages rapidly, but without suddenly accelerating your game. It seems to work ok on a few tests I ran, but I imagine it'll need some tweaking.
Indeed that makes sense, apologies if I was stating the obvious! :)
Looking more closely, it looks like TICK_TYPE is still being set to Poll which only gets overridden with WaitUntil on the 'requested' Poll timeframe. This explains why the input is so delayed. I removed the control flow declaration on like line 181 entirely and that solved the input latency, however it looks like the next_tick variable is never getting reset meaning it gets slower and slower until a few frames in, nothing ticks without input.
Trying the latest issue-228-high-cpu
. So far, CPU is still over 100%.
ControlFlow::WaitUntil
does get assigned, according to my debugger.
If I add .with_fps_cap(30.0)
to main console builder, then only a few first keystrokes gets processed after the game start, and then it freezes, with a high CPU.
I'm trying to debug the change, but failing so far. I'll write a few questions on the commit: https://github.com/amethyst/bracket-lib/commit/60a127bda6c6fc8f4088f96415c86f25617369e1
After a bit of additional research, it also looks like Winit expects OnDrawRequested to be used for static GUI's while MainEventsCleared is intended for games that are redrawing regularly. There has been some discussion with Bevy and that's the path they are leaning towards anyway.
I just pushed a couple of commits that work really well on my Windows system (I'll get to Linux later today - have publisher meetings). There are two major changes:
RedrawRequested
doesn't do anything, now.MainEventsCleared
holds the game logic.Some(details)
- which is applied (and then cleared) when the game logic runs. So resizing isn't quite as fluid - because its locked the to the game's framerate - but it also doesn't get to eat much CPU time.On Windows at least, I see CPU usage in the task manager remain constant while I zoom the window around or resize it like crazy.
On Mac, hello_terminal
from the branch still runs at 93% CPU 8(
I just pushed a commit that significantly changes the winit
event loop. Testing is very positive on my Windows setup, and input lag seems improved.
WindowEvent
checks the window's ID, because you sometimes get events for other windows - which you can drop. This appears to be more prevalent on some platforms than others.clear_input_state
is now only called at the end of a frame. I'm really not sure why it was in NewEvents
, but that seems to have been killing performance.Moved
, Resized
and so-on store the new value and don't apply it until the next frame tick.MainEventsCleared
. It checks to see if the wait_time
for a frame has elapsed before doing anything. It then starts a new instant and checks the time after the render/logic is complete. If there is remaining time before the wait_time
, it switches the control flow mode to WaitUntil
with the desired time. On Windows, that seems to not do a lot - but double-checking the timings, I'm pretty sure it's waiting for the right time period now. If the frame took longer than intended, it switches back to Poll
to ensure that it instantly starts the next frame.If that doesn't do the trick, I'm going to have to find a Mac and a profiler.
Thank you so much for your help!
Here's a profile from hello_terminal
, if it helps. I personally cannot see why the loop runs so frequently.
I added .with_fps_cap(30.0)
to it.
46 22399.0 hello_terminal (34275) :0
45 22399.0 Main Thread 0x3fe788 :0
44 libdyld.dylib 22399.0 start
43 hello_terminal 22399.0 main
42 hello_terminal 22399.0 std::rt::lang_start::h7d29aacc602424af /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/rt.rs:65
41 hello_terminal 22399.0 std::rt::lang_start_internal::h0c37a46739a0311d /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/rt.rs:51
40 hello_terminal 22399.0 std::panic::catch_unwind::h191bc002afc126a7 /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panic.rs:431
39 hello_terminal 22399.0 std::panicking::try::ha45bc5eab1f103eb /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:343
38 hello_terminal 22399.0 std::panicking::try::do_call::h51a4853c94b1bdea /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:379
37 hello_terminal 22399.0 core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h3b22ce68aa2879c2 /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/ops/function.rs:259
36 hello_terminal 22399.0 std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h5fc701173379e5e7 /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/rt.rs:66
35 hello_terminal 22399.0 std::sys_common::backtrace::__rust_begin_short_backtrace::hc6b436e8e451fedf /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125
34 hello_terminal 22399.0 core::ops::function::FnOnce::call_once::ha3beb17d36cecbeb /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
33 hello_terminal 22399.0 hello_terminal::main::h52d6ae211d0f4e91 /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/examples/hello_terminal.rs:106
32 hello_terminal 22399.0 bracket_terminal::bterm::main_loop::hb257a2e5ed030ddc /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/bterm.rs:1062
31 hello_terminal 22399.0 bracket_terminal::hal::native::mainloop::main_loop::he0f41eabe10c855c /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/native/mainloop.rs:106
30 hello_terminal 22399.0 winit::event_loop::EventLoop$LT$T$GT$::run::h29676a0a4cb1dd58 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/event_loop.rs:154
29 hello_terminal 22399.0 winit::platform_impl::platform::event_loop::EventLoop$LT$T$GT$::run::h280f4ebe2ab3700e /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/event_loop.rs:93
28 hello_terminal 22399.0 winit::platform_impl::platform::event_loop::EventLoop$LT$T$GT$::run_return::h65e0ecbf093392f1 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/event_loop.rs:106
27 hello_terminal 22399.0 objc::message::send_message::h4d3d535e8f483d3d /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/mod.rs:178
26 hello_terminal 22399.0 objc::message::platform::send_unverified::h32b6afbb3ec1de3f /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/apple/mod.rs:27
25 hello_terminal 22399.0 _$LT$$LP$$RP$$u20$as$u20$objc..message..MessageArguments$GT$::invoke::he09246ba4dbeac19 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/objc-0.2.7/src/message/mod.rs:128
24 AppKit 22399.0 -[NSApplication run]
23 AppKit 22399.0 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]
22 AppKit 22399.0 _DPSNextEvent
21 HIToolbox 22399.0 _BlockUntilNextEventMatchingListInModeWithFilter
20 HIToolbox 22399.0 ReceiveNextEventCommon
19 HIToolbox 22399.0 RunCurrentEventLoopInMode
18 CoreFoundation 22399.0 CFRunLoopRunSpecific
17 CoreFoundation 22367.0 __CFRunLoopRun
16 CoreFoundation 13098.0 __CFRunLoopDoObservers
15 CoreFoundation 12562.0 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
14 hello_terminal 11622.0 winit::platform_impl::platform::observer::control_flow_end_handler::h33263b830f63805c /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/observer.rs:135
13 hello_terminal 11596.0 winit::platform_impl::platform::app_state::AppState::cleared::he5339b1983da3475 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/app_state.rs:331
12 hello_terminal 8783.0 winit::platform_impl::platform::app_state::Handler::handle_nonuser_event::h34f85d3c91df5522 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/app_state.rs:173
11 hello_terminal 8347.0 _$LT$winit..platform_impl..platform..app_state..EventLoopHandler$LT$T$GT$$u20$as$u20$winit..platform_impl..platform..app_state..EventHandler$GT$::handle_nonuser_event::hdc699a8d8b2f16f3 /Users/vsergiienko/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.24.0/src/platform_impl/macos/app_state.rs:71
10 hello_terminal 8183.0 _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnMut$LT$Args$GT$$GT$::call_mut::h56b585e7e8707bb0 /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/boxed.rs:1553
9 hello_terminal 8142.0 bracket_terminal::hal::native::mainloop::main_loop::_$u7b$$u7b$closure$u7d$$u7d$::h3968b3924eccdd39 /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/native/mainloop.rs:134
8 hello_terminal 7539.0 bracket_terminal::hal::native::mainloop::tock::he668db854cec452c /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/native/mainloop.rs:295
7 hello_terminal 7132.0 bracket_terminal::hal::gl_common::backing::shared_main_loop::rebuild_consoles::h3c083486b0360b5b /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/gl_common/backing/shared_main_loop.rs:70
6 hello_terminal 7113.0 bracket_terminal::hal::gl_common::backing::simple_console_backing::SimpleConsoleBackend::rebuild_vertices::h085fd2d051e8643b /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/gl_common/backing/simple_console_backing.rs:136
5 hello_terminal 5998.0 bracket_terminal::hal::gl_common::backing::simple_console_backing::SimpleConsoleBackend::push_point::h563443ab4fded306 /Users/vsergiienko/src/amusement/bracket-lib/bracket-terminal/src/hal/gl_common/backing/simple_console_backing.rs:70
4 hello_terminal 5148.0 _$LT$alloc..vec..Vec$LT$T$C$A$GT$$u20$as$u20$core..ops..index..IndexMut$LT$I$GT$$GT$::index_mut::hef6550cf91188e38 /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:2393
3 hello_terminal 3826.0 _$LT$alloc..vec..Vec$LT$T$C$A$GT$$u20$as$u20$core..ops..deref..DerefMut$GT$::deref_mut::h68ef25ce6e1a2c3c /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:2327
2 hello_terminal 2198.0 alloc::vec::Vec$LT$T$C$A$GT$::as_mut_ptr::hb478dcefa0cc760e /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:1135
1 hello_terminal 1223.0 core::ptr::mut_ptr::_$LT$impl$u20$$BP$mut$u20$T$GT$::is_null::h5fee71d7c4a78de8 /Users/vsergiienko/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mut_ptr.rs:39
I'm beginning to think that winit
timers are just plain broken. As a test, I set control flow to Now, and changed the event loop to only run on Event::NewEvents(StartCause::ResumeTimeReached{..})
. I think printed "tick" to the console when it runs, and "tock" when sufficient time has elapsed to run the frame. It then never changes control_flow.
The result is kinda alarming:
Tick
Tick
Tick
Tick
Tick
Tick
Tick
Tick
Tick
Tick
Tick
Tock
In other words, even though I only set the timer once - it ran over and over again as fast as it could! Peeking at requested_resume
on the event shows the current time, even though I never requested a resume. That seems to go against what the winit
docs have to say, which guarantee that it will be later than the requested time.
Even stranger, moving the control flow command to the top and using RedrawEventsCleared
seems to honor the display. I just went with a short delay, but it dramatically cut CPU usage on my Windows box.
With the latest change, hello_terminal
is down to 25% CPU, which is a great progress for Mac :)
It should be possible to write a model example that tests winit
timers alone?
I'm available to run a test on Mac.
It definitely sounds like a good idea to make a simpler testbed, but that'll have to wait a bit - I have limited time. What FPS does hello_terminal
example show you? On Windows, with the delay at the full wait_time
I'm seeing it fluctuate between 23 and 26 (target is 30), so it's sleeping longer than it should.
When I cap it to 30, it's similar, 21-25. Unrestricted, it's 49-60FPS, and CPU is around 70%, which is still better than it was.
Thank you. So it's behaving similarly - that means I can do the hard part of trying to make the numbers right. The uncapped CPU is going to be higher than it should, the library itself is quite inefficient at repeatedly redrawing the whole console. That part is on my "for the future" list.
When capped at 30, I'm seeing it around 20-21 FPS, however when rapid keystrokes or mouse movements occur, it locks at 30, sometimes 31. So there's certainly progress here... I'm also trying to tear about Winit to see if something is broken there as well.
The latest commit takes a different approach.
My testing showed that WaitUntil
with a 1ms delay was almost instant, and a 2ms delay could be as long as 30ms on Windows 10! I did some reading, and this article pointed me to the underlying issue: the Windows default timer really isn't very accurate at all. With a resolution of 16ms, it's not going to give me the kind of control required for a steady frame rate. Testing on Linux showed that it tended to be really accurate on timings, so I left the WaitUntil
logic in place - I may have to conditionally compile one path on Windows and one on everything else.
So I grabbed the spin_sleep
crate (it's tiny, so not too worried about dependency size) and it's dead on the money on Windows. CPU usage without sleeping is around 10% for hello_terminal
(way too high), around 1.5% with a spinning sleep in the loop - and 29-30FPS for both. I tested most examples, and walking
and walking_with_is_pressed
are both better - the latter still being the better way to achieve low latency inputs.
I then went and set the default FPS cap to 30, and adjusted the benchmark examples to request 120 instead - so they still go really fast.
I'm hoping this will work as well on other platforms. The timer precision explains why my original code (which had a thread sleep) was so flakey.
Happy to report that this version is also running very well in a virtual Ubuntu (what I have handy). CPU usage is way down, and frame-rate is about right (not perfect, but my VirtualBox setup on this PC is also mind-numbingly slow). My virtual hackintosh for testing is refusing to boot at all, so untested there. I really need to pickup a Mac Mini or something for testing.
Apologies I've been so quiet, work has been hectic but I've been following along and doing research/testing fix ideas on my own as well with little success.
I can say the latest 228 looks to be locked at 30 for hello_terminal, but drops 1-2 fps during mouse movement/input, so nearly perfect. Also, the input_harness is locked at about 30, but also handling keyboard inputs without noticeable latency.
Finally, and almost as importantly as the input latency, CPU usage is around 2%, down from 10% previously, so great work and thank you so much for helping with this mind-bending issue @thebracket @singalen ! 👍
Thank you! (And don't worry about being quiet; I have long periods of work keeping me busy). I'm not sure what I can about the small drop from input (downside of sleeping, the whole input thread is waiting - so it doesn't know to wake up for events and processes them all at once when tick-time arrives) - so I'll call this good enough for now and start getting it merged into the mainline. I have quite a few other issues I want to look at before rolling out the next crate version.
Thanks for all your help, glad we've got the CPU under control!
Ok, this is merged and passing all the tests I threw at it. Most importantly, the Hands-on Rust and roguelike tutorials work with it. In particular, the final dungeon crawler example from Hands-on Rust drops CPU usage from 12% (I have 8 cores, so basically pegging one core) to 1.5%.
I'll go ahead and call this a win and close the issue. :-)
Looks like on Mac my game is running the main loop as fast as it can. What can I do about it? How can I help to diagnose it? Ideally, it should only redraw in response to keyboard and, optionally, mouse events.