Closed elexisvenator closed 8 months ago
I did some light investigation and here's what I'm aware of so far:
I found it easier to replicate these issues on Windows, although the only issue I wasn't able to replicate after more effort across platforms is the Transform/odd tile location issue which I observed on Windows and not Mac (I did not test Linux today, since Mac did replicate and that is close enough for me).
The underflow specifically is happening here. I've reproduced it for both directions that require subtraction from the "far side" (Right
and Up
). This function is only used in one place which is the board_shift
.
I need to investigate this a bit more to determine exactly why the underflow is occurring. Over time I've leaned more and more towards using signed integers for all my tile-based grids as this tends to make the math easier to work with. A checked_subtraction
could also be a good idea here to surface the error with more information. Most notably, I'd like to see tracing::error! used if this issue occurs with some additional information about why it is occurring.
I do not know why the tiles sometimes end up overlapping or getting sent off the entire screen. Its possible there's an interpolation bug or other issue in bevy_easings. Reducing the animation time from 100ms to 10ms has a great effect on increasing the frequency of the bundle insertion issue at least.
error[B0003]: Could not insert a bundle (of type \`bevy_easings::EasingComponent<bevy_transform::components::transform::Transform>`) for entity 34v0 because it doesn't exist in this World.
Initial testing seems to indicate that placing the render_tiles
system in the PostUpdate
schedule completely solves the bundle insert issue.
.add_systems(
Update,
(
render_tile_points,
board_shift,
new_tile_handler,
end_game,
)
.run_if(in_state(RunState::Playing)),
)
.add_systems(PostUpdate, render_tiles.run_if(in_state(RunState::Playing)))
continuing that thread, placing the render_tiles
system in the next schedule says to me that apply_deferred
needs to run before rendering. This makes sense since apply_deferred
is where Commands
are applied, and that happens between Schedule
s like this by default. I have confirmed this by using apply_deferred
and manually chaining the systems with it in between.
((
render_tile_points,
board_shift,
new_tile_handler,
end_game,
),
apply_deferred,
render_tiles
).chain()
Its notable that the documentation for PostUpdate
changed in 0.11 and now includes the following phrasing which makes me believe that the right approach in a "post-schedule v3" world is to put this rendering in PostUpdate
since it is responding to the tile updates in Update
.
PostUpdate exists to do “engine/plugin response work” to things that happened in Update.
This also looks like it is changing in 0.13 to be automatic based on system parameters (ie: Commands
), so we'll have to keep an eye on it for the next iteration of the workshop to see if updates are appropriate.
This is almost certainly an ordering issue. The logic for has_move
requires that the tiles available in the query are the same as the tiles on the board, which isn't necessarily true if there are despawn
commands that are pending.
No matter exactly how it is occurring, the end-game behavior is a fantastic example to use for showing how to test bevy apps and systems. It is both tedious to do manually and only occurs in certain conditions after a longer period of play.
The changes I'd like to make moving forward so far are:
render_tiles
to PostUpdate
and explain why in the relevant lesson. This will likely end up being in the same lesson as adding bevy_easings since the bundle issue only shows up at that time.I have more investigation to do on the underflow issue specifically, as I spent most of my time on the others here.
This sounds like typical concurrency headaches that I was hoping that a system like bevy would largely hide away from us
As far as I can tell, these are logic bugs and the system ordering issues should likely be fixed. If a tile is going to be despawned, it can not then be used in an "end game" calculation before it is despawned (but after the Position component updated) and should not have an "easing bundle" applied to move it around. I can't find a reference to link to, but my current understanding is that a Component update (like Position
) would be visible to the very next system that ran whereas a Command is only applied when apply_deferred runs, and that this is effectively the issue for end_game.
However! Within the last 3 months a try_insert
was added which could solve the issue for the easings bundle without system re-ordering.
I'm a bit conflicted about using methods like try_insert
in educational material specifically (in real-world bevy apps I assume this method could be required due to real-world constraints or preferences). On one hand it can be very useful for "apply only if the entity still exists" and on the other educational material should likely be built with correct system ordering, etc. I haven't made a decision yet for which direction I'm going to go with the workshop, although I'm currently leaning towards fixing the system order.
Happy New Year and thanks for the issue. I'll use this to improve the workshop. I'm planning to split this out into a couple issues after doing some more investigation.
While working on this, I had a realization about the game itself. 2048 is a game where the board tiles move then the new tiles are added. It sounds obvious typing this out but this ordering is inherent to the way the game works, so requiring the systems handling board-shifting, new tile handling, and even end-game detection to be done in-order is "how the game works" and not actually an optional feature of the implementation. This makes the original implementation in the workshop logically invalid, even though it mostly works.
Bevy has a really straightforward solution to this these days in a post Schedule v3 world: chain.
.add_systems(
Update,
(
board_shift,
new_tile_handler,
render_tile_points,
render_tiles,
end_game
)
.chain()
.run_if(in_state(RunState::Playing)),
I've confirmed that this in fact fixes all of the above issues, including a test that panics depending on system ordering fairly consistently (but not always) without chain, and never panics with .chain().
The aforementioned test looks like:
#[cfg(test)]
mod tests {
use bevy::ecs::system::CommandQueue;
use super::*;
#[test]
fn insert_bundle_on_removed_tile() {
let mut app = App::new();
let board = Board::new(4);
app.insert_resource(Board::new(4));
app.add_event::<NewTileEvent>();
app.init_resource::<Game>();
app.add_plugins((
EasingsPlugin,
bevy::time::TimePlugin,
));
app.add_systems(
Startup,
(setup, spawn_board).chain(),
);
app.add_systems(
Update,
(board_shift, render_tiles).chain(),
);
// insert tiles to set up a game
let mut command_queue = CommandQueue::default();
let mut commands =
Commands::new(&mut command_queue, &app.world);
spawn_tile(
&mut commands,
&board,
Position { x: 0, y: 0 },
);
spawn_tile(
&mut commands,
&board,
Position { x: 0, y: 0 },
);
command_queue.apply(&mut app.world);
// move board up
let mut input = ButtonInput::<KeyCode>::default();
input.press(KeyCode::ArrowUp);
app.insert_resource(input);
// Run systems
app.update();
// move board left
let mut input = ButtonInput::<KeyCode>::default();
input.press(KeyCode::ArrowLeft);
app.insert_resource(input);
// Run systems
app.update();
// Clear the `just_pressed` status for all `KeyCode`s
app.world
.resource_mut::<ButtonInput<KeyCode>>()
.clear();
app.update();
}
}
I added a game over test and the chain functionality in https://github.com/rust-adventure/2048/commit/1687150903f99a892ec1ee971a74cc6da874029a for the bevy 0.13 workshop edition, so I'm going to close this as it is part of the next version of the workshop now.
Hi @ChristopherBiscardi
H have just completed your 2048 course and quite enjoyed it. However, I noticed that the game has a few bugs in it that seem to be related to a multithreaded race condition of some sort.
If you run the game and spam arrow keys fast, there is a chance that the game would crash. The exact error varies but is usually either an error trying to do something with an entity that doesn't exist or a numeric overflow. I have also seen other strange behaviour, such as tiles being spawned outside of the game grid, and the game falsely entering an endgame state when moving a full board with valid moves.
This issue happens with the 0.11 version of the code from the course as well as the latest 0.12 code from this repo.
I suspect this is caused by multiple systems acting on the tiles at the same time, with issues being caused by trying to act on an entity when that existed when the system was called but either doesn't exist at that point or has had it's position moved. This sounds like typical concurrency headaches that I was hoping that a system like bevy would largely hide away from us. Is there something in the design of the game that is prone to these sort of issues, or is there a potential lesson here for how to address them?
Thanks, and happy new year, Ben