bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0
36.35k stars 3.59k forks source link

Specify policy on panics in the Bevy repo. #15979

Open tbillington opened 1 month ago

tbillington commented 1 month ago

What problem does this solve or what need does it fill?

Bevy currently does not have a policy or documentation on the usage of panics in Rust code.

Because unhandled errors (panics) in Bevy result in the termination of the application, they are very harmful to the user experience. A panic in Rust signals the application has entered into an unrecoverable error state, and application termination is the only viable course of action. This is reinforced in the Rust book.

The panic! macro signals that your program is in a state it can’t handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.

Panicing is especially sensitive in a library like Bevy since it takes away choice from the user about how to handle a particular situation. To Bevys credit, it often does give the user a choice, such as Query::get_single which returns a Result, as opposed to Query::single, which is good to see.

Unhandled error fallout

Since Bevy is based upon writing game logic in Rust at the same "level" as core engine code, and there is no separate runtime like engines such as Unity or Godot, unhandled errors due to either user error or engine library error are much more disruptive. With runtime based engine, an unhandled error in user code or much of the engine code will result in an exception being raised, and a error being logged, but otherwise execution of the game will continue relatively uninterrupted.

That resilience and gracefully handling of errors is incredibly useful to game developers. Not only are games often produced under time, budget, and team constraints, but they may also be worked on by large teams with people of varied skill sets that may not know all of the invariants that are supposed to be kept in game logic. Rapid prototyping, iteration, and changes late in development also prevent developers from producing perfect code that handles all edge cases. At their core, games are about delivering an experience for players playing on their own devices. As such, if not handled well, errors can be especially disruptive to the player experience, and ultimately to the games reception and reputation.

A crash in a Bevy game isn't just a stacktrace on the developers machine. It results in a negative experience for players, and possibly worse reviews and refunds, materially impacting the developer.

This being the case, the use of panics in Bevy should be avoided wherever possible, heavily scrutinised, and only used as a last result.

What solution would you like?

Codify a policy on the use of panics with the Bevy engine, preferable upheld by CI lints/rules.

These rules could be tighter/looser per crate, but that's just minor details at this point.

Additional context

Motivating experience My motivation for creating this issue was a couple of recent PRs that had me concerned about the difference in expectations about the use of panics between my assumptions and current Bevy maintainer expectations. **Before continuing I want to stress that I see this as 100% an expectations and process thing, and not any kind of personal indictment!** I greatly value the contributions of everyone involved and hope this just serves to clarify expectations between people contributing to Bevy, and (at least 1) user(s) of Bevy going forward :) As part of implementing picking in Bevy, mesh picking was introduced in https://github.com/bevyengine/bevy/pull/15800. I saw in discord that more reviews were desired, and I wanted to help out by reviewing. As part of my review I encountered multiple sources of panics in the new code, which I mentioned in the review. As someone who had only done a handful of contributions/reviews to Bevy previously, I didn't want to come across too strong as I'm still learning what the Bevy code & review expectations are. While one source of panics was removed as a result, a few remained. What surprised me was that only my casual review even mentioned the panics. My assumption was that introducing panics into the engine in such user interfacing code would be more controversial. Where I became concerned that perhaps my view of the use of panics and that others was bigger than I thought was during a [follow-up issue](https://github.com/bevyengine/bevy/issues/15891) not 12 hours later, with code from one of Bevys own examples triggering a panic from the PR in a code path I hadn't even noticed or commented on. It reinforced that this new API & impl in Bevy was a very easy source of panics. To my surprise, the fix was just to correct the example, not any of the offending engine code. Of course, the incorrect example should be fixed, and was. However the inaction on the panicing code even after seeing it being hit so soon after merging was what motivated me to create this issue to better understand and clarify the use of panics in the codebase. If panics were so readily accepted in just the one PR I happened to review randomly, does that mean many more PRs introduce panics at this level I haven't seen? Of course I only have a small sample size, but that thought worried me quite a bit. I want Bevy to be a rock solid foundation for me and others to build incredible experiences on, and that includes being robust against the realities of the game development process I mentioned earlier. I want to have confidence Bevy will handle anything I throw at it, and be the last source of worries for me as a game developer. To that end, my personal expectation as a user of Bevy is that short of something catastrophic such as the user removing their graphics card mid-game, or me messing with internals via reflection, the engine should never take down the process if at all possible. Again, I have a lot of respect for everyone contributing to Bevy. I just want this issue clarified to we can all be unified going forward without needing to repeatedly discuss details based on personal opinions in every issue/PR 💖
bushrat011899 commented 1 month ago

I think this is an important piece of documentation and policy. From a no_std perspective, this is even more important, since you don't have access to catch_unwind from the std library to minimize the impact of a panicking system (for example).

I think at a minimum we should require Panics sections in the documentation of all public APIs (which is already mostly if not totally done!). I'm not sure if that's enforceable via a built-in Clippy lint, or even with the Bevy CLI. There is a no_panic crate, but it's very hard to work with in a complex codebase with many dependencies.

Perhaps we could deny the usage of public panicking functions within Bevy's crates? Or (more contentious) add a panic feature to access them?

BD103 commented 1 month ago

The Bevy Linter has a lint to deny panicking methods in Query and World. We can enforce whatever policy is decided on through this.

Atlas16A commented 1 month ago

What ever gets decided here, I would also like to add this to bevy_editor_prototypes for the sake of policy parity with bevy main.

ghost commented 3 weeks ago

I'd much prefer a panic feature option, I'm avoiding Single and likely going to patch bevy in order to regain ability to have panics for missing resources or other such related logic bugs(perhaps it's still possible and I glossed over some details). I was frankly bothered when I encountered a log message saying a system hasn't run due to a missing resource. At least some option to catch these errors on the user side is required so they can make the choice, again I may have glossed over some details.

tbillington commented 1 week ago

I'd say that's more of bevy_ecs's current default API rather than how bevy implements things internally (which was my original focus), but it is quite fundamental to how bevy works.

You raise a legitimate point that discovering missing system requirements at development time should be a well supported operation, and having a toggle that could be enabled during development to complain more loudly or even panic/debug_assert would help surface them.

I'm absolutely in favour of options like that and giving developers the choice and tooling that they need.

My greater point is that a shipped bevy game should be resilient to failures like these without forcing the behaviour of crashing. I think this will become especially important as development progresses, and you have a large codebase with many dependencies, ensuring every invariant of every dependency is perfectly upheld is not feasible ahead of time. My opinion is defaulting to panicing is unacceptable for a release/distribution build, but ultimately the game developer/user of bevy should have that choice.