Specify policy on panics in the Bevy repo.

tbillington commented 3 days ago

What problem does this solve or what need does it fill?

Bevy currently does not have a policy or documentation on the usage of panics in Rust code.

Because unhandled errors (panics) in Bevy result in the termination of the application, they are very harmful to the user experience. A panic in Rust signals the application has entered into an unrecoverable error state, and application termination is the only viable course of action. This is reinforced in the Rust book.

The panic! macro signals that your program is in a state it can’t handle and lets you tell the process to stop instead of trying to proceed with invalid or incorrect values.

Panicing is especially sensitive in a library like Bevy since it takes away choice from the user about how to handle a particular situation. To Bevys credit, it often does give the user a choice, such as Query::get_single which returns a Result, as opposed to Query::single, which is good to see.

Unhandled error fallout

Since Bevy is based upon writing game logic in Rust at the same "level" as core engine code, and there is no separate runtime like engines such as Unity or Godot, unhandled errors due to either user error or engine library error are much more disruptive. With runtime based engine, an unhandled error in user code or much of the engine code will result in an exception being raised, and a error being logged, but otherwise execution of the game will continue relatively uninterrupted.

That resilience and gracefully handling of errors is incredibly useful to game developers. Not only are games often produced under time, budget, and team constraints, but they may also be worked on by large teams with people of varied skill sets that may not know all of the invariants that are supposed to be kept in game logic. Rapid prototyping, iteration, and changes late in development also prevent developers from producing perfect code that handles all edge cases. At their core, games are about delivering an experience for players playing on their own devices. As such, if not handled well, errors can be especially disruptive to the player experience, and ultimately to the games reception and reputation.

A crash in a Bevy game isn't just a stacktrace on the developers machine. It results in a negative experience for players, and possibly worse reviews and refunds, materially impacting the developer.

This being the case, the use of panics in Bevy should be avoided wherever possible, heavily scrutinised, and only used as a last result.

What solution would you like?

Codify a policy on the use of panics with the Bevy engine, preferable upheld by CI lints/rules.

These rules could be tighter/looser per crate, but that's just minor details at this point.

Additional context

Motivating experience

My motivation for creating this issue was a couple of recent PRs that had me concerned about the difference in expectations about the use of panics between my assumptions and current Bevy maintainer expectations. **Before continuing I want to stress that I see this as 100% an expectations and process thing, and not any kind of personal indictment!** I greatly value the contributions of everyone involved and hope this just serves to clarify expectations between people contributing to Bevy, and (at least 1) user(s) of Bevy going forward :) As part of implementing picking in Bevy, mesh picking was introduced in https://github.com/bevyengine/bevy/pull/15800. I saw in discord that more reviews were desired, and I wanted to help out by reviewing. As part of my review I encountered multiple sources of panics in the new code, which I mentioned in the review. As someone who had only done a handful of contributions/reviews to Bevy previously, I didn't want to come across too strong as I'm still learning what the Bevy code & review expectations are. While one source of panics was removed as a result, a few remained. What surprised me was that only my casual review even mentioned the panics. My assumption was that introducing panics into the engine in such user interfacing code would be more controversial. Where I became concerned that perhaps my view of the use of panics and that others was bigger than I thought was during a [follow-up issue](https://github.com/bevyengine/bevy/issues/15891) not 12 hours later, with code from one of Bevys own examples triggering a panic from the PR in a code path I hadn't even noticed or commented on. It reinforced that this new API & impl in Bevy was a very easy source of panics. To my surprise, the fix was just to correct the example, not any of the offending engine code. Of course, the incorrect example should be fixed, and was. However the inaction on the panicing code even after seeing it being hit so soon after merging was what motivated me to create this issue to better understand and clarify the use of panics in the codebase. If panics were so readily accepted in just the one PR I happened to review randomly, does that mean many more PRs introduce panics at this level I haven't seen? Of course I only have a small sample size, but that thought worried me quite a bit. I want Bevy to be a rock solid foundation for me and others to build incredible experiences on, and that includes being robust against the realities of the game development process I mentioned earlier. I want to have confidence Bevy will handle anything I throw at it, and be the last source of worries for me as a game developer. To that end, my personal expectation as a user of Bevy is that short of something catastrophic such as the user removing their graphics card mid-game, or me messing with internals via reflection, the engine should never take down the process if at all possible. Again, I have a lot of respect for everyone contributing to Bevy. I just want this issue clarified to we can all be unified going forward without needing to repeatedly discuss details based on personal opinions in every issue/PR 💖

bushrat011899 commented 3 days ago

I think this is an important piece of documentation and policy. From a no_std perspective, this is even more important, since you don't have access to catch_unwind from the std library to minimize the impact of a panicking system (for example).

I think at a minimum we should require Panics sections in the documentation of all public APIs (which is already mostly if not totally done!). I'm not sure if that's enforceable via a built-in Clippy lint, or even with the Bevy CLI. There is a no_panic crate, but it's very hard to work with in a complex codebase with many dependencies.

Perhaps we could deny the usage of public panicking functions within Bevy's crates? Or (more contentious) add a panic feature to access them?

BD103 commented 3 days ago

The Bevy Linter has a lint to deny panicking methods in Query and World. We can enforce whatever policy is decided on through this.

Atlas16A commented 3 days ago

What ever gets decided here, I would also like to add this to bevy_editor_prototypes for the sake of policy parity with bevy main.

bevyengine / bevy