bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0

Improve Picking Query Expressiveness #16118

Open aevyrie opened 4 days ago

aevyrie commented 4 days ago

Following the discussion started here: https://github.com/bevyengine/bevy/issues/16065#issuecomment-2438900604

@cart @NthTensor I've rephrased the discussion in my own words, as an attempt to better understand it. Please correct me if I've misrepresented something!

Context and Current State

bevy_picking has an opinion of what "picking" means. Specifically, it means coalescing all hit tests across all entities into a view of what is directly below each pointer, plus the ability for entities to allow hits to pass through to lower levels when determining which entities are hovered. Generally speaking, only the topmost entity under a pointer is considered "hovered", regardless of how many entity hits may have been reported by all backends for this pointer - unless that entity allows picks to pass through to lower entities.

This definition broadly follows the way UIs reason about picking, but applies it to all entities (UI, 3d, 2d, etc). While this covers most use cases, especially for 2D/UI, it lacks the expressiveness of something like a general purpose raycast.

The primary limitations lie in the logic used to decide which entities count as hovered, and which entities block the entities below them from being hovered.

Currently, this is done solely with the optional PickingBehavior component. The component provides two axes of control:

  1. should_block_lower: does this entity block things below it from being hovered?
  2. is_hoverable: is this entity itself hoverable, i.e. will it emit and trigger events when hovered?

This behavior is decided solely by the entities themselves; it cannot depend on, say, the state of the application.
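
For reference, attaching the component looks roughly like this (a minimal sketch; the field names match the description above, but the exact module path is assumed from the current bevy_picking code and may shift):

```rust
use bevy::picking::PickingBehavior;
use bevy::prelude::*;

// Minimal sketch of the two axes of control described above.
fn setup(mut commands: Commands) {
    commands.spawn((
        Sprite::default(),
        PickingBehavior {
            // Let picks pass through to whatever is rendered underneath...
            should_block_lower: false,
            // ...but still report hover events for this entity itself.
            is_hoverable: true,
        },
    ));
}
```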

mod_picking Architecture

A brief review of the architecture of the picking plugin, for reference during discussion:

  1. Input: Input plugins spawn and update pointers based on winit events.
    • Inputs can be sourced from anything - not just winit. This allows for virtual pointers controlled by anything, such as gamepads.
  2. Backends: Picking backends look at pointer locations and emit events describing which entities are under each pointer. A backend may report every entity it intersects, or it may early-exit by consulting the PickingBehavior component for performance reasons (the existing raycasting backends do this). A minimal backend sketch follows this list.
    • Backends can also ignore pointers entirely, and report hits based on arbitrary inputs, like VR controllers.
  3. Focus: The focus system reads all incoming pointer hit events from backends and, for each pointer, sorts them by hit depth and by order (layer), the latter usually taken from Camera::order. This ensures that hits are ordered to match the order in which entities are rendered on screen. This data is used to build:
    • OverMap: maps pointers to all entities that they hit (as reported by backends), sorted in order.
    • HoverMap: filters the OverMap by traversing the hit entities top-to-bottom, and following the blocking/hovering logic defined in each entity's PickingBehavior component, halting as soon as a blocking entity is hit.
  4. Frontend: Every frame the HoverMap is copied to the PreviousHoverMap. The event system then looks at these two maps as the authoritative picking state to determine what events to send. If an entity was hovered in the previous frame, but is absent this frame, we know to send a Pointer<Out> event.
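
To make step 2 concrete, a deliberately naive backend might look roughly like the following. The PointerHits/HitData shapes and module paths are written from memory of the current bevy_picking backend API and should be treated as an approximation; ToyPickable is a made-up marker component:

```rust
use bevy::picking::backend::{HitData, PointerHits};
use bevy::picking::pointer::{PointerId, PointerLocation};
use bevy::prelude::*;

/// Made-up marker for entities this toy backend reports hits for.
#[derive(Component)]
struct ToyPickable;

/// Naive backend: for every pointer and camera, report a hit on every marked
/// entity at depth 0.0. A real backend would test the pointer's location
/// against actual geometry and (ideally) early-exit using PickingBehavior.
fn toy_backend(
    pointers: Query<(&PointerId, &PointerLocation)>,
    cameras: Query<(Entity, &Camera)>,
    targets: Query<Entity, With<ToyPickable>>,
    mut output: EventWriter<PointerHits>,
) {
    for (pointer_id, _location) in &pointers {
        for (camera_entity, camera) in &cameras {
            let picks: Vec<(Entity, HitData)> = targets
                .iter()
                .map(|entity| (entity, HitData::new(camera_entity, 0.0, None, None)))
                .collect();
            // `order` lets the focus pass sort hits across backends/cameras.
            output.send(PointerHits::new(*pointer_id, picks, camera.order as f32));
        }
    }
}
```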

The general consensus is that this model and definition of picking are sufficient and should be retained. However, it would be nice if we could reuse some of these abstractions, while extending their functionality to enable more expressive queries for what entities are under each pointer, and what events to trigger.

Performance

One of the constraints with making these queries more expressive is performance. The current raycasting backend, for example, examines the PickingBehavior component of each entity it hits, front-to-back, as a significant performance optimization: it can stop the traversal as soon as a blocking entity is reached. If we were to naively intersect all entities and filter them later, raycasts would become dramatically more expensive.

Improvement Discussion

The point of discussion is then: how can we modify these existing tools to allow for more complex queries and interaction events, in addition to the global coalesced state defined in the HoverMap?

Quoting @cart:

My "3rd scenario" is "when people want to ask questions about what is under their mouse (and what was under their mouse) that the global picking sort behavior and individual entity configuration cannot answer (edit: or that a single configuration of that system cannot answer within a given frame) , how can we make that as efficient and ergonomic as possible, and potentially reuse infrastructure that has already been built". Some various (not necessarily related) ideas:

  1. bevy_picking is already getting fed raw (unfiltered) hits from backends via PointerHits events. This is a great design that enables us to build on top of it. When we have a "new" question to ask, rather than recompute this data again via new raycasts, we could instead consume that event feed and then filter it down based on arbitrary criteria (and perhaps filter down to only the backends or entities we want to consider).
  2. bevy_picking computes a (private / Local) Overmap from the PointerHits events. To my knowledge this contains everything necessary to compute arbitrary over/out events (including new arbitrary derived "filtered" over/out results with different criteria, in the style of the current HoverMap). We could discuss making this (or something like it) public.
  3. We could consider generalizing the PointerHits -> Overmap -> SOME_FILTERING_PROCESS -> Event Triggers system in such a way that allows someone to define some high level PointerQueryPlugin::<T: PointerQuery>::new() that would generate over/out (and perhaps down/up) event triggers based on some PointerQuery impl. Where T could be something like "filter to entities with Rarity, sort by Rarity, and select the 3 rarest entities". The user just defines the filters and selectors and the query does the remaining tracking stuff. In/out events would be fired based on what has just become one of the 3 rarest entities under the cursor (or what has just stopped being one of those entities).
  4. We could consider doing something like (3), but generalized to "arbitrary raycasts", not "cursor raycasts".
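
To illustrate point (3), a PointerQuery impl for the "3 rarest entities" example could look something like this. This is purely hypothetical: the trait, its method, and the Rarity component are invented names, not existing bevy_picking API:

```rust
use bevy::prelude::*;

/// Hypothetical trait in the spirit of point (3); nothing like this exists yet.
trait PointerQuery: Send + Sync + 'static {
    /// Reduce the raw hits under a pointer to the entities this query selects,
    /// given read access to the world for filtering/sorting criteria.
    fn select(&self, hits: &[Entity], world: &World) -> Vec<Entity>;
}

/// Invented component used for the "rarity" example.
#[derive(Component)]
struct Rarity(u32);

/// "Filter to entities with Rarity, sort by Rarity, select the 3 rarest."
struct ThreeRarest;

impl PointerQuery for ThreeRarest {
    fn select(&self, hits: &[Entity], world: &World) -> Vec<Entity> {
        let mut rare: Vec<(u32, Entity)> = hits
            .iter()
            .filter_map(|&e| world.get::<Rarity>(e).map(|r| (r.0, e)))
            .collect();
        // Highest rarity first.
        rare.sort_by(|a, b| b.0.cmp(&a.0));
        rare.into_iter().take(3).map(|(_, e)| e).collect()
    }
}
```

A generic PointerQueryPlugin::<ThreeRarest> would then diff this selection frame-to-frame and fire over/out style triggers as entities enter or leave the selected set.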
aevyrie commented 4 days ago

Off the cuff, my initial thought is that performance might be one of the major constraints here. The naive solution is to hit test everything, and let the plugin sort/filter the raw stream, and allow users to query that raw stream. This was one of the original goals of the current design! However, I found that in some cases, this is either too slow (raycasting) or impossible (shader picking). In practice, the raycasting picking backends use this knowledge to early exit and avoid reporting hits when they aren't needed for picking.

There might be a solution that builds on top of pointer inputs, like picking backends, but allows passing in queries to run when the "general raycast backend" shoots rays from the pointer. This could be a standard interface that applicable picking backends could opt into. For example, the raycasting backends might do an early exit picking raycast for perf reasons, in addition to other raycasts as described by requests to the "general pointer raycasting plugin".

aevyrie commented 4 days ago

Proposal: PointerQuery extension for backends
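
Roughly, the idea: let users submit per-pointer queries that opted-in backends evaluate during the single ray traversal they already perform, stopping each query as soon as it reports it is done. A purely illustrative sketch with invented names (nothing here exists in bevy_picking):

```rust
use bevy::picking::pointer::PointerId;
use bevy::prelude::*;

/// What a query asks the backend to do after being shown a hit.
enum QueryControl {
    /// Keep feeding this query hits further along the ray.
    Continue,
    /// The query has what it needs; stop feeding it (and, if nothing else
    /// wants deeper hits, the backend can stop the traversal early).
    Finish,
}

/// A user-submitted query over the front-to-back hits of one pointer's ray.
struct PointerRayQuery {
    /// The pointer whose ray this query wants to observe.
    pointer: PointerId,
    /// Called once per hit, in front-to-back order, with the entity and depth.
    on_hit: Box<dyn FnMut(Entity, f32) -> QueryControl + Send + Sync>,
}

/// Queries submitted this frame; an opted-in backend drains these while it
/// performs its usual picking raycast.
#[derive(Resource, Default)]
struct PointerRayQueries(Vec<PointerRayQuery>);
```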

This would avoid many performance issues: only a single ray traversal is needed, and each query runs only for as long as it is still interested in reading results.

This solution would also allow any 3rd party to add support. For example, this would be a simple addition on top of the existing mesh picking backends, and should be just as easy to add for the downstream rapier and avian physics raycasters.

Having a "retained" interface like this makes a lot of sense to me, because the raycast itself happens at a fixed point in the schedule, in order to batch all queries for the same pointer input that frame. This is as opposed to being able to immediately request and evaluate these queries at any time in the schedule.

Open Questions

  1. Is this the right interface?
  2. Does this meet the needs @cart had in mind? This is maybe more flexible than the solutions proposed, but does less for you.
  3. Do commands allow for triggering arbitrary observers with this interface?
  4. Is this still a performance footgun? Users could easily write queries that never early exit the raycast.
  5. Would it be easier and more efficient to just use RayMap to do pointer raycasts with your engine of choice?
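
For question 5, the do-it-yourself route would look roughly like this. The RayMap accessor and RayId fields are written from memory and may differ between releases, and my_raycast is a stand-in for whatever raycaster you prefer (MeshRayCast, avian, rapier, ...):

```rust
use bevy::picking::backend::ray::RayMap;
use bevy::prelude::*;

/// Sketch: pull the per-(camera, pointer) rays bevy_picking already builds and
/// run your own raycasts against them, bypassing backends entirely.
fn custom_pointer_raycasts(ray_map: Res<RayMap>) {
    for (ray_id, ray) in ray_map.map().iter() {
        if let Some((entity, distance)) = my_raycast(*ray) {
            info!("pointer {:?} hit {entity:?} at depth {distance}", ray_id.pointer);
        }
    }
}

/// Stand-in for an engine-specific raycast (mesh, physics, etc.).
fn my_raycast(_ray: Ray3d) -> Option<(Entity, f32)> {
    None
}
```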
alice-i-cecile commented 20 hours ago

https://github.com/bevyengine/bevy/issues/15287 could be used to help improve the ergonomics of picking observers, by making it very easy to filter for entities that they trigger on.