Implementing Hit Test for touchscreen, controllers and moving rays

De-Panther commented 4 years ago

I hope that this is the place to ask about this. I'm trying to implement Hit Test in a WebXR Export in Unity. (It's an unofficial exporter; I don't work at Unity Tech and I'm working on it in my spare time.)

My end goal is a system that can work both with rays from a point on the display and with a ray from a point in the virtual world (mostly from a controller, but it might be from a moving virtual element), similar to Unity's AR Foundation (Unity's unified API for AR devices).

public bool Raycast(Vector2 screenPoint, List<ARRaycastHit> hitResults, TrackableType trackableTypeMask = TrackableType.All);
public bool Raycast(Ray ray, List<ARRaycastHit> hitResults, TrackableType trackableTypeMask = TrackableType.All, float pointCloudRaycastAngleInDegrees = 5f);

https://docs.unity3d.com/Packages/com.unity.xr.arfoundation@4.1/manual/index.html#ray-casting Those methods return results on the same frame.

I read the Hit Test Explainer and the Hit Test Module draft.

In WebXR, as far as I understand: The results are async. I can raycast from a point on the display, using the viewer XRSpace and offsetRay. I can raycast from a point in the virtual world, using an XRSession reference XRSpace (e.g. local-floor) and offsetRay.
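A minimal sketch of the two cases described above, assuming a session created with the 'hit-test' feature. The function names and parameters are illustrative assumptions, not part of the spec or of the exporter; only requestReferenceSpace, requestHitTestSource and XRRay are WebXR API.

```javascript
// Sketch only: function names and parameters are hypothetical.

// Ray attached to the viewer. Without an offsetRay, the ray starts at the
// origin of the given space and points along its -Z axis.
async function requestViewerHitTestSource(session, offsetRay) {
  const viewerSpace = await session.requestReferenceSpace('viewer');
  const options = { space: viewerSpace };
  if (offsetRay) options.offsetRay = offsetRay;
  return session.requestHitTestSource(options);
}

// Ray from a fixed point in the virtual world, expressed relative to a
// reference space (e.g. one obtained with requestReferenceSpace('local-floor')).
async function requestWorldRayHitTestSource(session, referenceSpace, origin, direction) {
  // XRRay takes an origin point and a direction vector, e.g.
  // { x: 0, y: 1, z: 0 } and { x: 0, y: 0, z: -1 }.
  const offsetRay = new XRRay(origin, direction);
  return session.requestHitTestSource({ space: referenceSpace, offsetRay });
}
```

Mapping a 2D screen point to an offsetRay in viewer space is not shown here; it depends on the view's projection and is left out of this sketch.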

What I'm not sure about:

If the developers want the users to be able to drag their fingers on a touchscreen and the hitTestSource to follow, can they use requestHitTestSourceForTransientInput which will return input sources that can be followed?
Can it also be used for the controllers in an XRSession reference XRSpace?
And what if the developers want to get hits of a moving ray in the virtual world coordinates? (e.g. a laser beam that moves up and down and may hit a real wall)

I guess that they can use requestHitTestSourceForTransientInput or requestHitTestSource for a static source point in the session's space, but if the point moves, they'll have to request a new HitTestSource every frame.

Thanks

klausw commented 4 years ago

@bialpio , can you elaborate on this?

If the developers want the users to be able to drag their fingers on a touchscreen and the hitTestSource to follow, can they use requestHitTestSourceForTransientInput which will return input sources that can be followed?

That should work, the XRTransientInputHitTestResult has an inputSource attribute corresponding to the touching finger, with a targetRaySpace.
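A minimal sketch of that flow, assuming a session with the 'hit-test' feature and a reference space the caller already holds. The helper names are hypothetical; requestHitTestSourceForTransientInput, getHitTestResultsForTransientInput, and the inputSource and results attributes come from the Hit Test Module.

```javascript
// Sketch only: helper names are hypothetical.

// Subscribe once. 'generic-touchscreen' is the input profile name used for
// screen touches.
function requestTouchHitTestSource(session) {
  return session.requestHitTestSourceForTransientInput({ profile: 'generic-touchscreen' });
}

// Call from the XR animation frame callback while the subscription is active.
// Results are grouped per input source, so each touch point has its own list.
function collectTouchHits(frame, transientHitTestSource, referenceSpace) {
  const hits = [];
  for (const perInput of frame.getHitTestResultsForTransientInput(transientHitTestSource)) {
    for (const result of perInput.results) {
      const pose = result.getPose(referenceSpace); // may be null
      if (pose) hits.push({ inputSource: perInput.inputSource, pose });
    }
  }
  return hits;
}
```

While no finger is touching the screen, the outer list is expected to be empty, so collectTouchHits simply returns an empty array.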

Can it also be used for the controllers in an XRSession reference XRSpace?

The transient input variant isn't intended to be used with controllers. For those, you'd use normal requestHitTestSource, using the appropriate known space (typically controller grip space) in its init args. The transient version is needed because the input source (and space) for that doesn't exist until the touch happens, so you can't supply the input source at initialization.

And what if the developers want to get hits of a moving ray in the virtual world coordinates? (e.g. a laser beam that moves up and down and may hit a real wall)

I'm not aware of a direct way to do synchronous hit testing for an arbitrary virtual-world ray origin that's moving independently of tracked XR spaces. The existing hit test variants are based on capabilities of the underlying device APIs such as ARCore, and it's not feasible to get arbitrary data synchronously from Blink since it would require a cross-process call to the device process.

Requesting a new hit test source every frame should be fine as far as performance is concerned. If the virtual-world object is moving based on a timed animation, you could use prediction to compensate for the latency introduced by the async frame delay.

bialpio commented 4 years ago

What I'm not sure about:

If the developers want the users to be able to drag their fingers on a touchscreen and the hitTestSource to follow, can they use requestHitTestSourceForTransientInput which will return input sources that can be followed?

Correct, as long as the user is touching the screen, if the application has subscribed to a hit test for transient input source of the matching type, it will get a list of results grouped by input source that was used to compute them (so for multi-touch screens, each touch point will have its own list of results).

Can it also be used for the controllers in an XRSession reference XRSpace?

I'm not sure if I understood this part. You can subscribe to a hit test using controller's XRSpace (this works for "non-transient" input sources, controllers fall into that category) via XRSession.requestHitTestSource({space: controller.targetRaySpace}). Does that help or am I missing something?
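A minimal sketch of subscribing per controller, assuming a session with the 'hit-test' feature. The function name is hypothetical, and filtering on targetRayMode === 'tracked-pointer' is just one assumed way of picking out controllers.

```javascript
// Sketch only: the function name and the filtering rule are assumptions.
// Returns a Map from each tracked-pointer input source to its hit test source.
async function requestControllerHitTestSources(session) {
  const sources = new Map();
  for (const inputSource of session.inputSources) {
    // Skip gaze and screen-based inputs; keep hand-held controllers.
    if (inputSource.targetRayMode !== 'tracked-pointer') continue;
    const hitTestSource = await session.requestHitTestSource({
      space: inputSource.targetRaySpace,
    });
    sources.set(inputSource, hitTestSource);
  }
  return sources;
}
```

Because the hit test source is tied to the controller's targetRaySpace, it follows the controller without having to be re-requested each frame.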

And what if the developers want to get hits of a moving ray in the virtual world coordinates? (e.g. a laser beam that moves up and down and may hit a real wall)

As Klaus mentioned, there's not really a good way to achieve that (creating a subscription every frame could be a workaround).

De-Panther commented 4 years ago

@klausw

Can it also be used for the controllers in an XRSession reference XRSpace?

The transient input variant isn't intended to be used with controllers. For those, you'd use normal requestHitTestSource, using the appropriate known space (typically controller grip space) in its init args. The transient version is needed because the input source (and space) for that doesn't exist until the touch happens, so you can't supply the input source at initialization.

But controllers are also inputSource, using requestHitTestSourceForTransientInput won't return them as an inputSource?

Requesting a new hit test source every frame should be fine as far as performance is concerned. If the virtual-world object is moving based on a timed animation, you could use prediction to compensate for the latency introduced by the async frame delay.

So the process would be something like this?

// ... get hitTestSource for the first time

// ... in rAF, check if there are hit results
if (hitTestSource) {
  let hitTestResults = xrFrame.getHitTestResults(hitTestSource);
  if (hitTestResults.length > 0) {
    // ... do something with results
    storeResultsAndSendToTheFramework();
    // cancel old hitTestSource
    hitTestSource.cancel();
    // obtain new hitTestSource ... xrSession.requestHitTestSource(hitTestOptionsInit) ...
    getNewHitTestSource();
    // ...
  }
}

@bialpio

I'm not sure if I understood this part. You can subscribe to a hit test using controller's XRSpace (this works for "non-transient" input sources, controllers fall into that category) via XRSession.requestHitTestSource({space: controller.targetRaySpace}). Does that help or am I missing something?

I think that I understand that now. I thought that controllers are also "transient" like touchscreen.

Thanks

klausw commented 4 years ago

But controllers are also inputSource, using requestHitTestSourceForTransientInput won't return them as an inputSource?

No, the WebXR API distinguishes between transient input sources that only appear temporarily while events are actively happening, such as screen touches, and regular input sources such as controllers that are longer-lived and exist even when not actively creating XR input events.

A motion controller wouldn't be considered transient. Even though regular controllers may be added or removed during a session, i.e. as part of hardware being connected or disconnected, that doesn't make them transient inputs according to the spec definition for that.

See https://immersive-web.github.io/webxr/#transient-input for more detail.

So the process would be something like this?

At first glance, this looks plausible. You'll want to do the cancel/getNewHitTestSource outside the hitTestResults.length > 0 conditional, otherwise the ray source stops moving if it fails to hit real world geometry.

De-Panther commented 4 years ago

No, the WebXR API distinguishes between transient input sources that only appear temporarily while events are actively happening, such as screen touches, and regular input sources such as controllers that are longer-lived and exist even when not actively creating XR input events.

A motion controller wouldn't be considered transient. Even though regular controllers may be added or removed during a session, i.e. as part of hardware being connected or disconnected, that doesn't make them transient inputs according to the spec definition for that.

See https://immersive-web.github.io/webxr/#transient-input for more detail.

Thanks, got it.

At first glance, this looks plausible. You'll want to do the cancel/getNewHitTestSource outside the hitTestResults.length > 0 conditional, otherwise the ray source stops moving if it fails to hit real world geometry.

Would xrFrame.getHitTestResults(hitTestSource) return a value when it didn't get a result? If I ask for getHitTestResults on the same frame that I asked for the hitTestSource, I might not get a result during the session. Or maybe, because I need to call requestHitTestSource, it can be something like:

requestHitTestSource

... xrFrame ...
no hitTestSource ... still obtaining it
... xrFrame end ...

requestHitTestSource returned hitTestSource

... xrFrame ...
use hitTestSource ... there's also result in getHitTestResults
cancel old hitTestSource
requestHitTestSource again ...

klausw commented 4 years ago

The spec just says that getHitTestResults returns an array of XRHitTestResults, and only throws errors if the hit test source is not present in the map. An empty array of results would be valid, and you should expect to get an empty array back. The compute hit test results for transient input algorithm includes the initialization "Let hitTestResults be an empty list", similarly for transient input.

@bialpio , assuming I'm not misunderstanding it, would it be worth clarifying in the spec or a note that the returned result array may be empty?

De-Panther commented 4 years ago

I thought so, this is why the if (hitTestResults.length > 0). Just wasn't sure if it's even possible to get more than zero results on the first frame that we have hitTestSource.

klausw commented 4 years ago

The if (hitTestResults.length > 0) check is appropriate for processing results, though in practice you can often get the same effect by just iterating over the results, since that also does nothing if there are no results.

However, if you want a moving hit test source where you update the ray origin on every frame, you'd want to do that unconditionally, including if you got zero results. The snippet from your earlier code didn't do that as far as I can tell, so the hit test source would stop getting updated if at any point the hit test didn't hit any real-world geometry.
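A minimal sketch of that unconditional update, assuming a session with the 'hit-test' feature. createMovingRayTracker, makeOptions and onResults are hypothetical names: makeOptions is expected to return the init options (space plus offsetRay) for the ray's current position, and onResults receives the non-empty results.

```javascript
// Sketch only: names are hypothetical, not part of WebXR or the exporter.
function createMovingRayTracker(session, makeOptions, onResults) {
  let current = null;  // resolved XRHitTestSource, if any
  let pending = false; // true while a request is still in flight

  // Call once per XR animation frame with the XRFrame.
  function onFrame(frame) {
    if (current) {
      const results = frame.getHitTestResults(current);
      if (results.length > 0) onResults(results);
      // Cancel and re-request regardless of whether anything was hit,
      // so the ray keeps moving even when it misses real-world geometry.
      current.cancel();
      current = null;
    }
    if (!pending) {
      pending = true;
      session.requestHitTestSource(makeOptions()).then(
        (source) => { current = source; pending = false; },
        () => { pending = false; } // request failed; retry on the next frame
      );
    }
  }

  return { onFrame };
}
```

With this shape, results always describe the ray as it was when the request was made, so they lag the virtual object by at least one frame.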

De-Panther commented 4 years ago

True. But is it even possible to get a result on the first frame that we got an XRHitTestSource?

klausw commented 4 years ago

But is it even possible to get a result on the first frame that we got an XRHitTestSource?

If I'm reading the spec right, that's not possible. The requestHitTestSource algorithm says "Add compute all hit test results algorithm to session’s list of frame updates if it is not already present there", and WebXR's run-animation-frame algorithm does the apply frame updates step before running animation frame callbacks. So a compliant implementation wouldn't produce results on the same frame that an XRHitTestSource was added for the first time.

De-Panther commented 4 years ago

Thanks :) That's not ideal, but I have some ideas for workarounds.

bialpio commented 4 years ago

But is it even possible to get a result on the first frame that we got an XRHitTestSource?

If I'm reading the spec right, that's not possible. The requestHitTestSource algorithm says "Add compute all hit test results algorithm to session’s list of frame updates if it is not already present there", and WebXR's run-animation-frame algorithm does the apply frame updates step before running animation frame callbacks. So a compliant implementation wouldn't produce results on the same frame that an XRHitTestSource was added for the first time.

Depends on what you mean by "first frame". Creating a hit test source returns a promise - as soon as that promise gets resolved, you can assume that the results may be available in a subsequent requestAnimationFrame callback (but, as already noted, they may be empty). The thing is, if you have requested a hit test source within rAFcb for frame_1, the promise you got can only get resolved after the rAF callbacks for that frame have all been executed, so at the earliest, you will be able to query for hit test results in rAFcb for frame_2 using the hit test source.

@klausw, seems like we agree that it is not possible to ask for a hit test source and use it in the same rAF callback, but each of us has a different reason for saying that. :) To me it suggests there's a gap in spec text that could be made more precise so that it's more clear what the intent was - does my explanation above make sense? If so, I'll add a note to make the implications of some algorithm steps more clear.

De-Panther commented 4 years ago

Depends on what you mean by "first frame". Creating a hit test source returns a promise - as soon as that promise gets resolved, you can assume that the results may be available in a subsequent requestAnimationFrame callback (but, as already noted, they may be empty). The thing is, if you have requested a hit test source within rAFcb for frame_1, the promise you got can only get resolved after the rAF callbacks for that frame have all been executed, so at the earliest, you will be able to query for hit test results in rAFcb for frame_2 using the hit test source.

Now I'm confused again 😅

Frame 1 asked for requestHitTestSource. In frame 2 the promise returned and I have HitTestSource. Is it even possible that there's at least one result on frame 2, or should I just skip the check on frame 2 and check on frame 3?

I understand that it's possible that even frames 3 and above would return 0 results. But is it possible that frame 2 would return one and not 0?

Thanks

bialpio commented 4 years ago

Depends on what you mean by "first frame". Creating a hit test source returns a promise - as soon as that promise gets resolved, you can assume that the results may be available in a subsequent requestAnimationFrame callback (but, as already noted, they may be empty). The thing is, if you have requested a hit test source within rAFcb for frame_1, the promise you got can only get resolved after the rAF callbacks for that frame have all been executed, so at the earliest, you will be able to query for hit test results in rAFcb for frame_2 using the hit test source.

Now I'm confused again

Frame 1 asked for requestHitTestSource. In frame 2 the promise returned and I have HitTestSource. Is it even possible that there's at least one result on frame 2, or should I just skip the check on frame 2 and check on frame 3?

Note that the promise does not return "in frame 2". The more detailed sequence of steps in your case would be:

Frame 1 asked for requestHitTestSource. Frame 1's request animation frame callbacks finish running.
The promise returned and you now have the hitTestSource - this is happening outside of request animation frame callbacks, so I would not describe it as happening in any particular frame.
Frame 2 started - you should now be able to query the results using hitTestSource, and that may already be a non-empty array.

(the above is still slightly simplified, for example, it may take longer for the promise to return)
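The sequence above can be sketched roughly as follows, assuming a session with the 'hit-test' feature and some XRSpace to cast from. startHitTestLoop and onResults are hypothetical names used only for illustration.

```javascript
// Sketch only: startHitTestLoop and onResults are hypothetical names.
function startHitTestLoop(session, space, onResults) {
  let hitTestSource = null;
  let requested = false;

  function onXRFrame(time, frame) {
    session.requestAnimationFrame(onXRFrame);

    if (!requested) {
      // "Frame 1": ask for the hit test source. The promise can only be
      // resolved after this frame's callbacks have finished running.
      requested = true;
      session.requestHitTestSource({ space })
        .then((source) => { hitTestSource = source; })
        .catch((err) => console.warn('requestHitTestSource failed', err));
      return;
    }

    // Later frames: the promise may still be unresolved, so keep waiting.
    if (!hitTestSource) return;

    // "Frame 2" at the earliest: the array may be empty or non-empty.
    onResults(frame.getHitTestResults(hitTestSource));
  }

  session.requestAnimationFrame(onXRFrame);
}
```

Here onResults is called on every frame once the source exists, including with an empty array, so the caller can tell "no hit this frame" apart from "source not ready yet".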

I understand that it's possible that even frames 3 and above would return 0 results. But is it possible that frame 2 would return one and not 0?

Thanks

Yes, frame 2 can already have non-empty results, depending on the timings. As soon as you have a hit test source, you should be able to get non-empty results during request animation frame callbacks. If that's never the case, there may be some issue with the implementation we have.

De-Panther commented 4 years ago

Great. So I'll rely on that for now. Thanks again

immersive-web / hit-test

Implementing Hit Test for touchscreen, controllers and moving rays #93