thetuvix opened this issue 4 years ago
What I'm worried about in this proposal is that today, hit test results based on the feature points we get from ARCore are fairly jittery, so in this particular case we have one additional characteristic to consider: the quality of the results. This means that if a user navigates to a site that leverages "mesh" hit tests and it works fine on one device, then switches to Chrome on Android, the experience might be dramatically different (for the worse).
I think we can remove the "point" trackable type for now and add it back if there is an explicit need for it. The whole idea behind XRHitTestTrackableType was to allow us to change the API in a non-breaking way and to have a well-defined default behavior - if we switched to a boolean, I feel that we might not be able to change the behavior without breaking consumers of the API.
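For concreteness, here's roughly how that shape reads in the current draft - a minimal sketch, assuming an active immersive-ar session with the hit-test feature enabled and a `viewerSpace` reference space obtained earlier:

```ts
// Sketch of the draft shape: entityTypes defaults to ['plane'], giving
// the well-defined default behavior, and new enum values can be added
// later without breaking existing callers.
async function createSources(session: XRSession, viewerSpace: XRReferenceSpace) {
  // Default behavior - equivalent to entityTypes: ['plane'].
  const defaultSource = await session.requestHitTestSource({
    space: viewerSpace,
  });

  // Explicit opt-in to feature-point hit tests as well.
  const pointSource = await session.requestHitTestSource({
    space: viewerSpace,
    entityTypes: ['point', 'plane'],
  });

  return { defaultSource, pointSource };
}
```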
Makes sense to me. From what you're describing, "mesh" hit-tests on ARCore are strictly superior to "point" hit-tests, and so the real choice for developers is between "plane" and "mesh". That aligns well with how we'd implement this on HoloLens - if "point" existed, we'd probably just map it directly to "mesh" for site compatibility, which could then break sites that test first on HoloLens if they get lower-quality "point" hit-tests than expected on ARCore.
(...) "mesh" hit-tests on ARCore are strictly superior to "point" hit-tests (...)
Small clarification - in my experiments, "plane" is the one that is superior to "point" on ARCore. As far as I know, ARCore does not support meshes. I think a "point"-based hit test might be superior to a "plane"-based one when it comes to time-to-first-result, so there might be use cases where points are useful, but I have not experimented with them too extensively.
Regarding ARCore "meshes", I'd been thinking of this: https://developers.googleblog.com/2019/12/blending-realities-with-arcore-depth-api.html
The GIFs towards the bottom of that page show content being placed on or interacting with the non-planar geometrical shape of real-world objects such as trees. Specifically, the right two animations in this GIF show apps performing non-planar hit tests against complex real-world surfaces, returning a point and normal that changes each frame.
If a web developer wants to do the same thing in WebXR, they likely aren't thinking about whether the UA's underlying "ARCore Depth API" is based on "feature points", "depth points", "depth maps", "meshing", "LIDAR", etc. to power those non-planar hit-tests - instead, they just know they want to hit-test against complex non-planar surfaces rather than logical planes.
The WebXR hit-test API we ship here should enable a UA to evolve how it best fulfills "planar hit-test requests" and "non-planar hit-test requests", without web developers having to update their code from "point" to "mesh" to "depth-map" to "next-tech" over time.
Perhaps the two hit-test flavors could be "plane" and "surface", with "surface" generically representing a non-planar hit-test based on any present or future tech?
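A rough sketch of what that could look like - note that "surface" is hypothetical here (only "point", "plane", and "mesh" exist in the current enum):

```ts
// Hypothetical two-flavor version of entityTypes; 'surface' is invented
// for illustration and is not in the draft enum.
async function createFlavoredSources(session: XRSession, viewerSpace: XRReferenceSpace) {
  // Idealized planar hit-tests.
  const planeSource = await session.requestHitTestSource({
    space: viewerSpace,
    entityTypes: ['plane'],
  });

  // Non-planar hit-tests backed by whatever the device has today or
  // tomorrow: feature points, meshes, depth maps, next-tech...
  const surfaceSource = await session.requestHitTestSource({
    space: viewerSpace,
    entityTypes: ['surface'], // hypothetical value
  });

  return { planeSource, surfaceSource };
}
```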
+1 for having only two modes for the hit test results
> (...) without web developers having to update their code from "point" to "mesh" to "depth-map" to "next-tech" over time.
This is the main reason I wanted developers to be explicit (within reason) about the tech they are using - if the "next tech" is significantly different from what the application was tested with, the UA is introducing new behavior that potentially breaks applications, without giving them a way to opt out of the "new tech" they have not tested with. The way I see it, we have to expose something that is an implementation detail (the tech the UA uses to perform the hit test) to web applications. Otherwise, we have to either stick with the tech we used at launch, or assume that new tech is always strictly better than old tech - and even that does not guarantee that apps did not rely on some specific behavior characteristic of an older tech.
As for the specific example of ARCore's depth API, I'd assume that a separate "depth-map" entity type is not needed, as this falls under "mesh" - the hit test behavior in this case should not be that distinguishable from one where the UA uses some mesh representation of the world, and a decent (?) mesh can be synthesized from its depth map.
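To illustrate why that synthesis is plausible: a UA holding only a depth map can already produce the point and normal a "mesh" hit test needs. A sketch under stated assumptions - none of these names are spec APIs, and pinhole camera intrinsics (fx, fy, cx, cy) are assumed given:

```ts
// Illustrative only - how a UA could answer a "mesh"-style hit test from
// a depth map: unproject the depth sample to a 3D point and estimate a
// normal from neighboring samples (the same triangle a synthesized
// per-pixel mesh would contain).
interface Intrinsics { fx: number; fy: number; cx: number; cy: number; }

// Back-project pixel (u, v) with metric depth into camera space.
function unproject(u: number, v: number, depth: number, k: Intrinsics) {
  return {
    x: ((u - k.cx) / k.fx) * depth,
    y: ((v - k.cy) / k.fy) * depth,
    z: depth,
  };
}

// Estimate the surface normal at (u, v) from adjacent depth samples.
function normalFromDepth(
  depthAt: (u: number, v: number) => number,
  u: number,
  v: number,
  k: Intrinsics,
) {
  const p = unproject(u, v, depthAt(u, v), k);
  const px = unproject(u + 1, v, depthAt(u + 1, v), k);
  const py = unproject(u, v + 1, depthAt(u, v + 1), k);
  // Normal = normalized cross product of the two tangent vectors.
  const ax = px.x - p.x, ay = px.y - p.y, az = px.z - p.z;
  const bx = py.x - p.x, by = py.y - p.y, bz = py.z - p.z;
  const n = { x: ay * bz - az * by, y: az * bx - ax * bz, z: ax * by - ay * bx };
  const len = Math.hypot(n.x, n.y, n.z) || 1;
  return { point: p, normal: { x: n.x / len, y: n.y / len, z: n.z / len } };
}
```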
In my head, I'm categorizing the techs into 3 different categories, based on the amount of information each has and therefore the quality of the hit test results it can produce:
- "point" - isolated feature points; the least information
- "plane" - logical planar surfaces
- "mesh" - full surface geometry; the most information
Maybe this categorization does not make sense, but it's the main reason why I was thinking about those 3 entity types. I also do not think that all 3 entity types have to be supported by each UA (as in: all UAs need to recognize the enum values, but it is OK if a UA always returns empty results when asked to perform a hit test against an entity type that is not supported by the device it's running on).
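Under that model, apps don't need a capability-query API - they can observe results and fall back. A sketch, assuming session and viewerSpace as before:

```ts
// A UA that doesn't support 'mesh' still accepts the enum value; it just
// never returns results for it, so an empty array is not an error.
async function pickBestResults(session: XRSession, viewerSpace: XRReferenceSpace) {
  const meshSource = await session.requestHitTestSource({
    space: viewerSpace,
    entityTypes: ['mesh'],
  });
  const planeSource = await session.requestHitTestSource({
    space: viewerSpace,
    entityTypes: ['plane'],
  });

  session.requestAnimationFrame(function onFrame(time, frame) {
    // Prefer mesh results when the device can produce them; otherwise
    // fall back to plane results.
    const mesh = frame.getHitTestResults(meshSource);
    const best = mesh.length ? mesh : frame.getHitTestResults(planeSource);
    // ... place content using best[0], e.g. best[0].getPose(refSpace) ...
    session.requestAnimationFrame(onFrame);
  });
}
```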
> Perhaps the two hit-test flavors could be "plane" and "surface", with "surface" generically representing a non-planar hit-test based on any present or future tech?
I still think that including points in either of those 2 categories is not the right choice, so this proposal would mean that it is not possible to perform hit tests against those entities. I think we both agree that we should have "plane" and "surface" (although I prefer the name "mesh" - we can bikeshed that later :) ).
I think there's a disconnect between the way you and I (at least) are thinking about this. My view is that a goal of this proposal is to abstract away the tech and present a solution that will work now and in the future, with "new tech".
So, I would advocate for thinking of this as "structured" vs "unstructured".
We should absolutely NOT be exposing things like "depthmap" vs "mesh". If applications need to do things that are nuanced specifically to work with depth maps or meshes, these things should be exposed via new APIs (like the world geometry API).
Hit test should be more like the "select" and "grasp" APIs, that can be implemented differently on different platforms.
Perhaps we add more structured types in the future, but all the structured ones are optional.
This has nothing to do with "testing". Right now, web pages from 20 years ago might or might not work and look like the authors intended, but they do "something".
Applications created with WebXR should have a chance of doing something reasonable on any XR device with plausibly similar features; perhaps the hit testing doesn't perform exactly as hit testing on the original platform, and perhaps it doesn't perform like the native apps on that platform.
But by sticking with more general ideas, and not getting bogged down in current platform variations, I think we'll have a simpler API that will make more sense across different platforms.
I don't think we can fully abstract the plane and mesh concepts away in the Web API, given that they are supposed to be exposed as part of the real-world geometry work.
Repeating a bit what @bialpio said above, but as a reminder: the reason why we have the three types came from the question of whether hit tests should include points or not. In smartphone AR (Android or iOS), point clouds are created quickly but they are imprecise. Some on the team think that we shouldn't offer hit tests on them, but some think that we should, because getting a plane can take a while depending on how AR-friendly your environment is. It seems that the best solution is to let the website decide whether it prefers quick but imprecise results or slower but precise ones.

Because we will most likely expose planes to the web, and I would assume that the plane object may also be exposed on the hit test result, it felt natural to simply offer filtering of the results prior to the call so the UA doesn't have to make any decision. We could have a property that describes this compromise, but unless we have a very strong reason to believe this is a very short-term solution, I would rather expose the underlying platform concepts. It's also worth noting that the impact of picking a "fast but imprecise" solution may vary with time and technology, so exposing the concepts will create consistency.
Our expectation is that websites will filter with either ['points', 'planes'] or ['planes']. We do not expect websites to filter with only ['points'], and developer documentation should discourage that.
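Concretely, the two expected configurations would read like this (inside some async setup function, with session and viewerSpace in scope; I'm using the singular enum spellings from the draft):

```ts
// Quick but potentially imprecise: points can answer before any plane
// has been detected, at the cost of jitter.
const fastSource = await session.requestHitTestSource({
  space: viewerSpace,
  entityTypes: ['point', 'plane'],
});

// Slower to produce first results, but more stable.
const preciseSource = await session.requestHitTestSource({
  space: viewerSpace,
  entityTypes: ['plane'],
});
```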
The reason why we added meshes is that planes and meshes aren't exactly the same, and especially for HMDs, it's my understanding that the world is represented as a mesh. Chrome doesn't understand meshes, so we would ignore this value for now. We would be happy to have only points and planes, but I don't think that would cover all the options across devices.
I think the distinction we are making between points and planes is at the core of why we can't merge points and meshes, unless we consider the mesh representation to be a quick and imprecise one.
I'll repeat what I said above: fast, stable, stationary, etc. We should pick things that are not unique to phones vs. HMDs, and that could reasonably be implemented across all devices and UAs in a rational way.
I can imagine that some apps might want to somehow differentiate between "wait till we have a nice stable understanding of the space and let the user pick against that" vs. "let them pick anything in the world quickly"... perhaps. But we should leave the UAs with latitude to implement whatever concepts we pick in an optimal way for their platform.
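For illustration, a characteristics-based shape might read like this - everything here is hypothetical; no such options exist in any draft:

```ts
// Hypothetical: the site describes the result characteristics it wants,
// and the UA maps that onto points, planes, meshes, or depth internally.
interface HypotheticalHitTestOptions {
  space: XRSpace;
  // 'fast'   - return something as soon as possible, even if imprecise
  // 'stable' - wait for a well-understood, stationary surface
  resultQuality?: 'fast' | 'stable';
}
```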
So, I'm 100% against "points and planes" and "meshes" for that matter.
I'm not sure that we should never expose anything of the sort, but maybe we can mark the entityTypes part of the spec as at risk and link to this issue? In other words, we can drop this part of the spec and give ourselves more time to discuss what the ideal solution should be. Chrome would be okay shipping without this, with a default behaviour that can't be configured by the website (initially?). I think having this out of the critical path will allow us to think through the different options without time pressure. WDYT?
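With entityTypes dropped, the shipped call would shrink to the unconfigurable form - a sketch:

```ts
// The UA alone decides which underlying tech answers the ray; the site
// just asks for hit test results from the given space.
const source = await session.requestHitTestSource({ space: viewerSpace });
```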
I would be happier if we avoided config for now, so that would be good.
I think it is super important that this and other specs err on the side of cross-platform independence. Having too many platform-specific things means (a) we'll start having content that only runs on a specific platform or subset of platforms, and (b) we'll open the door to more fingerprinting of users.
The latter is a big deal, especially because the security folks on the browser teams (and in the W3C in general) will lose their minds and block these proposals if we do this.
/agenda need to revisit this, especially as ARKit has again expanded the range of possibilities and this is too restrictive.
@AdaRoseCannon somehow the bot didn't catch that^
At a high level, this hit-test module is meant to enable sites to raycast against real-world geometry without dealing with the divergences around specific tracking technologies that may leak through in the full real-world geometry module.
It makes sense that sites would still want to reason about hit-testing against planes vs. hit-testing against meshes, since they have different characteristics:
"plane"
hit tests will provide a more stable normal for placing a medium-sized object"mesh"
hit tests will provide a more locally-accurate normal for placing a small object, or when you know your users will be placing the object on an uneven surfaceHowever, the
"mesh"
use case seems to apply equally to"point"
hit-tests - for both, the app is choosing to hit-test against the full contoured surface of the object rather than an idealized plane. In addition, while planes and meshes are real-world concepts, feature points are an implementation details of some of today's tracking technologies. Will some future LIDAR-based headset need to simulate feature points to keep today's sites happy?Especially for this explicitly-abstracted hit-test API, it would seem that we can get the full developer capability here (deciding on idealized planes vs. full contoured surfaces) without tying sites to today's specific tracking technologies by just subsuming feature-point hit-testing into the
"mesh"
XRHitTestTrackableType
.If we feel that
"mesh"
itself would be too specific on devices that use feature points but don't calculate mesh, we could just replace theentityTypes
array with aplanesOnly
bool. When true, the UA only intersects planes - when false, the UA can also intersect feature points, meshes, or any other way the device has for reasoning about full contoured surfaces.
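A sketch of that alternative - planesOnly is hypothetical and not in the current draft:

```ts
// Hypothetical boolean replacing the entityTypes array.
const planarSource = await session.requestHitTestSource({
  space: viewerSpace,
  planesOnly: true,   // intersect idealized planes only
});

const contouredSource = await session.requestHitTestSource({
  space: viewerSpace,
  planesOnly: false,  // points, meshes, depth maps - any contoured-surface tech
});
```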