immersive-web / webxr

Repository for the WebXR Device API Specification.
https://immersive-web.github.io/webxr/
Other
2.99k stars 386 forks source link

XRFrame attributes for an inactive frame #1207

Open bialpio opened 3 years ago

bialpio commented 3 years ago

During implementation of one of extension modules, we've run into a situation where it seems reasonable to either throw or return a default-constructed value from an XRFrame that is no longer active (examples are XRFrame.trackedAnchors and XRFrame.detectedPlanes, along with already-existing XRFrame.session).

My read on the current behavior is that there is no special-casing of what happens when XRFrame attributes are accessed for an inactive frame, so e.g. XRFrame.session would just return an instance of XRSession irrespective of the frame's state (even when the session has already ended). The XRFrame.trackedAnchors / XRFrame.detectedPlanes are a bit more complicated, and I think we could take one of the 3 approaches here:

  1. Throw an InvalidStateError when those attributes are accessed on inactive frame. For anchors API, this would be a breaking change. Moreover, apps may not expect that accessing an attribute could throw.
  2. Return an empty setlike when those attributes are accessed on inactive frame. As with above, this would be a breaking change for anchors API, but avoids a potential footgun, described in the next paragraph.
  3. Keep returning a setlike with the same contents irrespective of whether the attributes are accessed on an inactive frame. This is what is currently specified (by omission), but exposes a potential footgun.

Ideally, approach 3 is something we should go with, but the problem with this is that the instances present in the setlike are all "live" objects that could be changing, so the applications would have no guarantee about the state of those objects when inspecting them outside of an active frame. This is less of a concern for anchors since they only contain XRSpace that is useable only during active frames, but is very relevant for planes. The only useful information the apps would be able to get would be to observe how the composition of the sets have been changing over time (e.g. they'd be able to inspect which anchor objects have been present in frame_n_minus_ten.detectedPlanes set & see if those same objects are in frame_n.detectedPlanes set). Accessing a specific attribute on the "live" objects would return the most recently available data for that object, not the historic data that the app could reasonably expect to get (hence, The Footgun).

So the specific question is: what do people think the approach to the above problem for those attributes should be? Did I miss some other viable option?

And more general question is: maybe we should start treating inactive XRFrames as defunct objects which do not return anything useful? Is there any use case that I am missing that actually takes advantage of inactive XRFrames, or is this just a side-effect of having to operate in JavaScript which allows the apps to store a reference to XRFrame in a variable that's accessible in a wider scope, outside of our callbacks? If the latter, it may actually be viable to throw on inactive XRFrame attribute access.

Maksims commented 3 years ago

Seems like there is no holy-grail solution here, and it varies on the scale of giving to devs responsibility (unsafe) or prevent them from shooting the foot (safe).

Unsafe:

XRFrame provides pointers to objects, arrays, sets, etc. Which are "live" objects. If the reference is the same across the frames, then I believe it is not a problem, that data on those objects is updated to the most recent. But it is not historical (unsafe).

As mentioned before: if the spec states clearly that XRFrame should be used only within the same tick, and not persisted for long-term use, does not restrict devs from misuse though.

Safer:

Another option would be to make XRFrame defunct as you've mentioned and nullify all the pointers to live objects. Making it clear that data is not available anymore. This leads to code branching, and sometimes not obvious logic "why it is null now?".

Safe:

Throw an exception when accessing live-object references on outdated XRFrame, it would also educate through exception message, basically: don't use an outdated XRFrame. This prevents code branching and helps devs earlier. The question is: when exactly it becomes "outdated". This also might lead to a more need to store own references to data.

All options seem fine 😄 I like an unsafe option, as it points to most recent data, and historical - in all cases needs to be done by application logic. Also, this avoids code branching and exception-catching.

cabanier commented 3 years ago

...

  1. Throw an InvalidStateError when those attributes are accessed on inactive frame. For anchors API, this would be a breaking change. Moreover, apps may not expect that accessing an attribute could throw.
  2. Return an empty setlike when those attributes are accessed on inactive frame. As with above, this would be a breaking change for anchors API, but avoids a potential footgun, described in the next paragraph.
  3. Keep returning a setlike with the same contents irrespective of whether the attributes are accessed on an inactive frame. This is what is currently specified (by omission), but exposes a potential footgun.

In retrospect, it was a mistake to have these as attributes and we should have a function instead or they should be on XRSession. Is it too late to make this change? If it's too late, I'd vote for option 3 as it's wrong to access an XRFrame (or hold onto a copy) outside of the Raf call.

Ideally, approach 3 is something we should go with, but the problem with this is that the instances present in the setlike are all "live" objects that could be changing, so the applications would have no guarantee about the state of those objects when inspecting them outside of an active frame.

It's weird the that arrays are live objects. I see that they are defined as sameObject in the spec but not in Chromium's IDL. Maybe we can fix the spec to match the implementation?

And more general question is: maybe we should start treating inactive XRFrames as defunct objects which do not return anything useful? Is there any use case that I am missing that actually takes advantage of inactive XRFrames, or is this just a side-effect of having to operate in JavaScript which allows the apps to store a reference to XRFrame in a variable that's accessible in a wider scope, outside of our callbacks? If the latter, it may actually be viable to throw on inactive XRFrame attribute access.

I think it's the latter. For instance, we already recycle XRFrame objects to avoid GC.

Manishearth commented 3 years ago

I discussed this with @toji today and he has some thoughts about this he plans to post but just to clarify one point: the reason active/valid/etc exists and why attributes usually do not error on invalidity is that you can always take the thing the attribute produces, hold on to it, and use it later.

The general pattern is that methods should error on invalidity, attributes should not error on invalidity but if you want them to be invalid when the producing object is invalid they should store a reference to the object (e.g. the XRFrame or XRSession) and have its methods check validity/activity as necessary. This is why, for example, XRSpace ensures XRSession is valid when you try to do anything with it.

cabanier commented 3 years ago

If you tear it off, you don't expect that it will change over time. Shouldn't you get a copy of the container when you access the attribute?

bialpio commented 3 years ago

It's weird the that arrays are live objects. I see that they are defined as sameObject in the spec but not in Chromium's IDL. Maybe we can fix the spec to match the implementation?

Sorry, what I meant was that the objects contained in the arrays (well, sets in this case) are live, but the containers themselves are not live.

I think it's the latter. For instance, we already recycle XRFrame objects to avoid GC.

Ouch, in that case those (detectedPlanes, trackedAnchors) attributes cannot be marked as [SameObject] in the spec. :frowning_face:

cabanier commented 3 years ago

It's weird the that arrays are live objects. I see that they are defined as sameObject in the spec but not in Chromium's IDL. Maybe we can fix the spec to match the implementation?

Sorry, what I meant was that the objects contained in the arrays (well, sets in this case) are live, but the containers themselves are not live.

I think it's the latter. For instance, we already recycle XRFrame objects to avoid GC.

Ouch, in that case those (detectedPlanes, trackedAnchors) attributes cannot be marked as [SameObject] in the spec. ☹️

I see that the change to re-use XRFrame went in with #1088 but it is not in the spec. @toji was this removed again at some point? The Oculus browser is currently reusing this object between frames.

bialpio commented 3 years ago

The XRFrames can / are recycled I believe - the spec mandates recycling by reusing animation frame that is set on an XRSession.

Alright, so given the above, I think we are forced to go with option 3: we will keep returning a collection that has the same contents for as long as it is possible (so when an XRFrame is recycled, the app will start getting latest state). It'd mean that when the attributes are queried from an inactive frame, the app will get the state that was valid as of last time that particular frame instance was active. This leaves the footgun in place, so I wouldn't mind just saying that all accesses (be it a method call or attribute access) on inactive XRFrames throw InvalidStateErrors (or at least log a diagnostic to console) to save the devs potential grief if others think it may be a good idea to do.

toji commented 3 years ago

So in looking through this with Manish earlier, a distinction that I noticed between trackedAnchors and detectedPlanes was that with anchors the developer is responsible for creating them and thus could hold on to the object for as long as they wanted regardless of what the frame does. The only thing that trackedAnchors indicates is which of those previously created anchors were "seen" that frame. To that end the trackedAnchors setlike is an accurate indication of the state of the frame, and the fact that the anchors themselves may have shifted in the meantime isn't a huge concern.

What makes detectedPlanes different, in my mind, is that it appears to be the only way that planes are surfaced. You can't have gotten a reference to a plane independent of the frame it seems (if I misread the spec please let me know!) so we don't have as much of an obligation to keep those plane objects "live". Looking at the contents of the XRPlane object we can re-use the XRSpace each frame, because there's no way to observe it's internals outside of a frame anyway, but the polygon, orientation, and lastChangedTime may be better served by simply returning a new struct every frame? Yes, the GC implications of that aren't my favorite but at least it avoids the issues being discussed here.

TL;DR: If an API object is created external to the frame then there's not much sense in pretending it's not a live object and the frame can reference it as needed. If the API object is only surfaced by the frame, though, it may be best to treat them as immutable.

I think we are forced to go with option 3: we will keep returning a collection that has the same contents for as long as it is possible.

This is still practical, even given the above. If the XRPlanes are treated as immutable then it doesn't matter if you return the same instance when the values haven't changed.

bialpio commented 3 years ago

You can't have gotten a reference to a plane independent of the frame it seems (if I misread the spec please let me know!) so we don't have as much of an obligation to keep those plane objects "live".

That's correct.

Looking at the contents of the XRPlane object we can re-use the XRSpace each frame, because there's no way to observe it's internals outside of a frame anyway, but the polygon, orientation, and lastChangedTime may be better served by simply returning a new struct every frame?

The main reason for returning "live" objects instead of immutable object every time we're being asked for it is that we'd then have to expose new attribute (id?) on it to be able to associate objects returned from frame N with objects in frame N+1. Returning a live object gives this to us for free - apps can use === to check if the objects are representing the same logical plane, & the objects can be added to sets and set membership will work as expected. It also means that similar patterns that devs apply to anchors could still mostly work (citation needed) for planes (anchors can only appear due to dev-initiated action, but can disappear at any time, just like planes).

Quick example:

// ... in rAFcb ...
let previous_planes = ...; // obtained in previous rAFcb or input source event
let some_interesing_plane = previous_planes[i];

if (frame.detectedPlanes.has(some_interesting_plane)) ... // yay, a plane we want is still present!

Compare to:

// ... in rAFcb ...
let previous_planes = ...; // obtained in previous rAFcb or input source event
let some_interesing_plane = previous_planes[i];

let current_plane_ids = new Set(frame.detectedPlanes.map(plane => plane.id)); // boilerplate?
if (current_plane_ids.has(some_interesing_plane.id)) ... // yay, a plane we want is still present!

Additionally, polygon may remain unchanged across frames, so returning immutable objects will force us to create quite a lot short-lived instances that would not be necessary otherwise.

(feel free to move this part of the discussion to plane detection repo)

In any case, I think we're fine leaving the XRFrame spec'd the way it is now in the core spec (seems like it's equivalent to option 3). I'll need to make sure the anchors and planes specs have their frame update algorithms written correctly. The things that I'd propose we consider for core spec would be:

Manishearth commented 3 years ago

/agenda to figure out what to do here

bialpio commented 3 years ago

From today's meeting: I think this issue is no longer that big of a concern given that the spec mandates that the XRFrame is recycled, and other comments in this thread. At the very least, putting my "implementer" hat on, I know how to make progress to ensure the implementation is compliant.

Putting on my "spec editor" hat: to reiterate my previous reply, I think we should consider adding a recommendation ("SHOULD") for the user agents to log a diagnostics message when a live object is accessed while a frame is inactive. We guarantee that objects are in "valid, but unspecified state" when the frame that produced them is not active, but accessing them can lead to interop risks I think.