Closed cabanier closed 3 years ago
IMO, rather than the NOTE we should modify the algorithm, adding a new step 8/10 along the lines of:
foreach |joint| in |jointspaces| :
if the user agent cannot determine the |pose| of |joint| return false and stop processing these steps.
The last step becomes simply "return true", and the current 8.3/10.3 step can be removed.
IMO, rather than the NOTE we should modify the algorithm, adding a new step 8/10 along the lines of:
foreach |joint| in |jointspaces| : if the user agent cannot determine the |pose| of |joint| return false and stop processing these steps.
The last step becomes simply "return true", and the current 8.3/10.3 step can be removed.
Unfortunately, the joints don't all have to belong to the same hand for fillJointRadii
For fillPoses
they don't even have to be joint spaces.
I see, so the joints in fillJointRadii/fillJointPoses aren't necessarily all of the joints for a hand or all from the same hand; I think it's still better to put an explicit check in the algorithm to call out something along the lines of validating that all other poses from the hand that |joint| belongs to are valid before providing the data (e.g. as an addition to determining if a pose can be provided for the joint); especially since I believe NOTEs like what you've added are non-normative. (Unless that is your intention).
I see, so the joints in fillJointRadii/fillJointPoses aren't necessarily all of the joints for a hand or all from the same hand; I think it's still better to put an explicit check in the algorithm to call out something along the lines of validating that all other poses from the hand that |joint| belongs to are valid before providing the data (e.g. as an addition to determining if a pose can be provided for the joint); especially since I believe NOTEs like what you've added are non-normative. (Unless that is your intention).
A normative section higher up defines this:
When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
So the spec already defines this behavior; it just wasn't obvious.
That's fair, my feedback is mainly based on how much back/forth I had to do through the spec to understand what was happening here when I was attempting to review it this morning to understand the behavior. (e.g. We have one line about it, but then don't reference this requirement in any algorithms about querying or updating the data). Even then, I didn't think that fillJointPoses/fillJointRadii was either incomplete or referring to joints from different hands.
IMO, the getJointPose algoritms should be redefined to clearly state, that null should be returned if pose of any joint on the same hand cannot be determined. Then, the two other algorithms should be redefined in terms of getJointPose
- Notes should not introduce normative requirements. We should update the algorithms to do exactly what we want.
There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose. The note is just to clarify that this is implicitly used by the algorithm.
- For fillJointRadii - if user can request radii for a subset of joints, should we consider all other (not requested) joints from that hand when deciding whether return NaN or not?
Yes. The normative text defines this:
When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
This should be clarified in the algorithm. The same for fillPoses - should we consider all spaces of a hand (even those that are not requested by caller), when deciding whether to return 16 NaNs or valid transforms?
Yes but again, implied by the normative text. If we add sorting by hand to the algorithm, it will become much more complicated.
Having read the Editor's Draft version I see it is clearer than the FPWD. Still, I would propose that instead of the Notes we add a specific algorithm step. Something like this perhaps: https://github.com/immersive-web/webxr-hand-input/commit/50f63061e561b450b934602a905cb56f67137e1c
I wanted to comment on these two bits:
A normative section higher up defines this:
When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
So the spec already defines this behavior; it just wasn't obvious.
If we add sorting by hand to the algorithm, it will become much more complicated.
I don't think the adding the note here makes it obvious, and I actually strongly disagree that it makes the algorithm more complicated, to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec. I don't think we necessarily need to sort by hand, but joints should have a reference back to their hand that we can use to determine if all other joints on that hand can be located. Essentially, I think we need some explicit algorithm for:
There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose.
Perhaps as a compromise we could add/update a hand update algorithm that sets all poses to null if any pose on the hand is null, that way even if it was theoretically able to locate the pose the UA should have "nulled out" that pose for the frame.
We may even want to strengthen the text to be a bit more explicit and change:
When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
To:
When a hand is partially obscured the user agent MUST either emulate the obscured joints, or report null poses for all of the joints.
So the spec already defines this behavior; it just wasn't obvious. If we add sorting by hand to the algorithm, it will become much more complicated.
I don't think the adding the note here makes it obvious, and I actually strongly disagree that it makes the algorithm more complicated, to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec. I don't think we necessarily need to sort by hand, but joints should have a reference back to their hand that we can use to determine if all other joints on that hand can be located.
They have a reference back to their hand:
Every XRJointSpace has an associated hand, which is the XRHand that created it.
Essentially, I think we need some explicit algorithm for:
There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose.
hmm, I think we need to patch populate the pose to do this. I'll make an attempt.
We may even want to strengthen the text to be a bit more explicit and change:
When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
To:
When a hand is partially obscured the user agent MUST either emulate the obscured joints, or report null poses for all of the joints.
That sounds good to me. The 2 "may"'s are indeed confusing
to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec
+1! I must agree. The algorithms in spec should be painfully explicit. That's what they are for. Changing their behaviour with some context in the prose of the spec that is not immediately available adds confusion and will cause interoperability problems.
@alcooper91 @jmajnert I hoisted the populate the pose
algorithm into the spec and patched it.
I'm not 100% happy with it but I can't see any other way since the regular WebXR methods also need to account for this joint behavior.
Or to modify the WebXR spec to accept extensions? Essentially all you've done is added a line in the middle of the algorithm, so perhaps the main algorithm could be patched with something like:
I think that would be ideal. We did something similar for Layers. Maybe we can submit the current work but file an issue against WebXR at the same time. I believe @toji and @Manishearth are working on getting to CR so we can't make changes right now. Maybe we should discuss if we should fork the spec to level 2 so we can still make changes.
That's fair, and I'll defer to @toji and @Manishearth on that.
Would something like this be possible as a workaround to not have to copy the full text? Or I guess that technically leaves a hole for getting the poses in the reference space...
I definitely agree that it's not ideal. Would it be possible to do something like:
Algorithm: Populate Pose for XRHand 1. Let |pose| be the result of running |Populate the pose| for space in base space 2. If |space| is a {{XRJointSpace}}....
Would something like this be possible as a workaround to not have to copy the full text? Or I guess that technically leaves a hole for getting the poses in the reference space...
You mean, only patch it for the methods in the hands spec? If so, then it wouldn't apply for the getPose function the WebXR spec.
I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?
Sorry to be cutting in, but have you seen my proposition here: https://github.com/immersive-web/webxr-hand-input/commit/50f63061e561b450b934602a905cb56f67137e1c ?
I think it cheaply (without changes to other specs) solves our problems.
I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?
I will make it call that algorithm.
I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?
I will make it call that algorithm.
Unfortunately, I couldn't use that so I had to patch it in place. @jmajnert, I reformatted your proposal a bit. Let me know if you prefer your approach.
I don't understand why we need to patch the algorithm here; this hook already exists in the form of the definition for native origin: each space defines how its native origin behaves. That's what currently exists in the spec, with the wording of:
The native origin of the XRJointSpace may only be reported when native origins of all other XRJointSpaces on the same hand are being reported. When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
except that MAY should probably be a "MUST either" as you've fixed in this PR. We could tighten up the wording to stop saying "reported" here and define the native origin as this only-when-everyone-else-exists quantity.
I think a way of clarifying might be to add a note on the populate the pose algorithm in the core spec that native origins may have conditions in which they are or aren't shown and to check their definition.
I don't understand why we need to patch the algorithm here; this hook already exists in the form of the definition for native origin: each space defines how its native origin behaves. That's what currently exists in the spec, with the wording of:
The native origin of the XRJointSpace may only be reported when native origins of all other XRJointSpaces on the same hand are being reported. When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.
I agree as well. Patching the whole algorithm is ugly and IMO just as confusing. I will take it out of the PR. The notes should clarify and we should probably also add some example code.
The reason why we have this PR is because reviewers of a change on Chromium gerrit found the specification to be self contradicting: https://chromium-review.googlesource.com/c/chromium/src/+/3025151/comments/bfb1865f_07c0e6db Let me quote:
1) https://www.w3.org/TR/webxr-hand-input-1/#xrjointspace-interface States that if not all XRJointPoses can be identified/emulated, then all need to be reported as null. 2) https://www.w3.org/TR/webxr-hand-input-1/#dom-xrframe-filljointradii and https://immersive-web.github.io/webxr-hand-input/#dom-xrframe-fillposes both contradict this, by points 8.3/10.3 respectively, which both should probably read something like "If the pose can't be found, zero out any pre-filled data and abort" (Or adding a pre-cursor step to check for that data and abort).
Again: they read the requirement that "all or nothing" is reported and found it to contradict the algorithms. They did not take the requirement as being an amendment to the algorithms. And that's how people will read it IMHO.
I too had a hard time understanding what the specification really wanted. I thought that the whole reason of introducing algorithms in W3C specifications was to clearly and unambiguously define the expected behavior, without the need to process all of the prose of the various parts of the document and glue them together.
That being said, I'm not an expert here. If you think that the PR as it stands clarifies the expected behavior, then so be it.
I too had a hard time understanding what the specification really wanted. I thought that the whole reason of introducing algorithms in W3C specifications was to clearly and unambiguously define the expected behavior, without the need to process all of the prose of the various parts of the document and glue them together.
No, you need to take all the prose into account when implementing a specification. For instance, in this case the decision boils to the prose in WebXR that states "Query the XR device's tracking system". The normative text in WebXR Hands defines that this should fail if any of the joints can't be tracked or emulated.
It's true that going three steps deep isn't always obvious which is why we add informative notes. Examples will also help.
For instance, in this case the decision boils to the prose in WebXR that states "Query the XR device's tracking system". The normative text in WebXR Hands defines that this should fail if any of the joints can't be tracked or emulated.
This is correct.
Closes #91
Preview | Diff