KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

Need to improve `cubicspline` interpolation for rotations #2008

Closed jbherdman closed 2 years ago

jbherdman commented 3 years ago

In the current spec, Appendix C indicates that cubicspline interpolation of quaternions is performed as if the quaternions were simply 4D vectors, and the resulting 4D vector is then renormalized so that it becomes a unit quaternion. This interpretation is further clarified in the closed issue #1386.
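For readers unfamiliar with that behaviour, here is a minimal sketch of the per-segment evaluation: a cubic Hermite spline applied component-wise to the quaternion as a 4D vector, followed by normalization. The function name `cubicspline_rotation` and its argument names are hypothetical and chosen for illustration; only the Hermite basis and the tangent scaling by the key-time delta are taken from the glTF cubicspline definition.

```python
import math

def cubicspline_rotation(q0, out_tan0, q1, in_tan1, t0, t1, t):
    """Interpolate between key 0 and key 1 as 4D vectors, then renormalize.

    q0, q1     : key values (x, y, z, w)
    out_tan0   : out-tangent stored with key 0
    in_tan1    : in-tangent stored with key 1
    t0, t1, t  : key times and the sample time, with t0 <= t <= t1
    """
    td = t1 - t0
    s = (t - t0) / td
    s2, s3 = s * s, s * s * s
    # Cubic Hermite basis functions.
    h00 = 2 * s3 - 3 * s2 + 1
    h10 = s3 - 2 * s2 + s
    h01 = -2 * s3 + 3 * s2
    h11 = s3 - s2
    # Component-wise blend; tangents are scaled by the segment duration.
    v = [h00 * a + h10 * td * b + h01 * c + h11 * td * d
         for a, b, c, d in zip(q0, out_tan0, q1, in_tan1)]
    n = math.sqrt(sum(x * x for x in v))
    if n < 1e-12:
        raise ValueError("interpolated vector is too close to zero to normalize")
    return [x / n for x in v]

# Example: identity to a 90 degree rotation about Z, with zero tangents.
q_start = [0.0, 0.0, 0.0, 1.0]
q_end = [0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4)]
zero = [0.0, 0.0, 0.0, 0.0]
halfway = cubicspline_rotation(q_start, zero, q_end, zero, 0.0, 1.0, 0.5)
```

Note that nothing in this evaluation knows about rotations until the final normalization step, which is the property the rest of this issue is concerned with.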

This decision on interpolation logic is a problem, because it does not match the de facto industry standard of sqlerp() as introduced by Shoemake in 1985. (Note that squad(), another de facto standard introduced by Shoemake in 1987, is basically just sqlerp() with specific rules about how to auto-compute "smooth" interior-tangents between quaternion keyframes. We can side-step that added complexity by continuing to rely on explicit in-/out-tangents written to the glTF file.)

Looking at how quaternion interpolation is currently specified for glTF, the closest example that I can find is Blender's system. This has been termed "RQBez" ("Renormalized quaternion Bezier curve") in literature. However, it should be noted that the math from Appendix C does not fully replicate Blender's runtime system. There are corner-cases that exist in the Blender code which are not shared with the glTF interpolation (flipping to "shortest arc" while interpolating, for example). That is entirely reasonable, but it means that even "Blender quaternion animation" cannot be directly exported to the glTF interpolation model.

The better solution would be to adopt the de facto standard of sqlerp(). This would allow most existing 'upstream' programs to output their native animation directly (other than upstream programs that rely on Euler angles for rotation animation). This leads to smaller file sizes, as rotation-animation will generally not need to be 'resampled' for export. It also avoids the 'RQBez' error-case when the interpolated 4D-vector passes too close to (0,0,0,0).
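The (0,0,0,0) error case can be illustrated with a deliberately contrived pair of keys. The sketch below evaluates the unnormalized Hermite blend for two keys that are exact negatives of each other (q and -q encode the same rotation) with zero tangents. The helper names `hermite_vec4` and `length` are hypothetical, and the key values are invented purely for illustration.

```python
import math

def hermite_vec4(p0, m0, p1, m1, s):
    """Unnormalized cubic Hermite blend on the unit interval (0 <= s <= 1).

    m0 and m1 are assumed to be already scaled by the segment duration.
    """
    s2, s3 = s * s, s * s * s
    h00 = 2 * s3 - 3 * s2 + 1
    h10 = s3 - 2 * s2 + s
    h01 = -2 * s3 + 3 * s2
    h11 = s3 - s2
    return [h00 * a + h10 * b + h01 * c + h11 * d
            for a, b, c, d in zip(p0, m0, p1, m1)]

def length(v):
    return math.sqrt(sum(x * x for x in v))

# Two keys encoding the same rotation with opposite signs.
key_a = [0.0, 0.0, 0.0, 1.0]
key_b = [0.0, 0.0, 0.0, -1.0]
zero = [0.0, 0.0, 0.0, 0.0]

mid = hermite_vec4(key_a, zero, key_b, zero, 0.5)
mid_length = length(mid)  # the blend collapses to the zero vector here
```

At the midpoint both keys receive a weight of 0.5, so the blend cancels exactly and there is no direction left to normalize. An exporter can avoid this specific case by keeping neighbouring keys in the same hemisphere, but large tangents can still pull the curve close to the origin.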

The main problem, of course, is that this ship has already sailed. In theory, the current cubicspline behaviour for rotation is likely fixed in stone. In practice, I would point out that very few public projects appear to properly support the existing cubicspline interpolation. Presumably, though, the most prudent course of action would be to add a new interpolation-type, rather than changing the existing cubicspline logic.

lexaknyazev commented 3 years ago

Technically, schemas are more fixed than the appendix; to add a new enum, we'd need to define an extension. However, given that

> very few public projects appear to properly support the existing cubicspline interpolation

it may be possible to somehow expand the existing language around splines.

@bghgary could you please look at #2009?

bghgary commented 3 years ago

This has to be added as an extension, as @lexaknyazev says, since adding it to the existing spec would create incompatibilities. For some minimal context, we originally got this normalized Hermite Quaternion spline interpolation from Unity. I agree it's not straightforward for an exporter to export into this form if the source data doesn't already match it, which is likely in many cases. I don't think this is a problem by itself, though. glTF is trying to be the last-mile format and tries to decrease the burden at runtime, which may increase the burden on export. I would expect the Quaternion Slerp interpolation to be more expensive at runtime compared to the existing way. There are trade-offs between transfer size, runtime cost, accuracy, numeric issues, etc.

As for the PR, we need the new enum and new text to be in an extension. We can review once this is done.

donmccurdy commented 3 years ago

> very few public projects appear to properly support the existing cubicspline interpolation

Do you mean that few authoring tools export animation with cubicspline interpolation? The test cases we have for this interpolation type do appear to be well supported in viewers and engines.

As for authoring tool adoption, there are additional reasons (e.g. IK constraints) that DCC tools like Blender usually bake animation. I'm not confident that adding a new interpolation method will eliminate the need to bake animations, or at least I'd be interested to hear from authors of those DCC exporters. If DCC tools won't in practice adopt this, it may be something to consider in a future version of glTF rather than as an addition to the v2 lifecycle.

jbherdman commented 3 years ago

In response to @bghgary:

I have been trying to find any reference to Unity using the same Hermite Quaternion interpolation, but I can't seem to find anything. The main references I can find seem to imply that Unity uses the same slerp/sqlerp interpolation as the rest of the industry.

I completely understand that glTF is a last-mile format, but the trade-offs are not clear-cut in this case. The cost of Quaternion Slerp/SQlerp is unlikely to make much of a measurable difference at runtime. What is likely to make a larger difference is that current exporters mostly just give up and output "key per frame" linear animation, rather than trying to deal with the current glTF cubicspline behaviour. This leads to larger file sizes (potentially generating 10x or more keyframes), and then the runtime is paying for 'slerp()' anyway for the linear keys, bypassing the 'RQBez' interpolation entirely. (Note that 'sqlerp()' is basically just 3x 'slerp()' calls chained together, in terms of runtime cost.)
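To make the "3x slerp" cost concrete, here is a rough sketch of a squad-style evaluation built from three `slerp()` calls. This thread does not spell out the exact `sqlerp()` formula, so the form used here, slerp(slerp(q0, q1, t), slerp(c0, c1, t), 2t(1-t)), is an assumption based on the squad construction. The control quaternions `c0` and `c1` and the shortest-arc flip inside `slerp` are likewise illustrative choices, not something the thread confirms.

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between unit quaternions a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    if dot < 0.0:
        # Assumption: take the shortest arc by flipping one endpoint.
        b = [-y for y in b]
        dot = -dot
    dot = min(dot, 1.0)
    theta = math.acos(dot)
    if theta < 1e-6:
        # Nearly identical inputs: fall back to a plain lerp.
        v = [x + t * (y - x) for x, y in zip(a, b)]
    else:
        s = math.sin(theta)
        wa = math.sin((1.0 - t) * theta) / s
        wb = math.sin(t * theta) / s
        v = [wa * x + wb * y for x, y in zip(a, b)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def sqlerp(q0, q1, c0, c1, t):
    """Squad-style blend: two inner slerps combined by a third."""
    return slerp(slerp(q0, q1, t), slerp(c0, c1, t), 2.0 * t * (1.0 - t))

# Example: with the control quaternions equal to the keys,
# the curve reduces to an ordinary slerp between the keys.
q0 = [0.0, 0.0, 0.0, 1.0]
q1 = [0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4)]
sample = sqlerp(q0, q1, q0, q1, 0.5)
```

Every intermediate result stays on the unit sphere, so this form has no equivalent of the near-zero normalization case, at the price of trigonometric calls per evaluation.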

In my experience, attempting to re-encode or compress rotation animation is largely a losing game. It is much better to take the original data whenever you can (or as close as you can), as the artist has control over its correctness. When you have skeletal animation, for example, the small errors from keyframe-reduction/curve-fitting tend to accumulate down the hierarchy. Historically, this leads to needing "other fixes" at runtime, such as using IK to ensure proper foot/hand positioning. (Or, at export time, flattening the hierarchy and baking all transforms/animation down to world-space, to minimize hierarchy-related issues.)

Depending on how one might choose to do curve-fitting to reduce the 'RQBez' curves, the rotation-error introduced is likely higher-than-expected in a lot of cases. There is a non-linear relationship between the component-curve values and the actual rotation.

My core argument, basically, is that in this case something being "less burden on the exporter" does not necessarily make it "more burden on the runtime." In a perfect world, I think glTF should have originally chosen 'sqlerp' as its interpolation here, as it would likely have been a win-win for all involved.

That said, I can respect the point that backwards-compatibility to the existing spec may be the primary consideration for any changes. I will look at rewriting my proposal as an extension.

jbherdman commented 3 years ago

> > very few public projects appear to properly support the existing cubicspline interpolation

> Do you mean that few authoring tools export animation with cubicspline interpolation? The test cases we have for this interpolation type do appear to be well supported in viewers and engines.

Sorry, could you point me in the direction of those test cases? I had looked around a while back, but hadn't found much.

I wrote some code to generate a handful of simple test files, and from what I saw only BabylonJS and 'glTF-Sample-Viewer' were properly following the expected behaviour for cubicspline rotations in all cases. Non-rotation cubicspline seemed to be well-supported in other viewers; it was just the rotation-cases that were problematic.

I will go find some of my files that demonstrate the problems I was seeing, and post them here. (Not all my generated test files demonstrated problems, so I need to verify which ones did.)

> As for authoring tool adoption, there are additional reasons (e.g. IK constraints) that DCC tools like Blender usually bake animation. I'm not confident that adding a new interpolation method will eliminate the need to bake animations, or at least I'd be interested to hear from authors of those DCC exporters. If DCC tools won't in practice adopt this, it may be something to consider in a future version of glTF rather than as an addition to the v2 lifecycle.

Speaking as someone who has been tasked with writing a commercial-grade DCC-to-glTF exporter, my proposed 'sqlerp()' interpolation is the best "lowest common denominator" available. A typical DCC exporter would just need to detect "unsupported" elements in the scene (such as IK, etc), and fall back to baking animation "as needed" rather than "always". Some DCC programs are not going to benefit, because their original animation is in Euler angles (or because of other core limitations).

The program my data is coming from optionally supports "baking" unsupported animations down to Bezier curves (rather than Linear samples), but we rely on the industry-standard 'sqlerp()' interpolation. Actually, we rely on 'squad()', for the sake of FBX + 3DSMAX compatibility, but we can write out those "implicit" tangent-values easily enough for glTF.

My own personal opinion was to just forget about cubicspline and only write Linear keys for rotation-animation out to glTF, but my boss asked me to pursue the issue, so here we are.

jbherdman commented 3 years ago

Side question, while we're here: Is there some known/good code that an exporter can use to convert a series of "rotation samples" into a relatively compact set of rotation-keyframes for glTF usage under the existing cubicspline ("RQBez") interpolation?

jbherdman commented 3 years ago

rotation_test02.zip

Here is an extremely simple example file, which fails under several glTF viewers. (As stated, 'BabylonJS' + 'glTF-Sample-Viewer' seem to properly interpret this file, but other viewers do not.)

For debug/tracing purposes, there is "extras" data on each anim-sampler node, outlining what keys were generated for each channel. This is purely for human-readability of the animation data.

There are two objects that share the same rotation-animation. The "top" object ("Box.Orig") has a few simple quaternion/cubicspline keys, and the "lower" object ("Box.Sampled") has the same animation but explicitly evaluated according to the current glTF spec rules, and output as linear samples per "frame" (at 30fps, if memory serves).

jbherdman commented 2 years ago

I have repackaged my proposed change as an extension.

However, I am not really familiar with "JSON Schema" definitions, so perhaps if everything else is in order, somebody can help with that aspect?

bghgary commented 2 years ago

The extension looks okay to me. In addition to the schema, we need multiple clients to implement it (since it's marked as EXT) and then ideally do some analysis on the differences in behavior (perf, transfer size, etc.).

jbherdman commented 2 years ago

Sorry for the delay on my end. I have finally reached a point where my software has full import/export support for my proposed EXT_animation_sqlerp extension. After some extensive local testing, it seems like the extension is not going to meet my company's goals.

The problem is primarily with our own software, which as I mentioned is based on squad() (essentially sqlerp() using specific auto-computed tangents). This restriction means that our software is essentially trying to curve-fit using only "smooth" Bezier tangents rather than using custom/explicit tangents. The results are predictably less than optimal. Unfortunately our hands are currently tied, in terms of expanding our internal representation/logic away from squad().

Even with those self-imposed restrictions, the squad()-based solution can reduce the number of keyframes output by 10%-20% for large animations (vs linear/slerp interpolation). The problem is that Bezier keys take 3x the storage space of Linear keys, so our file sizes actually increase if we output using the extension (compared to just baking and optimizing Linear keys).
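The size trade-off can be checked with a little arithmetic. The sketch below assumes uncompressed float data, one time value per key, 4 floats per linear rotation key, and 12 floats per spline key (in-tangent, value, out-tangent), matching the "3x" figure above. The key count of 1000 and the helper names are made up for illustration.

```python
def sampler_floats(num_keys, spline):
    """Floats stored for one rotation sampler: 1 time value plus the output data."""
    output_floats = 12 if spline else 4  # in-tangent + value + out-tangent vs. value only
    return num_keys * (1 + output_floats)

def size_ratio(reduction, linear_keys=1000):
    """Spline size divided by linear size when the spline needs fewer keys."""
    spline_keys = round(linear_keys * (1.0 - reduction))
    return sampler_floats(spline_keys, True) / sampler_floats(linear_keys, False)

for reduction in (0.10, 0.20):
    print(f"{reduction:.0%} fewer keys -> {size_ratio(reduction):.2f}x the linear size")

# Fraction of the linear key count at which both encodings cost the same.
break_even_fraction = sampler_floats(1, False) / sampler_floats(1, True)
```

Under these assumptions the spline encoding only wins once it needs fewer than about 38% of the linear key count (5/13), which a 10%-20% reduction does not come close to.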

We could arrive at a more compact file size by allowing glTF viewers to auto-compute the squad() tangents, but that would be far too much burden on downstream glTF loaders/viewers.

I'm going to close this issue and related PR, as I no longer have direct need for this feature, but the overall approach is feasible if anyone needs it in the future.

Thanks for the help and feedback, everyone.