Closed greggman closed 3 months ago
Merits of consistency aside, as you can imagine this would be a pretty substantial change.. 🙂
As a more feasible short-term step, would you be willing to have a look at #1301 (proposed KHR_property_animation
) and the use of JSON pointers there? That seems like a good opportunity to start in this direction, and feedback would be helpful.
The glTF Validator and the VSCode extension both process JSON pointers, so there's some precedent there.
That said, I think there are issues here that are being overlooked. A node's children, for example, can only be other children, but by placing "key": "node"
on each child reference, a couple of problems emerge:
We've opened the door to create invalid glTF files since we've placed a multiple-choice field in a spot where there can only ever be a single correct answer.
People learning glTF by hand-reading samples may be grateful to know that each child of a node is itself specified as being a node, but inevitably some will wonder what other values they could place there. They may infer that children can be something other than child nodes.
People hand-editing glTF, either for educational purposes or for more fine-tuned control than automated exporters can provide, will be tempted to edit this field, and can only ever get it wrong.
Basically we've made life a little easier for "generic" readers that are trying at a high level to parse glTF without detailed knowledge of the glTF schema, but at the expense of creating dozens of new fields that only have a single correct answer.
In practice (for example, in the glTF reference viewer, and Cesium's implementation, among others), we don't see a lot of huge switch statements around this glTF ID. To stick with the node/child example here, the code that looks through the node's child list is only looking for child nodes, it doesn't need a switch statement for anything else. In fact if there were a ref
parameter per child, then a new switch statement might get introduced, calculating what to do if any child wasn't itself a node.
I do have a piece of code that would be simplified by this change, the onDefinition
provider (F12 handler) in VSCode, has a bunch of code that does exactly what you describe, figuring out where it is in the glTF file in order to understand where the definition is for that ID reference. But even here, if the type of reference were handed to me by the file, I would still need to check if what the file said it was referencing was really allowed to be referenced from the place that referenced it. I wouldn't want to blindly follow invalid references. So I would still need all this code just to check if the thing being asked for was allowed to be asked for from the place that asked for it.
And to be clear, my VSCode example is an outlier, because it's a text editor trying to generically navigate definitions in a document. Real glTF viewing code doesn't have this problem at all, because there is no centralized location that will process these references. In Cesium for example, the code that handles nodes is already separate from the code that handles materials, etc. So each section of code is just hard-wired to process references to the correct thing for that section. Hence there are no large switch or if/else blocks surrounding this.
I wrote too much here, but I think the current reference system is fine for what it is.
Merits of consistency aside, as you can imagine this would be a pretty substantial change..
I agree it would be a substantial change and I know people won't like to hear it but I think glTF should not be encumbered by backward compatibility until it gets things further along. I don't think it should stick to backward compatibility just for the sake of compatibility. It's missing lots of features needed to display a full machinima style scene which I think should be the goal. A goal like that brings to light what's missing. I'd much rather see that goal be the thing to judge by, not that some previous decision prevents that goal in the name of "it's a substantial change".
There are plenty of warts in the current spec. Fixing those warts will required non-backward compatible changes (assuming we can come to consensus they are warts). Using quaternions for rotation is one such wart IMO as it prevents various kinds of animated scenes from being reproduced.
So I think generic pointers should be judged not on backward compatibility but one whether or not they'll be better in the long run.
Personally I think they would because an entire part of all future additions no longer has to be specced. Right now if you want one object to point to another you have to spec where the other object is and how to find it. So for example root.nodes[n].children
points to root.nodes[index]
and root.scene.nodes
also points to root.nodes[index]
. root.animations[n].channels[n].sampler
points to root.animations[n].animations[index]
etc.
It's arguably a mess.
All that goes away with a generic pointer system. What they reference is documented directly in the format. Now it's willy nilly having some objects reference things in the root, some reference things by inconsistent english puralization, some by local not by root but some arbitrary piece of some other object. And each time new data is added some other yet arbitrary way to get to the referenced object has to be specced.
By using a generic pointer system it gets much clearer, all objects that need a reference go in arrays at the root, references then point to those root objects. No need to decide on a per new feature basic which new random way the data will be added, at least in this one area.
Using quaternions for rotation is one such wart IMO
fwiw I don't consider the quaternions a wart 😉
Although staying on topic here, one thing that's kinda frustrated me was the use of json dictionary objects and their incompatibility with some serializers (I'm looking at you Unity <.< lol)
How much of a change are we talking about, does it work with Unity's Json serializer, and will that mean I need to spend another few weeks learning the new paradigm to and re-implement it in my gltf builder?
idk how this works in javascript, but it's fairly easy to resolve references in my c# implementation with property accessors.
Once my gltf json has been deserialized I loop through the structure and assign those references to the easily accessible properties. (it's even smart enough to get the reference if for some reason this pass doesn't assign a valid one).
Once the references are setup construction is fairly straight forward.
Using quaternions for rotation is one such wart IMO
fwiw I don't consider the quaternions a wart 😉
Unity considers it a wart. Their animation system defaults to Eulers and as pointed out in the other thread they needed to support Eulers to actually handle animations users were created in external packages. It's certainly a wart that (a) Eulers are not supported since data required to show off an artists intent is lost and (b) if you want to support both Euler (needed to get all animation data) and Quats (because of preferences) then it's a wart that there's no type info on the rotation nor the animation channels telling you what kind of interpolation you need and instead it's inferred.
Once my gltf json has been deserialized I loop through the structure and assign those references to the easily accessible properties.
There are several and growing special cases. Every time a new reference is added you'll need to add code to your referencing fixing function. Why make more work for yourself? I don't get the argument at all.
As far as I can tell the only arguments being put forth for the way it is are (a) we already started that way and (b) people already wrote code. Those seem like invalid arguments to me. The format is far from finished. It seems better to fix the problems now or it will be full of technical debt.
time a new reference is added you'll need to add code to your referencing fixing function.
Nope, it's pretty good about knowing how to handle extra references in the final object. Because once it's in an editable state, I can pretty much do whatever I need to it.
Once I'm about to export that object is when I wouldn't want my references to change, much like when you do any build.
Their animation system defaults to Eulers
Pretty sure unity uses quaternions under the hood, like we discussed in the other thread. In either case, let's try to stay on topic.
... only arguments being put forth for the way it is are (a) we already started that way and (b) people already wrote code...
While I agree those arguments aren't exactly the best, they aren't invalid either.
I agree we should definitely work on getting to a position where the spec is really good, even if it means making a few breaking changes, but we taking still consider those arguments with some grain of salt. People put a lot of time and effort into making what we have is the best it can be. Please respect that.
So would Unity's serializer with with this generic reference system you proposed?
As far as I can tell the only arguments being put forth for the way it is are (a) we already started that way and (b) people already wrote code.
I've not explained myself well, or my point was missed.
Adding reference types to the existing format actually damages the format, because in each case, there is only a single type that is valid in any given situation. 100% of the new information being added to a given file is either hard-coded in the schema, or invalid. You've made life easier for programmers with one particular use-case, while making it much harder on content authors.
@greggman Your extensive expertise is most welcome here, and several of your suggestions in other issues are good ones. But this particular one is a non-starter.
Unity considers [quaternions] a wart. ....
Replied at https://github.com/KhronosGroup/glTF/issues/1515#issuecomment-451691790 to keep things organized.
Adding reference types to the existing format actually damages the format, because in each case, there is only a single type that is valid in any given situation. 100% of the new information being added to a given file is either hard-coded in the schema, or invalid. You've made life easier for programmers with one particular use-case, while making it much harder on content authors.
@greggman Your extensive expertise is most welcome here, and several of your suggestions in other issues are good ones. But this particular one is a non-starter.
I don't understand this POV. I sounds like you're effectively saying all generic serialization systems, the one built into C#, the one built into Unity, the ones I'm guessing built into Java/Swift/Python/Kotlin etc are broken. They all do what I'm suggesting which is they auto-handle references. They'd all have the same issue that you can build a file who's references are wrong yet they still all do it.
As for being harder for content authors what's harder? It seems easier to me as there's less to decide, less to document. I'm not even sure what you mean by "content authors" though. Is that programmers reading the files? That's certainly not harder. Programmers writing the files? No harder. Artists using an exporter? Not harder. Manually writing a file? Ok, that one is harder but arguably no one should be manually writing glTF files any more than they manually write C# serialization files.
Thanks @greggman — I've tagged this issue for consideration in a future release (3.X+). It is not a change we can consider in the 2.X lifecycle, given the breaking changes involved. A 3.X release of glTF is still some time away: I would not expect that to happen this year.
In the meantime, we would welcome further feedback from you or anyone implementing glTF input/output, or examples of formats that have done this successfully. Per @emackey's comments there is still some question on the impact of this change, but we are open to the idea.
Completely on @emackey 's side here - allowing general references to anything everywhere might simplify the deserializer a bit but invites all kinds of unwanted side effects, corner cases, and plain misuse of the feature. So my vote goes to "no. Please no.".
@greggman , your comparison with other serialization systems doesn't really apply because in contrast to JSON most of those other systems are type safe. You can't simply link to an entry in an array of let's say meshes when you expect a node. Your proposed reference system doesn't know about types and you could for example hide an object of one type in an array of other things as long as it kinda duck types to what you expect. Going full JSON Schema 6.0 with the references would mitigate part of that, yeah, but (correct me if I'm wrong) one nice thing about glTF is that you don't need to do so if you just want to load a file. Trying to deserialize an invalid file without validation 'works' in terms of that it'll probably go wrong at one point but that's usually not worse than failing validation. All security critical fields (buffers, etc) are in the data values and binary buffers anyway, and no JSON schema keeps you from getting these wrong.
PS: As it's friday evening and nobody looks at what I'm working on anyway, I looked at my own C# glTF importer and quickly fake-implemented references just to see what the difference is. Turns out, I'd probably save 20 lines of code. Out of >3000. I wouldn't say it's worth it.
@greggman For what it's worth, this did come up on the glTF Working Group call this week, and not everyone was as opposed as me. But the larger roadblock is that the glTF 2.0 schema has been (mostly) locked down since it was released in June 2017, the only exceptions being minor description field updates and clarifications. This is due to its inclusion in various enterprise-scale software (including Windows, Office, Facebook Newsfeed) and these large orgs (whom I don't speak for) generally can't tolerate breaking schema changes without some major version number bump. So, feature additions to 2.0 come exclusively in the form of glTF extensions, which are allowed in the 2.0 schema.
No one on the call wanted to talk about glTF 3.0 anytime this year or next. Our focus is on getting the 2.0 ecosystem smoothed out, and on completing extensions for any features that missed the chance to be included in the original 2.0 release.
Changing the way references work is not something that can be done within the current extension framework, and as such it's unfortunately not even on the table as a possibility within the next year or two at least.
The glTF Object Model implements the requested feature. Generic references to glTF properties are now possible using JSON pointers, such as /nodes/0/translation
to control the local position of glTF node index 0. Note that only the JSON pointers specified in the glTF Object Model or by other extensions are supported.
The KHR_animation_pointer extension allows us to use these properties in glTF animations, and the work-in-progress KHR_interactivity extension allows for using properties in scripts.
I see there is some discussion above about potential difficulty in serializing and deserializing references to glTF properties. In case it helps understand how to implement these things, here's a link to my Godot Engine implementation of KHR_animation_pointer: https://github.com/godotengine/godot/pull/94165
Using quaternions for rotation is one such wart IMO
Note that with Euler angles, you need to specify a rotation order (usually YXZ, but could be others). Quaternions are inherently more engine-agnostic due to this, because there is only one way to interpret the numbers in a Quaternion.
Anyway, this issue can be closed as resolved. The glTF Object Model is not exactly what this issue asked for, but it solves the same use case, except for changing existing glTF references to be this way (which was discussed as not desired).
I've noticed in order to find references lots of custom code is needed. for example a node references a mesh or a skin by custom property. Not only do you need to know that
mesh
andskin
exist, you have to also know thatmesh
referencesroot.meshes
andskin
referencesroot.skins
It seems like it would be beneficial to use a more generic referencing system such that a loader, with zero knowledge of the content of the file, can fix up all the references.
As one example imagine all references used the keyword
!ref
so instead ofit was something line
Now a generic loader and just read the entire tree of the JSON and anytime it sees and object with
!ref
it can automatically resolve the referenceThis would also inform how other things are serialized. If they can be shared they belong in some array at the root which IIRC is how most generic serialzer/deserializers work.
I realize that's a slightly larger format but we're talking about formats where the bulk of the data is really in textures and then geometry and animations. The format above is unlikely to increase space more than a single % and with zlib compression most of those are close enough to each other they'll get compressed into oblivion
The benefit is no knowledge is needed of how to connect a specific reference into its corresponding table. It also suggests that if you want references in some extension you need to add your data in an array at the root which is arguably the right thing to do as it allows sharing references, something that's missing from certain extensions that shouldn't be missing.
As it as as more and more stuff gets added the process of resolving references gets more and more convoluted when taking a cue from serialization libraries this can all be solved in a generic way. That means less code and less knowledge that has to be understood before using.
You could make the refs have paths as in
{"!ref": "animations/2/samplers/3"}
but I'd argue like most serialization formats objects should just all just be at the root and then the hierarchy of objects is in the references so in other words the fact that animations have samplers and channels those should move to root arrays of animsamplers and animchannels or some such thing.I might even suggest that if you can make the ref key optional if you standardize the collection names. As it is right now collection names are following arbitrary "English" plualization rules. "mesh" -> "meshes" but "skin" in "skins". You could instead make a more formal rule. The simplest would to not pluralize in the first place. The next might be just append 's'. Let's assume we didn't pluralize so the array of meshes is just called "mesh". Then, if a reference doesn't have a key use it's property name
Only minor changes are needed to the resolver