KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

Consider removing support for multiple scenes in a single asset #1542

Open donmccurdy opened 5 years ago

donmccurdy commented 5 years ago

Although glTF technically allows multiple scenes, it isn't commonly intended, and frankly I've never seen it used that way. In every known case, the file will contain a single scene — or perhaps none at all, e.g. for a material library — and loaders can consider that scene object a "top-level node" for application use.

The fact that glTF allows more than one scene seems to confuse or surprise users, who generally think in terms of "scene formats" and "model formats"; a multi-scene format isn't a meaningful concept to them. Unless there's a strong need for the feature, I think we could consider disallowing use of multiple scenes in some future version of the spec, and/or discouraging it now.
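The loader convention described above can be sketched as follows. This is a minimal, hand-written illustration (the document and function below are not from the spec or any real loader): the optional top-level `scene` property picks the default scene, and a scene-less asset is treated as pure data storage.

```python
# Minimal glTF 2.0 document with a single scene (illustrative only).
gltf = {
    "asset": {"version": "2.0"},
    "scene": 0,                      # index of the default scene (optional)
    "scenes": [{"nodes": [0]}],      # one scene with one root node
    "nodes": [{"name": "Root"}],
}

def default_scene(doc):
    """Return the default scene, treated as the asset's "top-level node".

    Falls back to scene 0 when the optional `scene` property is absent,
    and to None for scene-less assets (e.g. a material library).
    """
    scenes = doc.get("scenes", [])
    if not scenes:
        return None
    return scenes[doc.get("scene", 0)]

root = default_scene(gltf)
```

A viewer following this convention would instantiate `root` and never look at the other entries of `scenes`, which is exactly the single-scene usage the comment describes.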

lexaknyazev commented 5 years ago

Keep in mind that we have an IBL extension that is defined on the scene.

stevenvergenz commented 5 years ago

For our use case, we treat a scene loaded from a glTF as equivalent to a Unity prefab. So being able to pack multiple variations of a prefab that use the same assets is useful. Admittedly, multi-scene export isn't common, but that might change.

vpenades commented 5 years ago

Actually, my intention was to use multiple scenes as a cheap way of having LOD levels, where each scene is a LOD.

donmccurdy commented 5 years ago

Actually, my intention was to use multiple scenes support as a cheap way of having LOD levels, where each scene is a LOD...

I'd be fully supportive of having an official LOD extension at some point. Microsoft uses one internally, and there's some discussion of bringing that in: https://github.com/KhronosGroup/glTF/issues/1045.

zellski commented 5 years ago

I’m on the fence about this. It’s worthwhile to reduce complexity, and it’s meaningful to look at what people are actually doing — I agree in the vast majority of cases, the concept of “scenes” is more bewildering than helpful.

And yet the concept of being able to embed distinct scene graph roots in one GLB that can take advantage of glTF’s powerful reference mechanic to minimise duplication is really powerful — especially, as we’ve seen here, in conjunction with other extensions.

Some use cases, like LOD, can almost certainly work better at node or mesh level. But nodes are extremely general (and already bound up in complexity such as skeletons). A simple extension that is only meaningful at the root, forced to target nodes because we've removed scenes, would need to somewhat awkwardly communicate this constraint in English in the extension specification, and the constraint couldn't be expressed in a schema.

Perhaps this is too artificial a concern; maybe no great amount of such use cases will ever materialise. But if we think they will, would we consider simply renaming the field? Scene is semantically confusing when the intended use has nothing to do with scenes.

lexaknyazev commented 5 years ago

But if we think they will, would we consider simply renaming the field? Scene is semantically confusing when the intended use has nothing to do with scenes.

Maybe layers?

vpenades commented 5 years ago

the point is that, multiple scenes allow for different use cases:

  • They can represent LODs
  • They can represent skin variations (as in, choose your player's color)
  • They can represent layers
  • They can even represent different contexts, like having one scene for rendering and another for the physics mesh

About complexity, I don't see it adds a lot of complexity, if you want a single scene, you'll always have the scenes collection with just one element, and engines expecting just one scene can rely on the defaultScene.

snagy commented 5 years ago

the point is that, multiple scenes allow for different use cases:

  • They can represent LODs
  • They can represent skin variations (as in, choose your player's color)
  • They can represent layers
  • They can even represent different contexts, like having one scene for rendering and another for the physics mesh

About complexity, I don't see it adds a lot of complexity, if you want a single scene, you'll always have the scenes collection with just one element, and engines expecting just one scene can rely on the defaultScene.

The flexibility of what different scenes can represent is exactly why it shouldn't be in the core spec; all of the things listed here should be explicitly described with extensions.

donmccurdy commented 5 years ago

^This is my concern, too. When a scene could mean anything, it's hard to have consistent implementations. What if there are no scenes at all, just nodes? (https://github.com/donmccurdy/three-gltf-viewer/issues/132) LODs and variants are features that probably should be in the glTF spec or extensions at some point; in fact I think LODs are pretty important. On (3) and (4) I don't have an opinion, but if they are needed, I would prefer to see an extension proposed so that the implementation is fully specified.

On the other hand, there isn't anything to do here in the 2.X lifecycle. I don't mind waiting it out and seeing how things look down the road. If multiple scenes can be used in different ways but aren't causing complexity/inconsistency that might be fine.

vpenades commented 5 years ago

@snagy In terms of compatibility when displaying a glTF model... what's the difference between having additional scenes whose meaning is known only to the producer, and having custom extensions that only the producer understands?

To some degree, glTF is designed so that if a vendor/developer requires a certain feature, it can add it with extensions, most probably breaking compatibility with the rest of the world along the way. If a vendor chooses to use additional scenes for whatever purpose it sees fit, I don't see why that's different from choosing extensions.

I think the difference here is whether the gltf file is going to be used for interchange, where compatibility is a must, or for internal consumption, where you can do whatever you want with the format.

@donmccurdy I agree with you, LODs are important enough to be part of the main spec.

donmccurdy commented 5 years ago

To some degree, gltf is designed so if a vendor/developer requires a certaing feature, it can do so with extensions, most probably breaking compatibility with the rest of the world along the way...

Extensions are explicitly designed so that they can be ignored by tools that don't understand them, and the extension author has some control over fallback behavior. For example, a well-designed LODs extension might fall back to showing the highest LOD when the extension isn't recognized, whereas a Layers extension might fall back to showing all layers. That control isn't possible when using a generic feature like Scenes for multiple distinct purposes.
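The fallback behavior described above can be sketched concretely. The extension name `VENDOR_lod` and its `levels` layout below are invented for illustration (no such extension is defined in this thread); the point is that a client which doesn't recognize the extension still gets a sensible result, the highest LOD, because the plain `mesh` property remains valid on its own.

```python
# A node carrying a hypothetical LOD extension. The base "mesh" property
# is the natural fallback for clients that ignore the extension.
node = {
    "mesh": 0,  # highest-detail mesh
    "extensions": {
        "VENDor_lod" if False else "VENDOR_lod": {  # illustrative name
            "levels": [{"mesh": 1, "distance": 10.0},
                       {"mesh": 2, "distance": 50.0}],
        },
    },
}

def select_mesh(node, supported_extensions, distance=0.0):
    """Pick a mesh index, ignoring the LOD extension when unsupported."""
    lod = node.get("extensions", {}).get("VENDOR_lod")
    if lod is None or "VENDOR_lod" not in supported_extensions:
        return node["mesh"]            # graceful fallback: highest LOD
    mesh = node["mesh"]                # start at full detail
    for level in lod["levels"]:
        if distance >= level["distance"]:
            mesh = level["mesh"]       # far enough away: drop detail
    return mesh
```

By contrast, when multiple scenes are used for LODs, a viewer that doesn't know the producer's convention has no way to recover this fallback; it just sees several equally plausible scenes.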

snagy commented 5 years ago

Echoing what @donmccurdy said, a custom extension tells everyone who doesn't understand it that they can safely ignore that data.

The spec doesn't really define what engines should do with multiple scenes, which I think is a good argument for removing support for multiple scenes. If we don't have a good reason why they should exist, remove them.

zellski commented 5 years ago

I agree with what @donmccurdy and @snagy say here, but again, this is an issue of nomenclature more than expressive power. Yes, a 'scene' sounds like it could be anything. However, the concept of a "scene graph root" among several still seems like the kind of expressively powerful concept that the extensions you mention would wish to lean on in order to be as schematically explicit as possible.

A model variant extension, for example, may well wish to reference what is currently known as a scene, specifically to limit the scope of its ambition to "You can switch out entire scene graphs with this extension, but you certainly can't expect to switch out any sub-graph rooted in any node whatsoever."

Perhaps this is an edge case, but I still think it's unfair to say that scenes should really be extensions, when the extensions we'd want to build could make such good use of the functionality that scenes provide.

I still think we should rename them.

rainclaws commented 5 years ago

When a LOD is defined at the scene level, it would mean every property of the scene should be able to change based on the LOD level, in a predefined way. That includes the camera and such. Can anyone really argue in good faith that all of the properties should (or even just can) have predefined rules that fit every use case?

I can see the camera being defined as the point where LOD would apply, or a change in angle after a zoom level, and so on. These would be meanings attached at the runtime level, however. A lot of different definitions can make sense for a specific program, but not consistently so across all kinds of different things.

Same goes for skins and animations. An animation that is updated less frequently, and fewer details for the skin, both make sense, but the way these apply and the point at which they apply would be different, decided by the user of the glTF.

It seems to me, glTF can't decide between being a container format and a pseudo-level format.

If it is a container format, different player colors could be different materials which the software itself would attach meaning to. If it is a level format, there would be no end to this. The use cases are often similar, but engines do things in vastly different ways.

Instead of a LOD extension, perhaps a children extension could be defined? I.e. a specific form of child node that can't itself have further children. One could then use a custom property like this to define the LOD level for a mesh, the player color for a material, and so on.

donmccurdy commented 5 years ago

@rainclaws there is no LOD extension yet. If you feel that LODs at the scene level would be a bad idea (which is a fair concern) feel free to comment on https://github.com/KhronosGroup/glTF/issues/1045.

vpenades commented 5 years ago

I've been giving some thought to scenes, and I found at least ONE case in which they might prove useful, although I really doubt anybody will ever use it this way:

Say you have 10 glTF models, each with just one scene. When an application loads them, it will most surely create independent hardware resources for each model/scene. So when rendering all these models, the graphics engine needs to switch resources/contexts/buffers all the time.

Now, if we could batch the 10 scenes of the 10 glTF models into a single glTF file (with 10 scenes inside), a good tool could optimize the scenes so all of them share the same resources — for example, all meshes sharing the same vertex and index buffers, so all the scenes could be rendered without switching index/vertex buffers.

Or for example, having multiple scenes with subtle differences, where 90% of the resources are shared. For example, scene 0: a pristine spaceship, and scene 1: a half destroyed spaceship. In this case, the second scene could reuse resources also used by scene 0.

An even more extreme case: let's say you have meshes for LEGO pieces; you could have multiple scenes reusing the same mesh pieces to build different LEGO models with very small memory overhead per scene.

So, I believe multiple scenes in a glTF file are all about resource sharing. Another matter is whether toolsets will ever take advantage of resource sharing.
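The resource-sharing idea above (pristine vs. damaged spaceship) can be made concrete with a small sketch. The asset below is invented for illustration: two scenes reference different root nodes, but both graphs point at the same `hull` mesh, so a smart importer only needs one copy of it on the GPU.

```python
# Two scene variants sharing mesh resources (illustrative asset).
gltf = {
    "scenes": [{"nodes": [0]}, {"nodes": [1]}],
    "nodes": [
        {"mesh": 0, "children": [2]},   # scene 0: hull + intact wing
        {"mesh": 0, "children": [3]},   # scene 1: same hull + broken wing
        {"mesh": 1},
        {"mesh": 2},
    ],
    "meshes": [{"name": "hull"}, {"name": "wing"}, {"name": "wing_broken"}],
}

def meshes_in_scene(doc, scene_index):
    """Collect the set of mesh indices reachable from one scene."""
    out, stack = set(), list(doc["scenes"][scene_index]["nodes"])
    while stack:
        node = doc["nodes"][stack.pop()]
        if "mesh" in node:
            out.add(node["mesh"])
        stack.extend(node.get("children", []))
    return out

# Meshes referenced by both scenes only need to be uploaded once.
shared = meshes_in_scene(gltf, 0) & meshes_in_scene(gltf, 1)
```

Here `shared` contains the hull mesh, which is exactly the 90%-shared-resources case described above, expressed through the core spec's existing reference mechanic rather than any extension.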

donmccurdy commented 5 years ago

Related: Batched 3D (glTF) models.

Another matter is whether toolsets will ever take advantage of resource sharing.

This, at least, is happening already. In three.js, for example, scenes that reuse the same meshes or materials will not duplicate the geometry or textures in GPU memory. I haven't seen that used with multiple glTF scenes, specifically, but the feature is important regardless for multiple meshes in a single glTF file.

zellski commented 5 years ago

An even more extreme case: lets say you have meshes for LEGO pieces, you could have multiple scenes reusing the same mesh pieces to build different lego models with very small memory overhead on every scene.

This isn't hypothetical; precisely this scenario has existed for at least two years. I've seen at least one model made from hundreds of LEGO bricks that made dramatic use of mesh sharing. (It was less awesome that it needed hundreds of draw calls.)

In general, resource sharing is utterly critical, and I think the reality that tools have not yet quite tackled it is just a matter of (human) resource allocation. Many of our current plans (especially in the context of asset variations, https://github.com/KhronosGroup/glTF/issues/1569) revolve around an assumption that multiple references to a certain glTF object turn into multiple references in the engine, and that it's handled intelligently GPU- and CPU-wise.

donmccurdy commented 5 years ago

I think the reality that tools have not yet quite tackled it is just a matter of (human) resource allocation

Agreed. More specifically, I'd say that client implementations have already implemented it, but there are relatively few tools for optimizing an existing glTF asset for resource allocation. The few I'm aware of are:

Related – the Blender addon will likely support multiple scenes in exports and imports: https://github.com/KhronosGroup/glTF-Blender-IO/issues/619.

spiraloid commented 4 years ago

I work with .blend files that have multiple scenes in them. On export, I expect only the active scene to export; having the other scenes partially export is gumming up the works for me. If it just exported the active scene and its objects, I could then write some Python to cycle through the other scenes and export them if need be.

aaronfranke commented 1 year ago

I'd like to make a radical suggestion that takes this proposal a step further: Consider removing support for multiple root nodes per scene, in addition to multiple scenes per file.

Problem

I do not have a full list of how implementations treat root nodes, but let me use Unity as an example. In Unity when you import a 3D model file (like glTF, FBX, etc), it is treated similarly to a Unity prefab. When you instantiate that in the scene, it is represented by one parent GameObject (having one parent is not just a Unity thing: in all engines, if you want to be able to do things like set the transform of an entire prefab / glTF scene at once, you need one root node). This GameObject would have child nodes for the glTF root nodes. So, this results in a few problems:

  1. The glTF root nodes are not really root nodes of the scene, they are children of the real root node that Unity generates to represent the scene, breaking the meaning of "scene root node".
  2. The glTF file does not have a way to place data on this real root node. This is important for:
    • Physics: It is expected that the root node of a physics object defines the motion (ex: Rigidbody component), so that all child nodes move together with the physics object.
    • Abstraction: You may want to expose properties that can be changed per instance in the inspector on the root node, so that editing child nodes is not required. But if the best you can do is place properties on the glTF root (child of the real root), then those properties must be edited on children of the scene root.
    • While you could specify an extension on an entry in the "scenes" array, this would be really weird to do since it would duplicate the data and logic already present on nodes.
  3. If an implementation were to detect a single glTF root node and import that as the real root node, that would result in an unexpected change in document structure.
    • If you take a scene with one root node and add a second root node, the first root node would move down a level, which would likely be an undesired inconsistency.
    • If you have a scene where the single glTF root node has transform properties, the behavior would change when setting the scene's transform. If the scene was imported with the glTF root as a child of the real root, the glTF root's transform will be kept applied together as a child transform. If the glTF root was the real root, the glTF root's transforms would be overwritten when setting the transform.
    • Neither of these problems is acceptable, so it's not appropriate for an importer to automatically import a glTF root node as the real root node as long as the glTF spec permits multiple root nodes.

In my opinion, multiple root nodes in a scene is a feature better described by the glXF standard, which combines multiple glTF files into a single scene. A glXF file would map to a Unity scene instead of a Unity prefab. It seems that the concept of "multiple root nodes" in glTF brings multiple downsides with few upsides.

The above uses Unity as an example, but the same thing applies to multiple engines. In (almost?) every case you want a single root node, so that you can alter the transform of the entire glTF scene "prefab" at once. Since glTF is a format focused on last-mile delivery and prioritizes compatibility with importers over exporters,^1 it makes sense to me to require that all glTF files contain only one root node and have it used as the real root node, so that glTF files can specify information on the real root node. While this may (very slightly!) complicate exporters from software that does work with multiple root nodes, the fact that importers require a single root node for operations as trivial and vital as setting the position of an instanced scene tells me it's the right way to go.
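The wrapping behavior described in the problem list can be sketched as follows. The `GameObject` class and `instantiate_scene` function are invented stand-ins for what a Unity-style importer does, not real engine API: the engine generates one "real" root and parents every glTF scene root under it, which is why the glTF file itself has nowhere to put root-level data.

```python
# Sketch of the Unity-style import pattern described above
# (class and function names are illustrative, not engine API).
class GameObject:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.children = name, parent, []
        if parent is not None:
            parent.children.append(self)

def instantiate_scene(doc, scene_index=0):
    """Wrap a glTF scene's root nodes under a single generated root."""
    real_root = GameObject("SceneRoot")       # generated by the engine
    for i in doc["scenes"][scene_index]["nodes"]:
        name = doc["nodes"][i].get("name", f"node_{i}")
        GameObject(name, parent=real_root)    # glTF roots become children
    return real_root

doc = {"scenes": [{"nodes": [0, 1]}],
       "nodes": [{"name": "Body"}, {"name": "Wheels"}]}
root = instantiate_scene(doc)
```

After import, `Body` and `Wheels` are children of `SceneRoot`, so a Rigidbody or per-instance property placed on `SceneRoot` has no corresponding glTF node to be stored on, which is problem 2 above.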


Solution

So, my proposal would be this: Remove the concept of scenes entirely. Have there be only one root node per glTF file, defined as the node at index 0. There would be only one scene per file, with one root node: the node at index 0.[^3]

Interestingly, we could implement this proposal in glTF without breaking compatibility by requiring all (ex: glTF 2.1) assets contain "scene": 0, "scenes": [{ "nodes": [0] }] for compatibility with existing glTF 2.0 importers, but this requirement could be removed in the future during a compatibility breakage (ex: glTF 3.0).
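A validator for the proposed compatibility shape could look like the sketch below (my own illustration of the rule quoted above, not part of any spec): an asset passes if it declares exactly `"scene": 0, "scenes": [{"nodes": [0]}]`, and node-less assets (the mesh/material-storage edge case from the footnote) are exempt.

```python
# Check whether an asset already satisfies the proposed single-root
# shape: "scene": 0, "scenes": [{"nodes": [0]}]. Illustrative only.
def is_single_root(doc):
    scenes = doc.get("scenes", [])
    if not doc.get("nodes"):          # mesh/material-only storage:
        return not scenes             # no nodes means no scene at all
    return (doc.get("scene", 0) == 0
            and len(scenes) == 1
            and scenes[0].get("nodes") == [0])
```

Existing glTF 2.0 importers would accept such files unchanged, since the single scene and single root are ordinary spec-conformant data; only the restriction is new.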

Alternatively, instead of changing glTF, we could implement this as an extension that contains no data but imposes the restriction of a single scene with a single root node and instructs the importer to import the glTF root node as the "real" root node of the scene. This solves the problems mentioned in points 1 and 2 above, and avoids the problems that arise from doing this automatically as mentioned in point 3 above. I have written an extension for Godot that does this, GODOT_singleroot https://github.com/KhronosGroup/glTF/pull/2329. This has been implemented in Godot 4.2 and later for all the previously mentioned reasons, plus it allows us to round-trip a glTF file without generating an extra root node. If other vendors find this useful, I would love to move it to the `EXT_` or `KHR_` prefix.

[^3]: Except for the edge case of glTF files without any nodes, such as glTF files that are only used as mesh or material storage. In this case, there is no scene and no root node.

naomijub commented 10 months ago

I might be late to the party, but I do use multiple scenes in a similar fashion to what was described before. What I would love to have is named scenes instead of numbered scenes.