KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

Rigid Body Physics layout proposal #1307

Open vpenades opened 6 years ago

vpenades commented 6 years ago

Currently, I am trying to implement my own glTF loader, and the first thing I've realized is that, as of today, things are already very complicated, especially when dealing with extensions.

I was thinking about the issue of supporting rigid body physics (#1135), and I got a severe case of cold sweats just thinking about how much more complicated the schema would become. I really don't like that prospect.

Thinking about a potential solution, I came up with this idea: instead of integrating rigid body content within the glTF schema, what about creating a completely new schema for physics? Maybe plTF?

So, in essence, the current glTF schema and files would remain as they are, and a new, independent plTF schema would be used to process plTF files. A model with rigid body physics would look like this:

bigcat.gltf
bigcat.pltf
bigcat.bin
bigcat.texture0.png
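As a rough illustration of the sidecar layout above, a loader could derive the physics file name from the glTF file name and treat it as optional. This is only a sketch: the `.pltf` extension and the same-stem naming rule come from the example listing, and the function names `sidecar_path` and `find_physics_sidecar` are made up for illustration, not part of any specification.

```python
from pathlib import Path
from typing import Optional


def sidecar_path(gltf_path: str) -> Path:
    """Map 'bigcat.gltf' to 'bigcat.pltf' (assumed naming convention)."""
    return Path(gltf_path).with_suffix(".pltf")


def find_physics_sidecar(gltf_path: str) -> Optional[Path]:
    """Return the sibling .pltf file if it exists on disk, else None.

    A graphics-only application would simply never call this function.
    """
    candidate = sidecar_path(gltf_path)
    return candidate if candidate.is_file() else None
```

A viewer would load `bigcat.gltf` alone, while a game would call `find_physics_sidecar("bigcat.gltf")` and hand the result to its physics loader when one is found.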

This has a number of advantages:

Hurdles:

Finally: by no means am I an expert in rigid body simulation. I'm mostly on the graphics side, and precisely because I'm biased towards graphics, I don't want to deal with physics stuff within the glTF schema. I'm sure rigid body experts feel the same way in reverse, and would love to have their own clean schema.

pjcozzi commented 6 years ago

Interesting idea. I do not know much about this space, but will certainly help spread the word to get some comments here.

ghost commented 6 years ago

I know nothing at all. That said, the current BLENDER_physics extension reminds me of how Bullet will straight up load physics data out of a Collada file. That is the way. However, this new way would seem to be more efficient to load in cases where physics data and visual data don't really need to cross-reference each other. At the same time, it also reminds me of the common practice of splitting mesh data from material data or mesh data from animation data, and glTF puts it all into one file. So the question then becomes, is physics relevant to most glTF applications?

vpenades commented 6 years ago

@fluffrabbit That's the point: physics is relevant to only a few applications, and in many cases, these applications will use physics files specific to their physics engines.

glTF has been designed on the back of years of experience dealing with existing 3D file formats such as Wavefront OBJ, Collada, FBX, and countless others.

But we don't have that kind of background with physics file formats. It's unknown territory, and even if we agree on a final spec (which will take lots of updates and revisions), we might end up finding that it is useless for a number of scenarios.

So, in my humble opinion, getting a specification right the first time is going to be difficult, and I think it's better not to clog glTF with something so volatile.

ghost commented 6 years ago

@vpenades I'm currently engaged in correspondence with a few people about this very issue in so far as it pertains to open source games (not to snub FBX or anything). It seems that on one end of the spectrum there are more specific formats such as MD5 and IQM that are designed around ragdoll character models, on the other end there is glTF which currently lacks a physics specification, and in the middle there are Collada and X3D which specify rigid body physics in ways which are applicable to common contemporary physics engines.

From the perspective of new projects, it's nice to be able to export physics in some way, and your proposal seems to provide for that. I couldn't help but notice that the physics data doesn't get its own binary block, or wait a second...

In the case of GLB, which is appealing in that it's more homogeneous and potentially useful for C/C++ users, the metadata part is binary just like the bigger data part. With this proposal, I guess the object collision data just goes in the JSON/PLB? Honestly, I'd suggest the same thing for all the visual geometry as well. Why split it between JSON and binary? That just complicates things. But now that glTF has been twice standardized, why treat physics data differently?

I guess you're implying that physics data is less precise, which is not always true. Maybe in the future the GPUs stay as they are and we're only pushing 300k tris with normal maps while the physics data is 10x more precise and processed on the physics processing unit. Extreme edge case, but not impossible.

In any case, I like any and all physics proposals because I think glTF could benefit.

vpenades commented 6 years ago

@fluffrabbit If we had a robust physics specification, I would agree that physics should fit within the glTF schema, but that is not the case, and I don't expect it to be for a long, long time.

I couldn't help but notice that the physics data doesn't get its own binary block, or wait a second...

It does:

I can't see any technical problem with GLB files containing a glTF file, a plTF file and the binary blob. If a plTF file needs to access the binary blob, it can do so the same way as glTF, by reimplementing the Buffer and BufferView schemas.
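To make the GLB idea concrete, here is a minimal sketch of a container holding the glTF JSON, the shared binary blob, and a third chunk for the physics JSON. The header layout and the `JSON`/`BIN` chunk types follow the GLB 2.0 container format, which also tells readers to ignore chunk types they do not recognize. The `PLTF` chunk type and the two function names are assumptions invented for this example; no such chunk is registered anywhere.

```python
import json
import struct

GLB_MAGIC = 0x46546C67    # ASCII "glTF"
CHUNK_JSON = 0x4E4F534A   # ASCII "JSON"
CHUNK_BIN = 0x004E4942    # ASCII "BIN\0"
CHUNK_PLTF = 0x46544C50   # ASCII "PLTF": hypothetical, not a registered type


def _pad(data: bytes, fill: bytes) -> bytes:
    """Pad a chunk payload to the 4-byte boundary GLB requires."""
    return data + fill * (-len(data) % 4)


def write_glb(gltf: dict, binary: bytes, pltf: dict) -> bytes:
    """Build a GLB with JSON and BIN chunks first, then the physics chunk."""
    chunks = [
        (CHUNK_JSON, _pad(json.dumps(gltf).encode("utf-8"), b" ")),
        (CHUNK_BIN, _pad(binary, b"\x00")),
        (CHUNK_PLTF, _pad(json.dumps(pltf).encode("utf-8"), b" ")),
    ]
    body = b"".join(struct.pack("<II", len(d), t) + d for t, d in chunks)
    # Header: magic, container version 2, total length including the header.
    return struct.pack("<III", GLB_MAGIC, 2, 12 + len(body)) + body


def read_chunks(glb: bytes) -> dict:
    """Return {chunk_type: payload}; callers pick the types they understand."""
    magic, version, length = struct.unpack_from("<III", glb, 0)
    if magic != GLB_MAGIC or version != 2 or length != len(glb):
        raise ValueError("not a GLB 2.0 container")
    chunks, offset = {}, 12
    while offset < length:
        size, kind = struct.unpack_from("<II", glb, offset)
        chunks[kind] = glb[offset + 8 : offset + 8 + size]
        offset += 8 + size
    return chunks
```

A graphics-only reader would look up `CHUNK_JSON` and `CHUNK_BIN` and never touch the third chunk, while a physics-only reader would do the opposite.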

What I am trying to prevent here is the Collada effect: it wanted to solve so many problems at once that it ended up solving none.

Collada evolved a schema so bloated that there's not a single Collada importer/exporter able to cover all of it, leading to countless inconsistencies and incompatibilities.

Also, standards based on schemas with automatic code generation have a lot of limitations, and in the end, if you want to do something useful you need to write the code by hand, which can only be done if the schema is kept to a manageable size. Developers keep using Wavefront OBJ not because it has a lot of features, but because writing a Wavefront OBJ parser is simple enough.

In the end, the standards that survive over time are the ones that solve problems in the simplest possible way, not the ones that keep changing their specification. glTF is already mature and solves a problem, so let's freeze it. Physics is a different problem and needs to be addressed separately.

emackey commented 6 years ago

Actually the Khronos 3D Formats Working Group has in the past expressed interest in a physics extension being developed for glTF, and there shouldn't be any need to store such data externally.

As a general rule of thumb, if you have a small number of values per-node or per-material, such as the physics "mass" of a mesh or a "wind vector" on a node, your best bet is to store these as numbers or vectors within the JSON of your physics extension itself. Something like:

"nodes": [
    {
        "name" : "wind",
        "extensions" : {
            "EXT_physics_example" : {  // or whatever you want to call it
                "windVector" : [ 0.0, 5.0, 0.0 ]
            }
        }
    }
]

But, if you have per-vertex data or other large quantities of data, you can place that in a glTF accessor, such that the actual data is in the binary portion of glTF. The accessor could be referenced from the associated mesh (see the _TEMPERATURE example). This saves space for transmission, and formats the data in a GPU-friendly way for a vertex shader attribute. But it carries a small quality loss: JSON numbers are plain-text parsed into 64-bit doubles, while accessor data is 32-bit floats for GPUs.
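The precision trade-off mentioned above can be checked with a short sketch. It round-trips a value through 32-bit storage, as a float accessor would, and packs a few per-vertex scalars into bytes. The sample values and the helper names `to_float32` and `pack_scalars` are illustrative assumptions only.

```python
import struct


def to_float32(value: float) -> float:
    """Round-trip a 64-bit Python float through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def pack_scalars(values: list) -> bytes:
    """Pack per-vertex scalars as little-endian float32, 4 bytes each."""
    return struct.pack(f"<{len(values)}f", *values)


mass = 0.1                    # what a JSON parser hands back (64-bit double)
stored = to_float32(mass)     # what survives in a 32-bit float accessor
error = abs(stored - mass)    # small but non-zero rounding error

temperatures = [20.5, 21.25, 19.75]   # hypothetical per-vertex values
payload = pack_scalars(temperatures)  # 12 bytes for three vertices
```

Values such as 20.5 survive unchanged because they are exactly representable in 32 bits, while 0.1 picks up a small rounding error. That is the "small quality loss" in practice.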

A properly formatted glTF extension can be marked optional, so that unaware readers are able to safely ignore the extra data, even in GLB form.
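In glTF 2.0, the optional/required distinction is carried by the top-level `extensionsUsed` and `extensionsRequired` arrays: an extension listed only in `extensionsUsed` may be ignored by a reader that does not know it. A loader's check might look like the sketch below, where `EXT_physics_example` is the placeholder name from the snippet above and the function name is an assumption.

```python
def unsupported_required_extensions(gltf: dict, supported: set) -> list:
    """List required extensions this loader cannot handle.

    An empty result means the asset is safe to load; extensions that appear
    only in "extensionsUsed" are optional and can be skipped silently.
    """
    required = gltf.get("extensionsRequired", [])
    return sorted(name for name in required if name not in supported)


optional_asset = {
    "asset": {"version": "2.0"},
    "extensionsUsed": ["EXT_physics_example"],
    # No "extensionsRequired" entry, so the physics data is optional.
}

strict_asset = dict(optional_asset, extensionsRequired=["EXT_physics_example"])
```

A graphics-only loader with an empty `supported` set would accept `optional_asset` and reject `strict_asset`.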

emackey commented 6 years ago

So, just to clarify, my comments above are just to show different ways a glTF extension can integrate itself into a model, and I didn't intend to convey anything about how a real physics extension should be laid out internally.

There was a comment, apparently now deleted, that mentioned that the physics mesh is often completely separate from the visual mesh, among other issues. Although I know very little about physics implementations, I still suspect there should be advantages to using the same file:

ghost commented 6 years ago

Any project which uses physics will probably use graphics (but not the other way around), and JSON parsers are trivial to integrate, so since JSON parsing is always available then any related arguments don't mean anything one way or the other. Anything I said to the contrary was based on a lack of knowledge of the glTF spec, which is always JSON + binary.

From a performance perspective, the location and storage of any JSON-based data (graphics or physics) doesn't really matter because there shouldn't be too much JSON data. JSON is way slower to parse than binary, and any non-trivial information should be binary (the big data).

The problem isn't so much performance as compliance. The glTF spec document is a whirlwind of information, winding paths leading to and fro. Just thinking about it makes me check myself to make sure I'm not seeing things; it reminds me of the cosmology-vs-religion question of where the buck stops. There are scenes which need nodes which need meshes which need primitives which need bufferViews which need buffers. The buck stops with buffers, right? At this point I'm not 100% sure; it's like learning OpenGL all over again.
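The chain described above can be followed in a few lines, and the buck does stop at buffers. Note that glTF places one more link, the accessor, between a primitive and its bufferView. The sketch below is deliberately narrow: it assumes the first node of the default scene has a mesh, that its first primitive has a `POSITION` attribute stored as float `VEC3`, and that the buffers are already loaded into memory as bytes. The function name is an assumption.

```python
import struct

FLOAT = 5126  # glTF componentType code for 32-bit float


def first_mesh_positions(gltf: dict, buffers: list) -> list:
    """Walk scene -> node -> mesh -> primitive -> accessor -> bufferView -> buffer."""
    scene = gltf["scenes"][gltf.get("scene", 0)]
    node = gltf["nodes"][scene["nodes"][0]]
    mesh = gltf["meshes"][node["mesh"]]
    primitive = mesh["primitives"][0]
    accessor = gltf["accessors"][primitive["attributes"]["POSITION"]]
    if accessor["componentType"] != FLOAT or accessor["type"] != "VEC3":
        raise ValueError("this sketch only handles float VEC3 positions")
    view = gltf["bufferViews"][accessor["bufferView"]]
    data = buffers[view["buffer"]]  # raw bytes: the end of the chain
    start = view.get("byteOffset", 0) + accessor.get("byteOffset", 0)
    stride = view.get("byteStride", 12)  # tightly packed: 3 floats * 4 bytes
    return [
        struct.unpack_from("<3f", data, start + i * stride)
        for i in range(accessor["count"])
    ]
```

Seven lookups for one triangle is a fair illustration of why the indirection feels heavy at first, even though each step is a plain array index.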

tl;dr simplicity matters and it's not simple right now

vpenades commented 6 years ago

@emackey yes, the comment was posted by me. I didn't delete it; apparently GitHub is having issues, and when I tried to edit my comment it disappeared altogether. I wanted to rewrite it, but now it's been more or less answered...

Certainly, as you say, having everything in a single file has advantages, especially by reusing some glTF structures and internal bindings. But as @fluffrabbit says, this is a double-edged sword: with so many interdependencies, the whole structure becomes so complex that it might end up being almost unusable. This is why I talked before about the Collada effect; what's the point of proposing a spec so powerful, yet so complex, that almost nobody is able to write proper tools and exporters for it?

The core of my proposal lies in keeping things as simple as possible by applying divide and conquer, for the sake of our own sanity.

@fluffrabbit you can have projects using physics without graphics. As I stated in my first post, game servers might run the physics without graphics. But there are more uses, like serious/scientific simulation, where graphics will be irrelevant.

@fluffrabbit tl;dr simplicity matters and it's not simple right now

That's exactly my point: the spec is already very complex, and adding physics will make it even more complicated. It also adds more constraints to the spec, which can compromise the addition of future improvements.

vpenades commented 6 years ago

I took the liberty of writing a sort of pseudocode of what a physics structure looks like, based on my outdated experience and loosely taken from a bepuphysics ragdoll definition:

Entities
    Entity "Pelvis" absPosition= matrix Shape="PelvisShape"         
    Entity "Torso" absPosition= matrix Shape="TorsoShape"
    Entity "Neck" absPosition= matrix Shape="NeckShape"
    Entity "Head" absPosition= matrix Shape="HeadShape"
Entities

Constraints
    SwivelHingeJoint "Pelvis" "Torso" relPosition = 0,3,0 Softness=0.5 TwistLimits=xyz
    BallSocketJoint "Torso"   "Neck" relPosition = 0,2,0 Softness=0.5 TwistLimits=xyz
    BallSocketJoint "Neck"    "Head" relPosition = 0,2,0 Softness=0.5 TwistLimits=xyz 
Constraints

Shapes
    ConvexHull "PelvisShape"
        Plane xyzw "Glass"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
    BSP "TorsoShape"
        Plane xyzw
        PlaneFrontShape "TorsoShape1"
        PlaneBackShape "TorsoShape2"
    ConvexHull "TorsoShape1"
        Plane xyzw "Metal"
        Plane xyzw "Glass"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
    ConvexHull "TorsoShape2"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
        Plane xyzw "Metal"
        Plane xyzw "Glass"
        Plane xyzw "Glass"
    Capsule "NeckShape" xyz xyz radius "Rubber"
    Sphere "HeadShape" xyz radius "Metal"
Shapes

Surfaces
    Surface "Metal"
    Surface "Rubber"
    Surface "Glass"
Surfaces

So let's dissect it and see how different the structure is from a traditional glTF:

Instances (the Entities in the pseudocode above) loosely match glTF nodes, but unlike in glTF they are plain collections; there are no parent-child relations.

Constraints define how Instances are connected to each other, and the degrees of freedom of movement between two Instances. Constraints are, by definition, two-way bindings.

In physics, the geometrical shapes are defined in a wildly different way than in visuals.

In some cases, collections of ConvexHulls might be created programmatically with mesh decomposition tools. In that case, you probably need to set surface materials for every Plane in every convex hull, BUT you cannot split convex hulls per surface material as you do with graphics.

So, beyond Buffer, BufferView and Accessor, there's almost nothing in the current glTF schema that could be reused efficiently for physics. You need a whole new set of structures and objects that, as with glTF, will be quite complex by themselves.
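The structural difference can be sketched in a few lines: a flat list of entities plus a list of two-way constraints forms an undirected graph, not a tree. The class and field names below mirror the pseudocode loosely and are assumptions for illustration, not a proposed schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    shape: str  # name of an entry in the Shapes collection


@dataclass(frozen=True)
class Constraint:
    kind: str   # e.g. "BallSocketJoint"
    a: str      # name of the first connected entity
    b: str      # name of the second connected entity


def neighbours(constraints: list) -> dict:
    """Build an undirected adjacency map; each constraint binds both ways."""
    graph = defaultdict(set)
    for c in constraints:
        graph[c.a].add(c.b)
        graph[c.b].add(c.a)
    return graph


entities = [
    Entity("Pelvis", "PelvisShape"),
    Entity("Torso", "TorsoShape"),
    Entity("Neck", "NeckShape"),
    Entity("Head", "HeadShape"),
]
constraints = [
    Constraint("SwivelHingeJoint", "Pelvis", "Torso"),
    Constraint("BallSocketJoint", "Torso", "Neck"),
    Constraint("BallSocketJoint", "Neck", "Head"),
]
ragdoll = neighbours(constraints)
```

Nothing in `entities` says which body is the parent of which; the only relationships live in `constraints`, and they read the same in both directions.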

Even from the perspective of JSON, it makes sense to split the schema:

Let's say I have two developers, one is writing a pure graphics engine, the other one is writing a physics engine. Each engine will go in an independent library/package.

If the schema has merged visuals/physics, both will include lots of unneeded objects and structures. Sure, you can skip the nodes you don't need, but you end up with a bloated codebase.

With separated schemas, each developer can use only the schema they need, and a simplified codebase eases development and prevents bugs.

And in the real world, apart from ThreeJS, I think what you find is either pure graphics engines or pure physics engines. As far as I know, the most powerful one is Bullet Physics, which has zero graphics dependencies. Why should Bullet Physics drag around all those useless visual objects? Just so the JSON parser knows how to skip them?

ghost commented 6 years ago

@vpenades Sounds sensible to me.

If the schema has merged visuals/physics, both will include lots of unneeded objects and structures, sure, you can skip the nodes you don't need, but you end with a bloated codebase.

While I like the plTF idea, the idea that a merged spec would increase code complexity is wrong if physics hierarchy has extremely close parity with graphics hierarchy and everything is implemented properly. You could do pure graphics engines and pure physics engines and the only bloat would be in the glTF files themselves. A decent JSON parser can ignore any objects without issue. That said, developers are human and it would still increase spec complexity and brain complexity.

I think that as an entry-level animated 3D graphics format, glTF is as close as anything has come to being usable. Definitely don't complicate it more. However, the spheres-and-planes model of physics is new to me and therefore sounds pretty complex on its own. Do physics meshes need to pull data from graphics meshes? Not necessarily. However, a starry-eyed Quake modder might want to model a Quake-like map in Blender and export it for graphics/physics usage. You know, something low-poly and static. I believe you in the game physics world call that a "concave hull". Or look at Red Faction's deformable terrain. I don't even know how it works.

Another potential issue: There could be some hangups with data redundancy and what node lines up with what, potentially leading to things getting "out of phase" if improperly implemented. It's easy to improperly implement things with specs as complex as glTF. Maybe plTF could reference glTF data somehow? But that might make it more complex. Maybe plTF has an extremely close parity with glTF hierarchy, I don't know. I do know that in glTF specifying transformations with a matrix means it can't be animated. We (I) want ragdolls, large climbable machines, and everything else. So many moving parts; got to share a lot of the structure that glTF uses.

vpenades commented 6 years ago

@fluffrabbit I don't think it's that complicated to reference assets in one file from the other.

In fact, take a realistic scenario, for example a car with animated wheels: the physics simulation is what integrates the positions of the car chassis and the wheels. So it would be the visual glTF nodes that reference the physical instances to pull the transform matrices, so the graphics engine can render the visual models at the locations published by the physics engine.
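That direction of reference (visual nodes pulling transforms from physics instances) could look roughly like the sketch below. Everything here is hypothetical: the `bindings` table mapping visual node indices to physics instance names, the `worldMatrix` runtime field, and the body names are invented for illustration. The only detail borrowed from glTF is the column-major layout of a 4x4 matrix as 16 floats.

```python
def translation_matrix(x: float, y: float, z: float) -> list:
    """4x4 translation as 16 floats in column-major order (glTF's layout)."""
    return [1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0,
            x, y, z, 1.0]


def sync_visuals(body_transforms: dict, bindings: dict, nodes: list) -> None:
    """Copy each bound body's world transform onto its visual node."""
    for node_index, body_name in bindings.items():
        nodes[node_index]["worldMatrix"] = list(body_transforms[body_name])


# Runtime copies of the visual nodes, as a graphics engine might hold them.
nodes = [{"name": "chassis_mesh"}, {"name": "wheel_fl_mesh"}]

# Visual node index -> physics instance name (the cross-file reference).
bindings = {0: "Chassis", 1: "WheelFrontLeft"}

# What the physics engine publishes after one simulation step.
body_transforms = {
    "Chassis": translation_matrix(0.0, 0.5, 0.0),
    "WheelFrontLeft": translation_matrix(1.0, 0.3, 1.5),
}

sync_visuals(body_transforms, bindings, nodes)
```

The physics side never needs to know the visual nodes exist; it only publishes transforms by name, which is what keeps the two files loosely coupled.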

the idea that a merged spec would increase code complexity is wrong if physics hierarchy has extremely close parity with graphics hierarchy and everything is implemented properly

In physics, scenes don't use hierarchical node graphs, so it's impossible to reuse the visual scene node graph of glTF.

Do physics meshes need to pull data from graphics meshes?

Most probably not; visual and physics data are defined and arranged in completely different ways.

In the past (DirectX 9 era) I've written both graphics engines and physics engines, and they do require completely different approaches. In practice there's much less overlap than one might assume.

What's clear is that the physics specification needs to be written by physics engine specialists, not by graphics engine specialists making assumptions about what physics specialists might need.

To be clear, I don't consider myself a physics specialist. Even if I wrote a physics engine, it was very basic compared to the beasts we have around like Bullet or Bepu; it's the people behind these projects who should be called on to propose a spec.

ghost commented 6 years ago

Well, you probably know who does what with Bepu. @AndreaCatania is in charge of Godot's Bullet integration, and I'm sure he has opinions.

Maybe you guys could draw up a spec. From where I stand, I just want things to do things and I want to be able to understand it. :)

EDIT: Since nobody has chimed in yet, I think I'll add that blending glTF animations with plTF physics would be a good idea. Recent GTA games implement a similar system. The quaternion/scale/translation system is used for animations because those values interpolate more smoothly than 4x4 matrices. EDIT 2: Hierarchical vs non-hierarchical; matrix multiplications make choice of notation irrelevant. Sorry, I'm new to physics nitty-gritty.
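One way to read the "EDIT 2" remark: a parent-child hierarchy of local transforms can be flattened into per-node world transforms by multiplying down the tree, which yields the kind of flat, absolute-position list a physics scene uses. The sketch below assumes translation-only transforms and a made-up node layout (`name`, `local`, `children`). Matrices are written as nested row-major lists for readability, not in glTF's column-major storage order.

```python
def matmul(a: list, b: list) -> list:
    """Multiply two 4x4 matrices given as nested row-major lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x: float, y: float, z: float) -> list:
    """4x4 translation matrix, translation in the last column."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def flatten(nodes: list, roots: list) -> dict:
    """Return {node name: world matrix} with the hierarchy multiplied out."""
    world = {}
    stack = [(index, translation(0.0, 0.0, 0.0)) for index in roots]
    while stack:
        index, parent_world = stack.pop()
        node = nodes[index]
        world[node["name"]] = matmul(parent_world, node["local"])
        stack.extend((c, world[node["name"]]) for c in node.get("children", []))
    return world


# Hypothetical hierarchy: Pelvis -> Torso -> Neck, each offset from its parent.
nodes = [
    {"name": "Pelvis", "local": translation(0.0, 1.0, 0.0), "children": [1]},
    {"name": "Torso", "local": translation(0.0, 3.0, 0.0), "children": [2]},
    {"name": "Neck", "local": translation(0.0, 2.0, 0.0)},
]
world = flatten(nodes, roots=[0])
```

Here the Neck ends up at a world height of 6.0, the sum of the three offsets. Flattening only converts the transforms; it says nothing about joints or limits, which still have to come from constraints.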

@vpenades I wish I could help more than that, but I have no low-level experience with physics engines. Since this is your proposal and we're graphics specialists here, you may have to help yourself. Physics folks are all hot air anyways.