KhronosGroup / glTF-Sample-Models

glTF Sample Models

How advanced skinning is implemented #312

Closed icEngineer-tech closed 3 years ago

icEngineer-tech commented 3 years ago

Hi,

I want to implement an algorithm for advanced skinning, so I have a few questions. First, how can I know which animation starts first? (Suppose that the rotation starts first and the translation comes later, but both transformations have the same key frames, and the translation is 0 and only changes after 2 seconds.) If I understand correctly, I have to access each animation and change the mode that it belongs to, according to the value at that key frame?

javagl commented 3 years ago

It sounds like this is only implicitly about skinning, but primarily about animation. Specifically, the question

how can i know which animations start first

sounds like what is "explained" in the Notes at https://github.com/KhronosGroup/glTF/tree/master/specification/2.0#animations : There may be multiple animations, and they are supposed to be usable independently. This may also answer the question about the "duration" of animations. But note that there is a caveat regarding the channels of one animation. This is explained at https://github.com/KhronosGroup/glTF-Sample-Models/tree/master/2.0/BoxAnimated (and this is a good test model to start with basic animations). Essentially, when there are multiple channels with different lengths, the longest one determines the length of the animation, and the individual channels may not "wrap around" on their own.
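As a rough illustration of that rule, here is a minimal C++ sketch. The Channel struct and both function names are made up for this example (they are not tinygltf types), and it assumes that a channel is clamped to its first/last key frame when the animation time lies outside its own range, as the BoxAnimated notes describe.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for one animation channel:
// only the key frame times (the sampler "input") matter here.
struct Channel {
    std::vector<float> keyTimes; // increasing, in seconds
};

// The animation lasts as long as its longest channel.
float animationDuration(const std::vector<Channel>& channels) {
    float duration = 0.0f;
    for (const Channel& c : channels) {
        if (!c.keyTimes.empty()) {
            duration = std::max(duration, c.keyTimes.back());
        }
    }
    return duration;
}

// Time at which one channel is sampled: the animation time is clamped to the
// channel's own key frame range, so a shorter channel keeps its last value
// instead of wrapping around on its own.
float channelSampleTime(const Channel& channel, float animationTime) {
    assert(!channel.keyTimes.empty());
    return std::max(channel.keyTimes.front(),
                    std::min(animationTime, channel.keyTimes.back()));
}
```

With this, the question of which transformation "starts first" does not need special handling: every channel is sampled with the same animation time, and a translation channel whose first key frames are all 0 simply yields 0 until its values change.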

In order to test basic skinning+animation, https://github.com/KhronosGroup/glTF-Sample-Models/tree/master/2.0/RiggedSimple may be a good start.

icEngineer-tech commented 3 years ago

sounds great. I still have a confusion with this formula:

jointMatrix(j) =
  globalTransformOfNodeThatTheMeshIsAttachedTo^-1 *
  globalTransformOfJointNode(j) *
  inverseBindMatrixForJoint(j);

The last 2 variables are OK, I understood them, but I struggle with the first one. Imagine I have this configuration:

"nodes" : [ {
    "skin" : 0,
    "mesh" : 0
  }, {
    "children" : [ 2 ],
    "translation" : [ 0.0, 1.0, 0.0 ]
  }, {
    "rotation" : [ 0.0, 0.0, 0.0, 1.0 ]
  } ]

So the globalTransformOfNodeThatTheMeshIsAttachedTo^-1 would be the inverse of translation = 0, rotation = default (no rotation) and scale = 1, because this node doesn't define any transformations. But it doesn't work that way; I get something wrong. When I tried translation = 0, 1, 0 as in the first node, rotation as in node 1 and scale = 1, it does give me something right.

Imagine I have this model:

"nodes": [
    {
      "name": "Light",
      "rotation": [
        0.16907575726509094,
        0.7558803558349609,
        -0.27217137813568115,
        0.570947527885437
      ],
      "translation": [
        4.076245307922363,
        5.903861999511719,
        -1.0054539442062378
      ]
    },
    {
      "name": "Camera",
      "rotation": [
        0.483536034822464,
        0.33687159419059753,
        -0.20870360732078552,
        0.7804827094078064
      ],
      "translation": [
        7.358891487121582,
        4.958309173583984,
        6.925790786743164
      ]
    },
    {
      "name": "Bone.00",
      "rotation": [
        -0.0020970599725842476,
        -0.001136002130806446,
        1.6966068869805895e-05,
        0.9999971389770508
      ],
      "scale": [
        0.9999999403953552,
        0.9999997615814209,
        0.9999998807907104
      ],
      "translation": [
        3.7182687484538235e-12,
        1.3126283884048462,
        -2.1039745501383322e-09
      ]
    },
    {
      "children": [
        2
      ],
      "name": "Bone.01",
      "rotation": [
        0.011737177148461342,
        0.002114036586135626,
        -0.00010634021600708365,
        0.999928891658783
      ],
      "scale": [
        1,
        1,
        0.9999999403953552
      ],
      "translation": [
        3.3658409392955946e-11,
        0.8158659338951111,
        1.369617641522325e-09
      ]
    },
    {
      "children": [
        3
      ],
      "name": "Bone.02",
      "rotation": [
        -0.5002185702323914,
        0.49416103959083557,
        0.505838930606842,
        0.4997132420539856
      ],
      "scale": [
        1,
        1,
        1.0000001192092896
      ],
      "translation": [
        3.1174166202545166,
        -0.02734565921127796,
        -3.7103995431664316e-09
      ]
    },
    {
      "mesh": 0,
      "name": "Plane",
      "skin": 0
    },
    {
      "children": [
        4
      ],
      "name": "Armature",
      "rotation": [
        -0.4999999701976776,
        -0.5,
        0.5,
        -0.5000000596046448
      ],
      "translation": [
        -0.866496205329895,
        -1.5603487491607666,
        1.647308349609375
      ]
    }
  ]

This is an example that I've created in Blender. So I wonder how to get the globalTransformOfNodeThatTheMeshIsAttachedTo^-1. Is it from node 5 (counting from 0), and how do I propagate the transforms to get it?

javagl commented 3 years ago

An aside: I have edited the previous comment for formatting. When you have to post multiple lines of code, you can do this with

```
// Your code here
```

(these are three backticks at the beginning and the end).

Beyond that, it would be good to attach the (complete) model here, maybe in a ZIP file, so that one can try it out easily. But I'll have another look at the comment/questions later.

icEngineer-tech commented 3 years ago

thanks for this tip. I thought it's only available on Stack Overflow 😄

I attached the model with its bin file. skinTube.zip

I really want to implement the algorithm to make this model working: https://github.com/KhronosGroup/glTF-Sample-Models/tree/master/2.0/CesiumMan

The model that you pointed to in the previous comment (Rigged Simple) uses matrix, and my code doesn't support this feature yet. I'm planning to implement it once I finish with T, R, S, because matrix needs one more step: I have to decompose it into T, R, S.

Now I want to describe what I understood about skinning. First, I have to get the global inverse transform, which is a constant, and this is from the node that is attached to the mesh (in other terms, the root node). Naturally, I did implement this method. Next, I have to update the global transform each time an animation is applied. But here, I have to take care of the children. If a parent has a child, does that mean that the transformation of that parent has to be transmitted to the child? So when the child has a child, I have to propagate that logic until no child is present? Here I have the idea to use a linked list. Here you can find my skinned animation class: https://github.com/CppProgrammer23/skin-animation . Can you please help me to fix it? (There are no errors, but I want to fix the logic.)

javagl commented 3 years ago

It's hard to align the questions with the code. I could now read the code, and try to understand it. (Nitpicks: 1. The make... functions could be static, and 2. Function parameters should usually be const and references (&), particularly when they are vectors). But there's too much going on in the constructor (for me to understand it by just looking at it)


Trying to focus on the actual questions:

(Rigged Simple) uses matrix, and my code doesn't support this feature yet. I'm planning to implement it once I finish with T, R, S, because matrix needs one more step: I have to decompose it into T, R, S.

That's a bit surprising. Is there any specific reason why you want to decompose the matrix?

Usually, from a plain glTF/rendering perspective, it is the other way around. In an implementation, like a class Node { ... }, it is common to store the matrix, as a glm::mat4 (or some other form of 16 float/double values). When parsing a glTF, there often is code that

  1. uses the matrix from the glTF (if it is present)
  2. if the glTF contains T/R/S properties, a matrix is computed from these T/R/S properties, and stored in the node

Note that keeping the T/R/S individually is not "wrong", but ... may be cumbersome. More important: Not every 4x4 matrix can be decomposed into T/R/S. And the "propagation" of the transformation is more complicated (more on that below).
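To make step 2 of the list above a bit more concrete, here is a minimal sketch of how a matrix could be computed from T/R/S as M = T * R * S. It deliberately avoids glm, so Mat4 and matrixFromTRS are names invented for this example; the matrix is stored column-major (as in glTF and glm), and the rotation is assumed to be a unit quaternion in (x, y, z, w) order, as in glTF.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<float, 16>; // column-major: element (row, col) is at [col * 4 + row]

// Builds M = T * R * S from translation t, unit quaternion q = (x, y, z, w)
// and scale s. R * S scales the columns of the rotation matrix.
Mat4 matrixFromTRS(const std::array<float, 3>& t,
                   const std::array<float, 4>& q,
                   const std::array<float, 3>& s) {
    const float x = q[0], y = q[1], z = q[2], w = q[3];
    // The formula below is only a rotation if the quaternion is (nearly) unit length
    assert(std::fabs(x * x + y * y + z * z + w * w - 1.0f) < 1e-3f);

    Mat4 m{}; // all zeros
    // First column: rotated x-axis, scaled by s[0]
    m[0] = (1 - 2 * (y * y + z * z)) * s[0];
    m[1] = (2 * (x * y + z * w)) * s[0];
    m[2] = (2 * (x * z - y * w)) * s[0];
    // Second column: rotated y-axis, scaled by s[1]
    m[4] = (2 * (x * y - z * w)) * s[1];
    m[5] = (1 - 2 * (x * x + z * z)) * s[1];
    m[6] = (2 * (y * z + x * w)) * s[1];
    // Third column: rotated z-axis, scaled by s[2]
    m[8] = (2 * (x * z + y * w)) * s[2];
    m[9] = (2 * (y * z - x * w)) * s[2];
    m[10] = (1 - 2 * (x * x + y * y)) * s[2];
    // Fourth column: translation
    m[12] = t[0];
    m[13] = t[1];
    m[14] = t[2];
    m[15] = 1;
    return m;
}
```

A node without any T/R/S properties would use the defaults (translation 0, rotation 0, 0, 0, 1 and scale 1), which gives the identity matrix. Animated T/R/S values can be fed into the same function in each frame, so there is no need to decompose a matrix for that.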


First, I have to get the global inverse transform, which is a constant, and this is from the node that is attached to the mesh

Nitpicking: The mesh is attached to the node (or, one could also say, "instantiated" by the node: One mesh can be attached to multiple nodes!).

More important: The inverse of the global transform of this node is not necessarily constant.


Next, I have to update the global transform each time an animation is applied. But here, I have to take care of the children. If a parent has a child, does that mean that the transformation of that parent has to be transmitted to the child? So when the child has a child, I have to propagate that logic until no child is present? Here I have the idea to use a linked list.

For the implementation itself, there are many options. You could either use some (existing) sophisticated rendering engine. Or you could implement the whole rendering structures on your own. But the latter could only be a very simple one, because... you could literally sink years of work into a "rendering engine"....

However: It is conceptually correct that the transformation is "propagated". Imagine a class like this:

class Node {

    // glTF nodes define either a matrix or T/R/S properties
    Matrix matrix = null;
    Vector translation = null, rotation = null, scale = null;

    // The transform of this node relative to its parent
    Matrix getLocalMatrix() {
        if (matrix != null) return matrix;
        else return createMatrixFrom(translation, rotation, scale);
    }

    // The transform of this node relative to the scene
    Matrix getGlobalMatrix() {
        Node parent = findParentOfThis();
        if (parent == null) return getLocalMatrix();
        Matrix globalMatrixOfParent = parent.getGlobalMatrix();
        return globalMatrixOfParent * getLocalMatrix();
    }
}

In the last line, globalMatrixOfParent * getLocalMatrix(); is the "propagation": The given node just "appends" its own (local) matrix to the global matrix of the parent.

One difficulty with implementing that: The "parent" of a node is not stored in glTF. I've sneaked the findParentOfThis() function in there. Of course, one could store the parent of each node when reading the node structure from glTF, but this has to be done as a dedicated step.
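Such a dedicated step could look roughly like the following C++ sketch. The function name computeParents and the plain "children" vector are assumptions for this example; with tinygltf, the child indices would come from the children member of each node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// children[i] holds the child node indices of node i, as in the glTF "children" arrays.
// Returns parents[i] = index of the parent of node i, or -1 if node i is a root.
std::vector<int> computeParents(const std::vector<std::vector<int>>& children) {
    std::vector<int> parents(children.size(), -1);
    for (std::size_t i = 0; i < children.size(); ++i) {
        for (int child : children[i]) {
            // Child indices must refer to existing nodes
            assert(child >= 0 && static_cast<std::size_t>(child) < children.size());
            parents[child] = static_cast<int>(i);
        }
    }
    return parents;
}
```

For the Blender model posted above (Armature -> Bone.02 -> Bone.01 -> Bone.00), this yields -1, -1, 3, 4, 6, -1, -1: the nodes Light, Camera, Plane and Armature are all roots, so there is no single "real root" in that file.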

However, from a short look at the code (and the question "I have to propagate that logic until no child is present?"), you seem to try to compute the matrix in the opposite direction. This is also possible, but it's a bit complicated ... and I think that, in order to implement that sensibly, you'd have to store the transform from the parent node in each node (and probably introduce some "dirty" flags to avoid unnecessary updates...).

icEngineer-tech commented 3 years ago

That's a bit surprising. Is there any specific reason why you want to decompose the matrix?

to update the translation, rotation and scale matrices?

More important: The inverse of the global transform of this node is not necessarily constant.

If we have one mesh, is it not still constant?

There is a lot of information; I have to understand it point by point. Thank you.

I will try to update my code, because I have something like this in mind:

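void traverseNode(float time, const Animation& animation, const tinygltf::Node& node, const glm::mat4& parentTransform)
{
  // update the rotation, translation and scaling of this node from the animation,
  // then combine the resulting local matrix with parentTransform
  glm::mat4 globalTransform = parentTransform; // * local matrix of this node
  for (int child : node.children)
      traverseNode(time, animation, TinyglTFRender::model.nodes.at(child), globalTransform);
}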

I will try this and then come back.

javagl commented 3 years ago

to update the translation, rotation and scale matrices?

That's part of the question: Why are you storing them individually?

Again: glTF basically provides two ways to define the transform matrix of a node:

  1. as a matrix property
  2. as translation, rotation and scale (T/R/S) properties

But at runtime (i.e. in a rendering engine), you usually store only a matrix. (The matrix may be computed from the T/R/S properties, but it's unusual to store the T/R/S properties individually)

(Note: All this is a bit simplified. Of course, you might have some sophisticated rendering engine, and might want to have functions like glm::vec3 translation = node.getTranslation();. But for glTF and its rendering itself, this is just not necessary).

The traverseNode function that you sketched there is basically the top-down counterpart of the getGlobalMatrix function that I described (which walks up from a node to the root). With a function like

void traverseNode(Node node, Matrix parentGlobalMatrix)
{
    // The global transform of a node is the global transform
    // of its parent, followed by its own local transform
    node.globalTransform = parentGlobalMatrix * node.localTransform;
    for (Node child : node.children) {
        traverseNode(child, node.globalTransform);
    }
}

// Call:
traverseNode(gltfRoot, identityMatrix);

you could compute the global transforms of all nodes, and assign them to the nodes as node.globalTransform.
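A self-contained C++ variant of this traversal might look as follows. It is only a sketch under some assumptions: Mat4, Node, updateGlobalTransforms and updateScene are made-up names (not glm or tinygltf types), matrices are stored column-major, nodes are referenced by index as in glTF, and the root indices are taken from the "nodes" array of the scene, so that a file with several root nodes is handled as well.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>; // column-major: element (row, col) is at [col * 4 + row]

Mat4 identity() {
    return {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
}

// Returns a * b (applied to a column vector, b acts first)
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
    return r;
}

struct Node {
    Mat4 local = identity();   // from "matrix" or T/R/S, updated by animations
    Mat4 global = identity();  // computed by updateGlobalTransforms
    std::vector<int> children; // indices into the node array
};

// Top-down pass: global = global of parent * local, then recurse into the children
void updateGlobalTransforms(std::vector<Node>& nodes, int index, const Mat4& parentGlobal) {
    assert(index >= 0 && static_cast<std::size_t>(index) < nodes.size());
    Node& node = nodes[index];
    node.global = multiply(parentGlobal, node.local);
    for (int child : node.children)
        updateGlobalTransforms(nodes, child, node.global);
}

// A scene may list several root nodes; each one starts with the identity matrix
void updateScene(std::vector<Node>& nodes, const std::vector<int>& sceneRoots) {
    for (int root : sceneRoots)
        updateGlobalTransforms(nodes, root, identity());
}
```

Calling updateScene once after each animation step refreshes all global matrices in a single pass, without needing any parent links.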

There are some pros and cons for all approaches. Very roughly speaking:

The recursive solution that I sketched requires the "parent" nodes to be stored, and will re-compute the global transform each time getGlobalMatrix is called. For your solution, the traverseNode function will probably be called after each modification (i.e. after each animation step), and will compute all global matrices.

(Which one is "better"? I don't know - when in doubt, it might depend on the structure of the node hierarchy. But for rendering, the global transforms are required anyhow, so your approach may well have an advantage here)

scurest commented 3 years ago

You don't need to multiply by globalTransformOfNodeThatTheMeshIsAttachedTo^-1. Its only purpose is to cancel out the multiplication by globalTransformOfNodeThatTheMeshIsAttachedTo that is applied to the vertices of unskinned meshes. For skinned meshes, just don't multiply by globalTransformOfNodeThatTheMeshIsAttachedTo, and you won't need to multiply by (or even calculate) the globalTransformOfNodeThatTheMeshIsAttachedTo^-1.
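The cancellation can be written out as follows. The symbols are shorthand introduced only for this sketch: M is globalTransformOfNodeThatTheMeshIsAttachedTo, G_j is globalTransformOfJointNode(j), B_j is inverseBindMatrixForJoint(j), w_j are the joint weights of a vertex p, and P and V are assumed projection and view matrices of the renderer.

```latex
\begin{aligned}
p'_{\text{unskinned}} &= P \, V \, M \, p \\
J_j &= M^{-1} \, G_j \, B_j \\
p'_{\text{skinned}} &= P \, V \, M \sum_j w_j \, J_j \, p
  = P \, V \sum_j w_j \, M \, M^{-1} \, G_j \, B_j \, p
  = P \, V \sum_j w_j \, G_j \, B_j \, p
\end{aligned}
```

So a renderer that multiplies by M for every mesh needs the M^{-1} in the joint matrix, while a renderer that skips M for skinned meshes can use G_j * B_j directly. Both give the same vertex positions.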

javagl commented 3 years ago

@scurest

For skinned meshes, just don't multiply by globalTransformOfNodeThatTheMeshIsAttachedTo

It might be possible to somehow sneak around that for the skinning computations themselves (I'd have to scribble down the math with pencil+paper to be "more sure" about that). But at some point, the globalTransformOfNodeThatTheMeshIsAttachedTo will be taken into account - otherwise, you couldn't move the skinned object around in the scene. So... 1. are you sure that this is possible? 2. did you implement it like that? and 3. can you still move around your skinned models by applying a transform, e.g. to the root node?

(An aside: I haven't checked that, but it might well be that none of the skinning test models uses a global transform that is not the identity matrix. If this is the case, it could give implementors the impression that their skinning works, even though their implementation might fail when the model is transformed. This has to be analyzed further, and if this is the case, it might be worth adding a dedicated sample model to test this...)

javagl commented 3 years ago

@CppProgrammer23

Don't despair. The skinning is the hardest part (or... well... at least, before PBR was introduced). I had a short look at the code, but again: It's hard to figure out the intention of some parts, or why they are implemented like that. For example, a function like the glm::mat4 SkinnedAnimation::getGlobalInverseTransform(tinygltf::Scene scene) looks a bit odd: It determines some node as the rootNode, and from quickly looking over it, it computes "something", but certainly not a global transform. The traverseNode function is not called, from what I can see.


And by the way: It may be a matter of style, but I don't see a reason why someone should implement a function like

Matrix getGlobalInverseTransform(Node node) {
    // many lines of code
    ...
    return glm::inverse(resultMatrix);
}

instead of

Matrix getGlobalTransform(Node node) {
    // many lines of code
    ...
    return resultMatrix;
}
Matrix getInverseGlobalTransform(Node node) {
    return glm::inverse(getGlobalTransform(node));
}

I mean, the function to compute the global transform of a node will be required anyhow. And... of course, this function should not check for a certain node.name, or modify the state of the object that it is a member of, ... but... all I want to say is: It's hard to understand the goals of the current code, for me...

icEngineer-tech commented 3 years ago

When I wrote some specific code for the model that you created (simpleSkin), it worked for me, but now I added this class SkinnedAnimation to make it generic for many models. I did understand how it worked when you explained it to me in the previous issue. I have understood many parts of the skinned animation, but I am still confused about some parts. For example: how can I calculate the globalInverseTransform so that it is generic for every model? As mentioned in the tutorial, it's the inverse of the global transform of the node that the mesh is attached to. Imagine I have many roots in my glTF model, as in the model I linked in the previous comment (Camera, Light, etc.). How can I get the real root? This is exactly what confuses me. I updated my code here: https://github.com/CppProgrammer23/skin-animation to call traverseNode in startAnimation (when the animation starts). Can you take a look at those methods and tell me what you think? The methods are traverseNode and startAnimation. The rest are just getters and/or setters.

What I really can't understand is that your tutorial about skinning mentions that the jointMatrix varies as a function of the globalTransform and the inverseBindMatrices. For me, the globalInverseTransform is constant. Could you also describe how I can get the globalInverseTransform (in a generic way)?

scurest commented 3 years ago

@javagl We've discussed this before. Yes I am sure. About skinning the spec says "...while ignoring the transform of the skinned mesh node". You also wrote that its only purpose was to cancel out part of the modelview matrix. Yes, the root node still affects the skinned mesh, because it affects globalTransformOfJointNode.

javagl commented 3 years ago

@scurest You're right, but I still did not do all the maths behind that. I did that back when I wrote the tutorial and the overview, being completely new to vertex skinning, and digging through online resources and the COLLADA specification. But there had been some discussions, updates, clarifications and (apparently) simplifications in the meantime that I had not yet taken into account.

But after doing some archeology ( https://github.com/KhronosGroup/glTF/issues/1270 , https://github.com/KhronosGroup/glTF/issues/1403 , and more recently, the one that you now linked to), I think that you're right. I have read a bit in these issues, and tried out different cases (e.g. https://bghgary.github.io/glTF-Assets-Viewer/?manifest=https://raw.githubusercontent.com/KhronosGroup/glTF-Asset-Generator/v0.6.0/Output/Manifest.json&folder=2&model=10 ), with my implementation, once with and once without the global transform being integrated, and the result seems to be the same.

(I'm still not "scientist-sure" about that, because I didn't do the math all the way down, but ... I'm "engineer-sure": It seems to be correct from all that I've read and tried out...)

This means that the tutorial, the overview (and my implementation) have to be updated accordingly. I'll open an issue for that (but cannot say for sure when I'll be able to tackle that - I've been doing far too little for my "real" work recently anyhow...).