KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

How to apply skin weights ? #1397

Closed andreasplesch closed 5 years ago

andreasplesch commented 6 years ago

Joint transformation weights need to sum to 1. Beyond that, the spec seems to contain only a reference to the glTF overview and the tutorial on exactly how the weights are expected to be applied. The tutorial uses them as factors to the transformation matrices. Is that industry practice, or just a tutorial convenience? For example, it would be more appropriate to weight the resulting offsets in vertex positions after the matrix-position multiplications rather than the matrices directly. The weighted offsets then sum to an effective offset. Or one could imagine using squared weights and then the square root of their sum. It all depends on how the weights are authored in the first place. What is typically provided by authoring tools, for glTF export purposes?

javagl commented 6 years ago

I think I didn't fully understand the alternative implementations that you propose (i.e. weighting the vertex positions or using squared weights).

But note that the weights are not used as factors for the transformation matrices. Instead, they are used as factors for the joint matrices (maybe that's what you meant). The joint matrices are then combined as described in The skinning joints and weights section of the tutorial.

It may be important to point out that the skinning matrix in the vertex shader is computed for each vertex, and it depends on the joint matrices, which depend on the current configuration of the skeleton. I don't see whether/how this could sensibly be precomputed elsewhere.

Regarding the authoring tools, I'll have to pass this part of the question maybe to someone who already worked on exporter plugins....

andreasplesch commented 6 years ago

Thanks for the response. The general point is the term 'weight' does not by itself explain how it is applied so that quantities with a higher weight have more influence than ones with a lower weight.

It sounds like it is accepted practice to use the weights as factors to the joint matrices and then produce their linear combination. But this seems more like an ad-hoc convention than a reliable procedure.

Each joint results in an offset of a vertex affected by that joint, the difference between the deformed position and the initial position. The amount of offset is modulated by the weight of that joint for this particular vertex. So to me the immediate understanding of the weight would be as a fraction of that offset. All weighted offsets can then be summed to an effective offset, to be added to the initial position.

Weighting is used before combining several terms, often for averaging. The linear combination of weighted terms as a sum corresponds to an arithmetic weighted average if the sum of weights is 1.

(w1 a1 + w2 a2 + w3 a3 ..)/(w1 + w2 + w3 ..) = ar.avg

As an alternative one could consider using a weighted geometric mean (perhaps a better example than the weighted distance with the squares):

(a1^w1 a2^w2 a3^w3 ..)^(1/(w1 + w2 + w3 ..)) = geo.avg

So instead of using the arithmetic average one could imagine using the geometric average to determine an effective, combined quantity. Therefore, it seems necessary to establish somewhere how the weights should be applied if it is not considered self-evident.
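To make the two candidate conventions concrete, here is a minimal Python sketch of the two averages written above, for plain scalar values. The function names are made up for this illustration and nothing in it comes from the glTF spec.

```python
import math

def weighted_arithmetic_mean(values, weights):
    """(w1 a1 + w2 a2 + ..) / (w1 + w2 + ..)"""
    total = sum(weights)
    return sum(w * a for a, w in zip(values, weights)) / total

def weighted_geometric_mean(values, weights):
    """(a1^w1 a2^w2 ..)^(1 / (w1 + w2 + ..)), for positive values only."""
    total = sum(weights)
    product = math.prod(a ** w for a, w in zip(values, weights))
    return product ** (1.0 / total)

values = [1.0, 4.0]
weights = [0.5, 0.5]

ar_avg = weighted_arithmetic_mean(values, weights)   # 0.5 * 1 + 0.5 * 4 = 2.5
geo_avg = weighted_geometric_mean(values, weights)   # sqrt(1) * sqrt(4) = 2.0
```

The same values and weights give two different "combined" quantities, which is why the convention would need to be stated somewhere.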

andreasplesch commented 6 years ago

It looks like three.js is applying the weights to the deformed vertex positions:

https://github.com/mrdoob/three.js/blob/dev/src/renderers/shaders/ShaderChunk/skinning_vertex.glsl

and then summing the weighted positions. I think this is different from summing the weighted matrices, and then transforming the vertex with this sum.

I think the three.js procedure is probably equivalent to what I tried to describe above, i.e. applying the weights to the offsets:

    resulting position = o + sum[ (d_i - o) w_i ]           // original pos. plus weighted sum of offsets
                       = o + sum[ d_i w_i - o w_i ]
                       = o + sum[ d_i w_i ] - sum[ o w_i ]
                       = o - o sum[ w_i ] + sum[ d_i w_i ]
                       = o - o 1 + sum[ d_i w_i ]             // if weights sum to 1
                       = sum[ d_i w_i ]                       // sum of weighted deformed positions

where
    o is the initial, original position
    d is the deformed position (after applying the matrix)
    w is the joint weight
    i is the joint index
    sum is the sum over all joints for this vertex

So, yes, it is equivalent, at least as long as all weights sum to 1. But the three.js procedure seems different from what the tutorial suggests.
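A quick numeric check of the derivation above, as a Python sketch. The positions and weights are made-up values, and the helper names are only for this illustration.

```python
def weighted_offsets(o, deformed, weights):
    """o + sum[(d_i - o) w_i], computed per component."""
    return [o_c + sum((d[c] - o_c) * w for d, w in zip(deformed, weights))
            for c, o_c in enumerate(o)]

def weighted_positions(deformed, weights):
    """sum[d_i w_i], computed per component."""
    return [sum(d[c] * w for d, w in zip(deformed, weights))
            for c in range(len(deformed[0]))]

def close(a, b):
    return all(abs(x - y) < 1e-12 for x, y in zip(a, b))

o = [1.0, 2.0, 3.0]                              # original position
deformed = [[2.0, 2.0, 3.0], [1.0, 4.0, 3.0]]    # d_i for two joints

# Weights that sum to 1: both forms agree.
normalized = [0.25, 0.75]
same_when_normalized = close(weighted_offsets(o, deformed, normalized),
                             weighted_positions(deformed, normalized))

# Weights that do not sum to 1: the forms differ by o * (1 - sum[w_i]).
unnormalized = [0.25, 0.25]
same_when_unnormalized = close(weighted_offsets(o, deformed, unnormalized),
                               weighted_positions(deformed, unnormalized))
```

With these values `same_when_normalized` is true and `same_when_unnormalized` is false, matching the step in the derivation that relies on the weights summing to 1.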

Summing the weighted matrices may work the same as summing the weighted, deformed vertices as long as there is no translational component to the matrices. This would be the case for the tutorial, I believe, but not in general.

andreasplesch commented 6 years ago

I think glTF skinning is designed after Collada skinning: Section 4.7 in

https://www.khronos.org/files/collada_spec_1_5.pdf

describes Collada skinning, in a well defined manner which is easier to understand.

In particular, there is an equation on how to apply weights (and related matrices). The equation applies the weight as a factor to the final position after transforming the original position of the vertex with the matrices (as three.js does).

This seems different from what the tutorial and the glTF overview poster seem to suggest as there the weights are applied to matrices before transforming the original position.

If glTF is following Collada, it may suffice for the spec. just to point at the Collada spec. If the specifics on how to apply weights are deliberately left to generators and clients, perhaps to allow for special closed environments with their own weighting system, there could be just an implementation note instead of authoritative language.

javagl commented 6 years ago

I hope that some real skinning expert will soon chime in here. Most of what I know about skinning is what I read for 1. a basic implementation, 2. the overview and 3. the tutorial. These things are thus based on the same information, which I mainly derived from the COLLADA spec, along with https://www.khronos.org/collada/wiki/Skinning and some websearches. So I'm on rather thin ice when arguing about technical details or alternatives here.

The linear combination of weighted terms as a sum corresponds to an arithmetic weighted average if the sum of weights is 1.

If I understood this correctly, it was addressed in this PR: https://github.com/KhronosGroup/glTF/pull/1352 The weights must sum up to 1.0 (basically implying a linear combination, or, more precisely, an affine combination).

I think this is different from summing the weighted matrices, and then transforming the vertex with this sum.

It might be necessary to more closely examine what the "bone matrices" in three.js are - from a quick glimpse at the loader, it looks like they are just the transformation matrices of the joints.

Referring to the COLLADA formula: To add some more confusion, the "JM" there are also the transformation matrices of the joints, which are not the same as the "joint matrices"

A quick attempt:

Imagine there are 4 joints. Then the formula from the COLLADA document is basically

outv =  v * jointMatrix.x * jointWeight.x +
        v * jointMatrix.y * jointWeight.y +
        v * jointMatrix.z * jointWeight.z +
        v * jointMatrix.w * jointWeight.w

where jointMatrix = JMi * IBMi (the order is reversed due to the left-multiplication of the vertex - let's ignore that for now. Also, the BSM is premultiplied with the IBM or mesh data in glTF). This should be the same as

outv = (jointMatrix.x * jointWeight.x +
        jointMatrix.y * jointWeight.y +
        jointMatrix.z * jointWeight.z +
        jointMatrix.w * jointWeight.w)  * v

where the first part is what is computed as the skinMat in the tutorial.

Sorry, all this may be a bit vague and confusing, and maybe I made a serious error. It could in fact be that there is a difference, but I currently cannot point my finger at it. (It's been a while since I read the first and only time about vertex skinning).

Since the weights are only scalars, there are certainly some degrees of freedom of what they are multiplied with. Or to put it that way, the question seems to be: The weights define an affine combination ... of what?.

They could combine the joint matrices (to create the skinning matrix), or they could combine the vertices (which have been transformed with the joint matrices). I'd have to invest some more time to figure out whether there are cases where this makes a difference. Maybe someone can clarify it quickly.

andreasplesch commented 6 years ago

On Tue, Jul 24, 2018, 4:35 PM Marco Hutter notifications@github.com wrote:

outv = (jointMatrix.x * jointWeight.x +
        jointMatrix.y * jointWeight.y +
        jointMatrix.z * jointWeight.z +
        jointMatrix.w * jointWeight.w)  * v

This is where I think more scrutiny is required. Matrix multiplication may not quite allow for that step. There is probably a good reason why the Collada formula and three.js do not do this.

They could combine the joint matrices (to create the skinning matrix), or they could combine the vertices (which have been transformed with the joint matrices).

In the end it depends on how generators expect the weights to be used. But I do suspect by now that it is the transformed vertices which are weighted and combined, in most generated glTFs.

scurest commented 6 years ago

My own (contingent and highly fallible) understanding is that the position of a vertex in a mesh instance is given like this:

[image: skin]

(Sorry for the image; markdown is not very good for math.) The only problem is that if an inverse bind matrix does not have the final row (0 0 0 1), then it's not clear how to compute the product of that 4x4 matrix with the 3-vector P_{model} (i.e. do you truncate or do perspective division? And what is the underlying linear map?), but that is not a very common case.

andreasplesch commented 6 years ago

(I think mu_POS(k) means the kth morph target).

Looking past the morph target treatment, and to the last equation where the joint weights are applied, I think it is consistent with saying that transformed positions scaled by weight (in world space) are summed rather than only scaled C_j transforms. Following the naming above this alternative could be expressed as:

P_world := (sum_j(omega_j * C_j * InvBind_j)) * P_model

which I think is not what would be commonly expected for how weights are applied.

scurest commented 6 years ago

What's the difference? For affine f_i surely

[image: affine]
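Spelled out in text form, the identity presumably meant here is the following. This is a sketch: the notation f_i(x) = A_i x + t_i for each affine map and w_i for the weights is assumed for this note.

```latex
\sum_i w_i\, f_i(x)
  = \sum_i w_i \left( A_i x + t_i \right)
  = \Big( \sum_i w_i A_i \Big)\, x + \sum_i w_i t_i
  = \Big( \sum_i w_i f_i \Big)(x)
```

The translation parts t_i are weighted along with the linear parts, so the equality does not depend on the matrices being free of translation. In 4x4 homogeneous form it is just distributivity of the matrix-vector product over the weighted sum.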

andreasplesch commented 6 years ago

I am looking for a reason why

https://github.com/mrdoob/three.js/blob/dev/src/renderers/shaders/ShaderChunk/skinning_vertex.glsl

    vec4 skinVertex = bindMatrix * vec4( transformed, 1.0 );

    vec4 skinned = vec4( 0.0 );
    skinned += boneMatX * skinVertex * skinWeight.x;
    skinned += boneMatY * skinVertex * skinWeight.y;
    skinned += boneMatZ * skinVertex * skinWeight.z;
    skinned += boneMatW * skinVertex * skinWeight.w;

    transformed = ( bindMatrixInverse * skinned ).xyz;

does not use the simpler form:

    vec4 skinVertex = bindMatrix * vec4( transformed, 1.0 );

    mat4 skinMatrix = mat4( 0.0 );
    skinMatrix += skinWeight.x * boneMatX;
    skinMatrix += skinWeight.y * boneMatY;
    skinMatrix += skinWeight.z * boneMatZ;
    skinMatrix += skinWeight.w * boneMatW;

    vec4 skinned = skinMatrix * skinVertex;
    transformed = ( bindMatrixInverse * skinned ).xyz;

scurest commented 6 years ago

Those compute the same value for skinned

[image: form]

Which one is simpler is a matter of opinion I guess, but, counting, it looks like the former does quite a few more multiplies, though the latter does more additions.

andreasplesch commented 6 years ago

Here is a count of GPU operations:

    4 mat4 with vec4 multiplications
    4 vec4 with scalar multiplications
    4 vec4 additions

versus

    4 mat4 with scalar multiplications
    4 mat4 additions
    1 mat4 with vec4 multiplication
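Breaking those counts down to scalar multiplies and additions gives a rough comparison, sketched in Python below. This is a naive count and an assumption-laden one: it ignores fused multiply-add, vector hardware, and anything the shader compiler may optimize away, so it is not a performance prediction.

```python
# Scalar cost of each GLSL-level operation as (multiplies, additions).
MAT4_TIMES_VEC4 = (16, 12)    # 4 dot products of length 4
VEC4_TIMES_SCALAR = (4, 0)
VEC4_ADD = (0, 4)
MAT4_TIMES_SCALAR = (16, 0)
MAT4_ADD = (0, 16)

def total(ops):
    """Sum (count, (muls, adds)) pairs into overall (muls, adds)."""
    muls = sum(n * cost[0] for n, cost in ops)
    adds = sum(n * cost[1] for n, cost in ops)
    return muls, adds

# Weights applied to the transformed positions (current three.js form).
weighted_positions = total([(4, MAT4_TIMES_VEC4),
                            (4, VEC4_TIMES_SCALAR),
                            (4, VEC4_ADD)])

# Weights applied to the matrices first (tutorial form).
weighted_matrices = total([(4, MAT4_TIMES_SCALAR),
                           (4, MAT4_ADD),
                           (1, MAT4_TIMES_VEC4)])
```

Under this count both forms need 80 scalar multiplies; the first needs 64 additions and the second 76.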

andreasplesch commented 6 years ago

I went ahead and replaced

https://github.com/mrdoob/three.js/blob/dev/src/renderers/shaders/ShaderChunk/skinning_vertex.glsl

with

#ifdef USE_SKINNING

    vec4 skinVertex = bindMatrix * vec4( transformed, 1.0 );

    skinMatrix = mat4( 0.0 );
    skinMatrix += skinWeight.x * boneMatX;
    skinMatrix += skinWeight.y * boneMatY;
    skinMatrix += skinWeight.z * boneMatZ;
    skinMatrix += skinWeight.w * boneMatW;

    vec4 skinned = vec4(skinMatrix * skinVertex);
    transformed = ( bindMatrixInverse * skinned ).xyz;
    // transformed = ( bindMatrixInverse * vec4(skinMatrix * bindMatrix * vec4( transformed, 1.0 ))).xyz;

#endif

using local overrides in Chrome developer tools. I tested all glTF skinning examples and also the three.js skinning examples. All work as expected with the modified skinning vertex shader code.

I also went through the math for 3D explicitly and agree that these are equivalent, i.e. that indeed it does not matter whether the weights are applied to the matrices or to the transformed positions. This is in contrast to what I had initially suspected.

So the tutorial approach should be ok but I do think it is necessary that others carefully analyse this to confirm.
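For anyone who wants to check this outside a shader, here is a small Python sketch that compares the two forms on 4x4 matrices that include a translation. The matrices, weights, and helper names are made up for this test; it uses the column-vector convention (matrix times vector) as in the GLSL above.

```python
def mat_vec(m, v):
    """4x4 matrix (list of rows) times a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z_90():
    return [[0.0, -1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def skin_by_positions(mats, weights, v):
    """Transform v with each matrix, then sum the weighted results."""
    out = [0.0] * 4
    for m, w in zip(mats, weights):
        p = mat_vec(m, v)
        out = [o + w * x for o, x in zip(out, p)]
    return out

def skin_by_matrix(mats, weights, v):
    """Sum the weighted matrices, then transform v once."""
    blended = [[sum(w * m[r][c] for m, w in zip(mats, weights))
                for c in range(4)] for r in range(4)]
    return mat_vec(blended, v)

mats = [translation(1.0, 2.0, 3.0), rotation_z_90()]
weights = [0.25, 0.75]
v = [1.0, 0.0, 0.0, 1.0]

by_positions = skin_by_positions(mats, weights, v)
by_matrix = skin_by_matrix(mats, weights, v)
```

Both `by_positions` and `by_matrix` come out as (0.5, 1.25, 0.75, 1.0) for these inputs, even though one of the matrices is a pure translation.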

[For three.js the above glsl may be interesting since it could be faster and also potentially reuse the skinMatrix already calculated for normals, in https://github.com/mrdoob/three.js/blob/dev/src/renderers/shaders/ShaderChunk/skinnormal_vertex.glsl ]

scurest commented 6 years ago

Good point about normals. You should be able to compute them with something like

skinned_normal = (skinMatrix * vec4(normal, 0.0)).xyz;
skinned_normal = normalize(skinned_normal);

sbtron commented 5 years ago

Closing this for now as the discussion seems to have resolved. Please reopen if there are more questions.