godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
90.99k stars 21.16k forks source link

Transform related APIs on Mono version works differently since #16326 #16937

Closed mysticfall closed 6 years ago

mysticfall commented 6 years ago

Godot version: master / 6e200b1fe0b03bc0acc6d7ab097750d87aa7ac67

OS/device including version: Manjaro Linux 17.1

Issue description: Since changes made in #16326, my character controller behaves differently which relies on manipulating bones from the code.

What it does is caching a certain bone's global pose by invoking Skeleton.GetBoneGlobalPose on _Ready() and restoring it via SetBoneGlobalPose in _PhysicalProcess to simulate 'root motion' animation which is currently missing from Godot.

Before 6e200b1fe0b03bc0acc6d7ab097750d87aa7ac67, it behaved as expected but after I applied the changes from the commit, my character stands upside down and walks beneath the ground.

I'm not sure if it's a regression or just invalid data remaining in my scene files, if the commit changed the way transform data is marshalled.

At any rate, I suppose it could potentially break existing projects and better be documented how to fix the problem if the current status is how it's supposed to work.

akien-mga commented 6 years ago

CC @neikeq @NathanWarden

mysticfall commented 6 years ago

It looks like it's not just Skeleton.SetBoneGlobalPose that behaves differently now, but other global transformation related operations are affected as well.

If I understood Godot's transform API correctly, the following code should move Target toward the direction it is currently heading:

        public void _PhysicalProcess(float delta) {
            var local = new Vector3(0, 0, 0.2f);
            var global = Target.GlobalTransform.Xform(local) - Target.GlobalTransform.origin;

            Target.MoveAndSlide(global);
        }

Before 6e200b1fe0b03bc0acc6d7ab097750d87aa7ac67 it behaved as such, but now it always moves along the global z axis regardless of where it's facing.

If I was correct in assuming the former behavior was the expected one, I think it could be a quite serious issue since it may affect almost any C# projects that move things around in game.

neikeq commented 6 years ago

Does it work different to GDScript?

mysticfall commented 6 years ago

I'm sorry but I've never used GDScript so I'm afraid I won't be able to test it right away.

cart commented 6 years ago

This is a problem for me too. I put together a project to illustrate it. It rotates a MeshInstance along the z-axis, then has another MeshInstance copy the transform from the rotating mesh.

In C#, the z rotation flips its sign. In GDScript, the transform is copied as expected.

The project defaults to c#, but the gdscript file is included too.

BrokenTransform.zip

mysticfall commented 6 years ago

Changed the title according to the new information. And I think we can remove the documentation label now, since it's apparently not the correct behavior but an actual bug.

By the way, have the changes from #16326 made to the 3.0.1 version? If so, I think we should even consider making a new build with either a fix for the problem, or a patch to roll back the changes.

I don't like the idea of people using 3.0.1 version only to find out that they can't properly move things around, or finding out that scenes made with 3.0.1 version all get messed up once the issue gets fixed.

NathanWarden commented 6 years ago

This change should definitely change how rotations worked before in C# as it was broken. It should match GDScript now, if not please let me know.

NathanWarden commented 6 years ago

Nevermind, thanks for the sample project @cart. It's hard to say what's causing this issue without taking a closer look. I don't think reverting my commit is the correct solution since that will re-break the basis rotations that were rotating the opposite of GDScript, just as this is rotating the opposite at the transform level. I'll try and fix this issue later today.

NathanWarden commented 6 years ago

Okay, I think I got it. I should have checked to make sure the marshaling in was correct also. Let me do some tests to make sure these fixes are all correct, then I'll make a new PR :)

NathanWarden commented 6 years ago

Okay, I put up a PR of the fix with a sample project based on the BrokenTransform project from @cart

https://github.com/godotengine/godot/pull/17046

akien-mga commented 6 years ago

Fixed by #17046.

mysticfall commented 6 years ago

Thanks much @NathanWarden!

@akien-mga Could we get this commit backported to 3.0.1?

akien-mga commented 6 years ago

That's what this means: spectacle y21753

mysticfall commented 6 years ago

That's cool. I wasn't sure as it still has its milestone set for 3.1. Thanks!

mysticfall commented 6 years ago

@NathanWarden I just tested the new version and now I'm getting a new problem. I have a code that rotates the character in _PhysicalProcess as below:

Target.GlobalTransform = rotation * Target.GlobalTransform;
Target.RotateObjectLocal(new Vector3(0, 1, 0), rotationalVelocity.y);

Before the fix, it worked as intended but now it looks like the change in rotation is immediately reverted back causing the screen to oscilliate constantly.

I could even reproduce the symptom with the following code:

Target.GlobalTransform = new Transform(Basis.Identity, new Vector3()) * Target.GlobalTransform;
Target.RotateObjectLocal(new Vector3(0, 1, 0), 1f);

If I understood Godot's transform API correctly, it should rotate the target object on Y-axis at a constant speed, no matter what its original transform was.

Could you please confirm if it's not something caused by my own mistake? And if you think I should create a separate issue, please let me know.

NathanWarden commented 6 years ago

@mysticfall I'd be more than happy to take a look. Can you send me a scene reproducing the issue? I started to put a scene together and realized I'm not sure how it should be setup.

mysticfall commented 6 years ago

Thanks, I'll try to find some time to create a minimal project that can reproduce the issue in a couple of days. Maybe it will find some of my own mistakes, if there were any.

NathanWarden commented 6 years ago

Awesome, thanks! When you do get the chance, if you can have a GDScript version along with it would be awesome just so that I can see and compare how it should work :)

cart commented 6 years ago

Thanks for fixing the transform! Buuut it turns out this isn't quite fixed yet. It turns out basis.GetEuler() is also broken. In my example project swap:

Transform = _trackedSpatial.Transform;

for

Translation = _trackedSpatial.Transform.origin;
Rotation = _trackedSpatial.Transform.basis.GetEuler();

Sadly this brings back the exact same "negative z" behavior

mysticfall commented 6 years ago

@cart Thanks for proving that I'm not crazy, again :)

By the way, could you revise your previous example to demonstrate the newly broken behavior?

I can't normally work on Godot during weekdays, so it'll take a couple of days until I can be free to create an example project.

I'd appreciate if you could update your project, if it's not too much of a hassle for you.

EDIT: Oh, I just realized you meant the problem can be easily reproducible with simply replacing a single line in your previous example.

NathanWarden commented 6 years ago

Thanks @cart for providing the example. Was this also broken before, or did this break with my recent changes? Thanks! :)

cart commented 6 years ago

It broke with your recent change :)

akien-mga commented 6 years ago

For the reference, we're about to publish 3.0.2 with some fixes, including a revert of the Basis marshalling change (so going back to how it worked in 3.0): https://downloads.tuxfamily.org/godotengine/3.0.2/rc1/mono/

We can redo the Basis change once it's fully sorted out in the master branch.

NathanWarden commented 6 years ago

Sounds good thanks Remi. I'll hopefully be able to sort it out sometime next week. :)

mysticfall commented 6 years ago

Maybe we need to reopen this issue, if it'd take sometime before getting fixed?

NathanWarden commented 6 years ago

I'm currently interviewing for a new job and trying to get quite a bit of client work done, which means I don't think I have much time to look into it anytime soon. I can't imagine the fix is a difficult one, just that the source of the problem needs to be found. If anyone else wants to take a stab at it feel free.

If you can run my test suite on it that will prove that the directional vectors are correct. I don't have any tests for the issues mentioned in this thread. (If someone wants to add them I would be grateful :) https://github.com/NathanWarden/godot-test-tools

cart commented 6 years ago

Yeah I'm guessing that the core basis code is fine. I'm guessing this is specific to the get_euler() implementation. Its been awhile since I took linear algebra but I can take a look if you don't have time.

Good luck with the interview!

neikeq commented 6 years ago

I will have a look tomorrow.

cart commented 6 years ago

I believe I have the fix. It looks like marshaling order was never a bug. The C# basis class sets values in the same order as C++ and the order was not changed when crossing the language boundary. The problem in issue #11901 was because of a missing negative sign in the Basis.GetEuler() implementation.

After reverting the marshaling changes and adding the negative sign I observed the expected behavior: 1) My demo project: objects line up when copying transform or using GetEuler 2) The project in issue #11901 shows the same results for gdscript and c# 3) My game (High Hat), which was previously broken by #16326 now behaves as expected

I'll try running @NathanWarden's unit test suite to get another datapoint.

I'm queuing up a pr now. It would be great to get a second opinion, but my confidence is this fix is relatively high.

cart commented 6 years ago

Hmm I didn't have the context of @reduz's comments here: https://github.com/godotengine/godot/issues/14553

The changes I outlined bring parity with C++, but it sounds like what we really want is to undo transposing done in C++.

This will require a larger overhaul of Basis.cs and I think a removal of GetAxis(int index).

I propose we merge my changes to unbreak master (with the side effect of requiring basis.GetAxis(0) instead of basis.x), then revert the C++ related optimizations in Basis.cs for code clarity / fixing direct x,y,z axis access / GDScript api clarity parity.

cart commented 6 years ago

In parallel with the Basis.cs rework we would reapply @NathanWarden 's marshaling / unmarshaling changes to transpose the values passed across the language boundary.

The end result being: 1) Basis.cs assumes all inputs and outputs are in this order

[x0, x1, x2]
[y0, y1, y2]
[z0, z1, z2]

2) All Basis.GetAxis(AXIS) calls are replaced with Basis.AXIS: 3) CS->C++ includes transposing. C++ -> CS includes transposing 4) C++ remains unchanged. It assumes values are stored in the following order:

[x0, y0, z0]
[x1, y1, z1]
[x2, y2, z2]

This is actually a bit confusing because the C++ basis constructor tasks raw values as (xx, xy, xz, yx, yy, yz, ...) and sets them in order. But they are stored / read transposed so these are actually (xx, yx, zx, xy, yy, zy ... )

If that all sounds good / correct, I'm happy to do the overhaul.

cart commented 6 years ago

The short term fix PR is out

I ran @NathanWarden's unit tests and they all failed on my changes :) This is actually expected as they directly reference basis.[x,y,z]

neikeq commented 6 years ago

Why not simply rename them to _x, _y and _z, make them private and add x, y and z properties that call GetAxis? AFAIK, this is what GDScript does.

cart commented 6 years ago

That would definitely fix it at an api level. We should probably do that in the short term. Long term a rewrite would probably be better so we can cut out the unnecessary abstraction.

I can add that to my change set if everyone agrees

NathanWarden commented 6 years ago

Thanks @cart for looking into fixing this :) This (and client work, haha) has kept me from playing with 3D more in Godot.

cart commented 6 years ago

@neikeq whats the rationale behind lowercase x, y, z? Can we switch them to upper case to match the rest of the API? Same goes for Vectors. This would be a breaking change but I think its worth it. We are already breaking the current behavior with these fixes.

Pulling in @hpvb, the stability master, for his opinion on the matter.

cart commented 6 years ago

I figured I'd just rework Basis.cs to store / use everything transposed. I'm nearing completion of that, but i have a quick question for @reduz and @neikeq about the determinant calculation used when finding the inverse.

float[] co = new float[3]
{
    inv[1, 1] * inv[2, 2] - inv[1, 2] * inv[2, 1],
    inv[1, 2] * inv[2, 0] - inv[1, 0] * inv[2, 2],
    inv[1, 0] * inv[2, 1] - inv[1, 1] * inv[2, 0]
};

float det = inv[0, 0] * co[0] + inv[0, 1] * co[1] + inv[0, 2] * co[2];

This makes sense (use the first column row for calculating the determinant). The cofactors are correct. However the determinant calculation looks wrong. We need to take into account the +, - checkerboard pattern:

+ - +
- + -
+ - +

Shouldn't this calculation be:

float det = inv[0, 0] * co[0] - inv[0, 1] * co[1] + inv[0, 2] * co[2];

Notice the second cofactor is multiplied by -1.

matrix3.cpp uses the same approach:

real_t co[3] = {
    cofac(1, 1, 2, 2), cofac(1, 2, 2, 0), cofac(1, 0, 2, 1)
};
real_t det = elements[0][0] * co[0] +
                elements[0][1] * co[1] +
                elements[0][2] * co[2];

This seems like a bug that was propagated from c++ to c#. I'll fix it if I can get a confirmation.

cart commented 6 years ago

Aah I see whats happening. The order of inv[1, 2] * inv[2, 0] - inv[1, 0] * inv[2, 2] is reversed, which is the same as multiplying by -1. The same thing is done in the adjugate matrix.

But the current implementation still has a problem

inv = new Basis
(
    co[0] * s,
    inv[0, 2] * inv[2, 1] - inv[0, 1] * inv[2, 2] * s,
    inv[0, 1] * inv[1, 2] - inv[0, 2] * inv[1, 1] * s,
    co[1] * s,
    inv[0, 0] * inv[2, 2] - inv[0, 2] * inv[2, 0] * s,
    inv[0, 2] * inv[1, 0] - inv[0, 0] * inv[1, 2] * s,
    co[2] * s,
    inv[0, 1] * inv[2, 0] - inv[0, 0] * inv[2, 1] * s,
    inv[0, 0] * inv[1, 1] - inv[0, 1] * inv[1, 0] * s
)

every instance like this:

inv[0, 2] * inv[2, 1] - inv[0, 1] * inv[2, 2] * s,

should really be

(inv[0, 2] * inv[2, 1] - inv[0, 1] * inv[2, 2]) * s,
cart commented 6 years ago

I put together what I thought Basis.cs should look like given my understanding of the public godot api and linear algebra. Its available here. This branch uses @NathanWarden's marshaling fix because we should need to transpose when converting from c++ to C#.

This is the Basis.cs / godot api layout

[x1, x2, x3]
[y1, y2, y3]
[z1, z2, z3]

However it still doesn't behave consistently with gdscript.

I dug into matrix3.cpp and doesn't seem to access elements consistently.

For example set_axis_angle sets the angles assuming this internal layout:

[x1, x2, x3]
[y1, y2, y3]
[z1, z2, z3]

This layout seems inconsistent with being "internally transposed".

But then get_axis and get_scale retrieves items assuming this internal layout

[x1, y1, z1]
[x2, y2, z1]
[x3, y3, z3]

This layout seems consistent with being "internally transposed".

At this point I would love some feedback because I don't want to make sweeping/breaking changes to the c++ side until I know I'm right / people want the changes

Its super late here and I am very "mathed out". So I'm going to call it a night for now. Feel free to tell me where I messed up while I'm sleeping :smile:

tagcup commented 6 years ago

They assume the same layout, which is the typical text-book layout

M11 M12 M13 M21 M22 M23 M31 M32 M33

And elements[i+1][j+1] refers to Mij.

get_axis and get_scale refer to the columns of basis matrix, not rows, because columns are the basis vectors of the transformation, not rows (you can check that M.(1,0,0) gives you the first column, not row)

If your notation xi yi zi refers to those basis vectors, then the top one is correct in your post.

tagcup commented 6 years ago

Note that some people equivalent write the basis matrix as

Mxx Mxy Mxz Myx Myy Myz Mzx Mzy Mzz

The x y z here don't refer to basis axes they're just labels replacing 1 2 3

tagcup commented 6 years ago

Another note, on paper, matrices are written "row-major": the first index of Mij refers to the ith row. Basis implementation is also like that: elements[i][j] refers to the ith row so it's more natural in that regard. In math code, it is the 2D transform is which is transposed and thus always awkward to work with.

tagcup commented 6 years ago

So I'd recommend against converting Basis to "untransposed" because that would be transposed and confusing.

The only upside of storing the Basis in a transposed form (what you're suggesting to do) is get_axis implementation will be simpler, but that's like burning down an entire forest to remove a single dead tree trunk that bothers you. You'd be introducing a very confusing convention to the class and all of its methods that strays form math literature for a minor convenience of simplifying the implementation of a single function.

If anything, I'd recommend actually untransposing Transform2D.

tagcup commented 6 years ago

Anyhow the current C++ code is fine, self consistent and also consistent with "on paper" math. @reduz also mentioned it's overall faster that way.

cart commented 6 years ago

Thanks for the rundown! That (1) makes sense (2) makes me feel much better about the code (3) explains why everything works as expected in its current state :smile:

I figured I was missing something. The one thing I'm a bit unclear on is reduz's comment from #14553

Bound scripting langs should not imitate the C++ API, which is made the current way for faster vector-matrix multiplication.

GDScript already exposes the vectors as transposed.

I (now) assume he is referring to x, y, z being the rows (in C++) instead of columns (in GDScript / C#)?

cart commented 6 years ago

If the internal layout is currently faster and in line with math literature, why imply that it is somehow wrong and that the alternative would be slower?

tagcup commented 6 years ago

Nope, those vectors in GDScript are the columns.

If you're referring to what reduz is saying, he's probably referring with respect to a common implementation or the 2D code. In general though, a few possible explanations 1) programmers aren't typically mathematicians 2) most just copy-paste math code without thinking about it much 3) some people just store matrices in column major format because despite being awkward it happens to be faster in their use case. I'm sure there're other reasons.

hpvb commented 6 years ago

I think this has been fixed in 3.0 and master for a while now. Can someone please verify that this can be closed now?

mysticfall commented 6 years ago

Yes, it was fixed with the relevant commit. I was wondering why it is still open myself :)

alexwbc commented 5 years ago

This problem is still present to me using Godot v3.1 beta3. Found while was working on a sample project: https://gitlab.com/alexwbc/tpsharp (press [enter] to switch the code between GDScript and C#)