Unity-Technologies / com.unity.netcode.gameobjects

Netcode for GameObjects is a high-level netcode SDK that provides networking capabilities to GameObject/MonoBehaviour workflows within Unity and sits on top of underlying transport layer.
MIT License
2.09k stars 431 forks source link

Network Transform synchronized wrong position when WorldPositionStays is false #2888

Closed AsmPrgmC3 closed 1 month ago

AsmPrgmC3 commented 2 months ago

Description

When a client joins a server the initial synchronized positions of Network Transforms are wrong, if

The problematic code path I was able to determine:

  1. The Network Object sets m_CachedWorldPositionStays to false because it has a parent without Network Object: https://github.com/Unity-Technologies/com.unity.netcode.gameobjects/blob/d97d968a47a9e979ac57ac8b9320f63f38cf7d78/com.unity.netcode.gameobjects/Runtime/Core/NetworkObject.cs#L1266-L1278
  2. When joining the server side Network Transform synchronizes the local position based on that (but doesn't set networkState.InLocalSpace): https://github.com/Unity-Technologies/com.unity.netcode.gameobjects/blob/d97d968a47a9e979ac57ac8b9320f63f38cf7d78/com.unity.netcode.gameobjects/Components/NetworkTransform.cs#L1719-L1729
  3. The client Network Transform applies the transmitted position based on newState.InLocalSpace (which is false): https://github.com/Unity-Technologies/com.unity.netcode.gameobjects/blob/d97d968a47a9e979ac57ac8b9320f63f38cf7d78/com.unity.netcode.gameobjects/Components/NetworkTransform.cs#L2314-L2321

Reproduce Steps

  1. Create an empty called "parent"
  2. Set the local position of "parent" to [1, 1, 1]
  3. Add a cube as a child of "parent"
  4. Add Network Object and Network Transform Scripts to the cube
  5. Start the server and have a client join it (joining a host also works, but the issue doesn't happen on the host itself)

Actual Outcome

The cube on the client spawns at the origin (local position [-1, -1, -1])

Expected Outcome

The cube on the client spawns at the server position of [1, 1, 1] (local position [0, 0, 0])

Environment

aidanhorton commented 1 month ago

Can confirm that this is the issue me and another user described at the bottom of this issue thread too: https://github.com/Unity-Technologies/com.unity.netcode.gameobjects/issues/2826

This is stopping us from upgrading to NGO 1.8.0+

AsmPrgmC3 commented 1 month ago

@aidanhorton Our workaround was to enable "In Local Space" in the Network Transforms as the parents never change or move.

Another option might be to fork the package as a fix for this should be to just remove the second code snippet. Though I don't know whether other bugs might be lurking.

NoelStephensUnity commented 1 month ago

@aidanhorton @AsmPrgmC3 So when parenting in-scene placed NetworkObjects under a GameObjects there is an underlying issue that needed to be considered since the parent GameObject's transform is not synchronized if you plan on using a NetworkTransform.

Note: Scroll down to very bottom to get the short answer to solving this issue but if you are interested in the why continue reading

The first scenario is just a simple parent (GameObject) that has a child (NetworkObject) and there is no NetworkTransform component. With the default settings applied the in-scene placed parenting rules are followed as expected and the child is treated as if it was in local space: image

When you add a NetworkTransform component into the mix you are now implying "something else is handling transform synchronization" which the rules of the NetworkTransform component take over. We are aware that this is one of the higher "user pain" areas and have plans to be better able to handle detecting a NetworkTransform component at the runtime SDK level in NGO v2.0.0, but this requires that we migrate the components assembly into the runtime which, due to versioning policies, we can't do within v1.x.x. With v2.0.0 since NetworkTransform will be a "known" class/component to the runtime, we will be able to make all of this more "automated" and assure that NetworkObject transform synchronization doesn't impact NetworkTransform transform synchronization.

Assuming you are going to stick with v1.x.x for the time being, I can help you navigate handling this scenario where you have an in-scene placed child NetworkObject with NetworkTransform component parented under a GameObject and you want to keep your NetworkTransform in world space.

Take this scenario into consideration: image

Above there are two parents (both spheres) that both have two children and both parents shows that one of their children looks visually accurate while 1 child does not.

The first Parent (at 1,1,1) has two NetworkObject children (with NetworkTransforms) where both children have auto object parent sync enabled (the default). One child's NetworkTransform is in world space (pink) and the other is in local space (purple): image The pink child's NetworkTransform is set to synchronize its position in world space, which if its position is 0,0,0 then it would position itself in the world space position coordinates of (0,0,0). We can see that it is doing exactly that (while remembering the NetworkTransform will synchronize either the world space or local space position). The purple child's NetworkTransform is set to synchronize its position in local space, which if its position is 0,0,0 then, relative to its parent, it should be the exact same world space position as its parent (1,1,1). We can see that it is doing exactly that (while remembering the NetworkTransform will synchronize either the world space or local space position).

Unfortunately, this requires that you place the NetworkTransform in "local space" mode which there might be scenarios where you want to handle all of your motion/state updates in world space (especially if the parent GameObject has some kind of offset).

The second parent (at -1, 1, -1) has two children (with NetworkTransforms) where auto object parent sync is disabled: image The light blue child's NetworkTransform is set to synchronize its position in world space, but since the NetworkObject's parent synchronization is disabled the values synchronized are the actual world space values of the child (which if it is local space position was 0,0,0 and is parented under a GameObject at world space -1,1,-1 then the NetworkTransform will see its world space position as -1, 1, -1 which would be the same as its parent). We can see that it is doing exactly that.

The green child's NetworkTransform is set to synchronize its position in local space, but since the NetworkObject's parent synchronization is disabled the "local space" values synchronized end up being the parent's world space values that get synchronized by the NetworkTransform which makes the NetworkTransform's synchronized local space values translate to "world space" (I know this sounds funky). So, when it is being synchronized the NetworkObject applies the world space values of -1,1,-1... which when applied to the transform will offset it from the parent by -1,1,-1 which would make its world space value -2,2,-2. We can see that it is doing exactly that. (this combination is not recommended unless the parent is at world space 0,0,0).

The auto object parent sync setting was to handle several scenarios where one happens to be the special case where you are parenting an in-scene placed NetworkObject (synchronized transform) under something that will not be synchronizing its transform while still following the traditional transform translations when parenting GameObjects and also including a NetworkTransform into the mix. (which the complexity of this is greatly increased due to he separation of the components and runtime assemblies...which I promise will be resolved by v2.0.0).

So, without knowing what your intended usage/context is for parent things under a GameObject (with no NetworkObject) I will just make the assumption that you don't plan on moving, rotating, or scaling the parent GameObject. If so, I would recommend disabling the auto parent sync on the child NetworkObject to get a "local space" initial synchronization while still keeping the child's NetworkTransform's transform state updates "world space" relative.

This should make both the host/server side and client side handle the initial synchronization properly: image

This only requires you to turn off auto object parent sync on the child NetworkObject: image

Warning If you plan on changing the parent of the child NetworkObject at a later time it might just be easier to make the parent have nothing but a NetworkObject on it and let the parent-child transform synchronization get handled automatically... otherwise you will need to re-enable the auto object parent sync property on all instances/clones (i.e. clients and server) prior to reparenting the child NetworkObject(s) under another NetworkObject.

Short Answer Disable the child's Auto Object Parent Sync property.

Let me know if this helps and resolves your issue?

AsmPrgmC3 commented 1 month ago

Disabling Auto Object Parent Sync did indeed fix the issue, thanks!

Though while I don't know what the intended behaviour is, I still think the current behaviour is wrong. First of all it does sound like a hard problem, as the Auto Object Parent Sync and In Local Space settings seem to conflict with each other. As I understand it (and please correct me if I'm wrong) Auto Object Parent Sync and In Local Space both imply local coordinate syncing while disabling those settings implies world coordinate syncing. So it's not clear what should happen if one is on and the other off. My intuition is that the Network Transform should override the Network Object as it is (in my mind) a more elaborate/complex position sync you only add/use when the "default" one provided by the Network Object isn't enough. That is only my intuition though and I don't think there is an objectively correct answer. And depending on the use case there are probably always going to be surprises.

That being said, the current behaviour of one half of the Network Transform respecting the Auto Object Parent Sync setting (the sending side) while the other (the receiving side) doesn't just doesn't seem useful or obvious to me. There should at least be some kind of warning that having both settings differ probably doesn't work as expected (either at runtime or in the inspector).[^1]

Another thing of note is that this only applies to the initial synchronization. If the Network Transform is moved afterwards, the client side cube "teleports" (or interpolates, if enabled) to the same position it is on the server (i.e. uses local/world coordinates on both sides according to the In Local Space setting, ignoring Auto Object Parent Sync / WorldPositionStays). So even if the local/world coordinate mismatch is intended it breaks as soon as the Network Transform is moved (which is kind of the point of the Network Transform).

[^1]: On a side note, while do understand why the defaults are what they are, it seems unfortunate that the default for Auto Object Parent Sync is on, while the default for In Local Space is off. That means that only having a Network Object syncs the same position on server and client side while doing nothing but adding a Network Transform causes the positions to differ until the settings are manually adjusted. But that's a separate topic and not directly related to the current issue.

NoelStephensUnity commented 1 month ago

@AsmPrgmC3 That is good to hear it resolved your issue.

Regarding the "odd" behavior when you add a NetworkTransform into the mix, this all revolves around some technical debt we have that involves the initial layout of where NGO provided components (i.e. NetworkTransform, NetworkRigidbody, etc) were all placed within the Components assembly and that assembly requires a dependency to the NGO Runtime assembly. As such, anything within the Runtime assembly had no visibility to things within the Components assembly... which to try to do otherwise would create a circular dependency reference issue.

So, setting aside the quirky settings... and just thinking about a NetworkObject by itself (with no other components)...any settings that are applied to NetworkObject are only relative to the Runtime assembly scope and no real "easy" way to make a NetworkObject aware of NetworkTransform without adding specialized internal methods to do some registration process (or the like). This all evolved over time with many different updates/adjustments for edge case scenarios that has, admittedly, led to some over-complicated settings.

The best solution to resolve this issue (and many other all focused around transform synchronization, parenting, and such) will be to just migrate all of the components into the Runtime so we can make the proper adjustments in NetworkObject serialization that will adjust what it serializes depending upon whether there is a NetworkTransform component attached or not and how it handles parenting.

So, when a NetworkTransform component is detected on the NetworkObject the entire transform state synchronization will be handled by the NetworkTransform and not the NetworkObject. This adjustment will result in some NetworkObject properties being removed and will also include some other optimizations we are targeting for v2.0.0 (i.e. batching NetworkTransform updates as opposed to having each one update independently). The decision to do this in v2.0.0 is that this overhaul will include some minor breaking changes (i.e. NetworkTransform will no longer implement the Monobehaviour.Update method).

Either case, apologies for this "messy" area within the NGO SDK. It is strictly limited to the scenario where:

This is the "edge case" that didn't have any real "clear cut" solutions to handling this more gracefully without migrating components to runtime...which that requires a major version increment...and today we have a v2.0,0-exp.2 package that provides many opportunities to "clean things up".

To think of this from a different perspective, if you parented a NetworkObject under another NetworkObject you more likely would want to switch the NetworkObject to local space but have the option to keep it in world space. How the parenting is applied (i.e. world vs local) then is completely determined by WorldPositionStays. With this kind of parenting (more common) everything should logically "work as expected". However, this is only possible because you are parenting two NetworkObjects that will both be synchronized (which provides the opportunity to know exactly what needs to be applied because we have enough from both the parent and the child to know what the intended transform spaces are for both. With a NetworkObject parented under a GameObject we only know half of the "equation" (so to speak... your particular intended use is straight forward to you, but if you start thinking about what other combinations other users might want to use...then it becomes a different story).

AsmPrgmC3 commented 1 month ago

@NoelStephensUnity Thanks for the reply, that pretty much all makes sense. The only thing I still fail to understand is why the second code snippet exists: https://github.com/Unity-Technologies/com.unity.netcode.gameobjects/blob/d97d968a47a9e979ac57ac8b9320f63f38cf7d78/com.unity.netcode.gameobjects/Components/NetworkTransform.cs#L1719-L1729

As far as I can see the only thing it does is cause this "odd" behaviour and that the initial synchronization is different from the "regular" (delta) updates/synchronization when moving the transform. Am I missing something? Why is it not removed? What "good" does it do / fix?

NoelStephensUnity commented 1 month ago

As far as I can see the only thing it does is cause this "odd" behaviour and that the initial synchronization is different from the "regular" (delta) updates/synchronization when moving the transform. Am I missing something? Why is it not removed? What "good" does it do / fix?

Thank you for being curious and persisting on this issue! 🥇

You made me do a double take on that area and I am thinking this is indeed a bug...looking at it one more time more closely and just doing a sanity check... I started thinking that most likely it should look like this:

                if (isSynchronization && networkState.IsParented)
                {
                    var parentedUnderGameObject = NetworkObject.transform.parent != null && !parentNetworkObject && NetworkObject.IsSceneObject.Value;                    
                    if (NetworkObject.WorldPositionStays() && (!parentedUnderGameObject || (parentedUnderGameObject && !NetworkObject.AutoObjectParentSync && !InLocalSpace)))
                    {
                        position = transformToUse.position;
                        networkState.InLocalSpace = false;
                    }
                    else
                    {
                        position = transformToUse.localPosition;
                        networkState.InLocalSpace = true;
                    }
                }

This yields this result when running the same quick issue replication project I setup: image

This logical flow seems to make more sense and maintains the same kind of logic used within NetworkObject.ApplyNetworkParenting on the non-authority side of things when applying network parenting.

I am going to be running tests against the above adjustment to make sure it doesn't introduce any other odd behavior and if all looks good will be posting a PR to resolve this issue.

Thank you for your patience and for using NGO! 😸

AsmPrgmC3 commented 1 month ago

Thanks for your work and time!

That logic does look better[^1] but I'm still confused about the why: Why have special logic for the isSynchronization case? What does that do / fix / ensure correctness of?

The way I see/understand it there are two possibilities[^2]:

  1. The position calculated by the Network Transform is always correct. In that case the extra logic is unnecessary, the position is going to be correct anyways.
  2. The position calculated by the Network Transform is incorrect. In that case you will have the correct position on synchronization but wrong ones after moving/teleporting the transform. Since the whole point of Network Transforms is that they're gonna move it just delays the application of the wrong position and makes it harder to notice.

So I still don't understand the purpose of the code / am not convinced of it's necessity.

[^1]: Though I still fear it's (going to be) different from other relevant logic and/or not work in more complex scenarios like re-parenting [^2]: Granted, I'm not very familiar with the synchronization process or how the Network Game Object and Network Transform interact with each other, so that might be wrong

NoelStephensUnity commented 1 month ago

That logic does look better but I'm still confused about the why: Why have special logic for the isSynchronization case? What does that do / fix / ensure correctness of?

Because NetworkTransform can be configured to:

Then (as you mentioned) take into consideration 12 NetworkObjects all in a complex nested hierarchy where each one is configured (potentially) differently and some may or may not have NetworkTransforms...which it would take too much space to write up all of the potential combinations of NetworkTransform configurations vs NetworkObject configurations vs how the complex nested hierarchy is laid out... but the bottom line here is that doing a straight 1:1 direct serialization relative to the parenting (i.e. WorldPositionStays = true or false) while then also taking into consideration the parenting under a non-serialized GameObject... it ends up being much less problematic to handle the very first synchronization (for parented NetworkObjects) this way than to try and calculate the proper values.

Take scale for example. If scale changes on the parent, then the local space position of a child can change when parenting. Scenario: On the authority side you have a parented object where the parent has a scale other than 1, 1, 1: image The parent is at position -2, 2, -2 and the child is at position 1, 2, 5.

Now, you need to create these instances on a client so they would need to be instantiated first and also taking the scenario into consideration where there can be "out of order" serialization (and other scenarios similar to this) where you would need to apply transform values to the child before it is parented.

If you remove the child from the parent (in the editor is a good way to play around with all of this) you will see that in order to keep the same relative world space position the child's scale and position values are impacted by the parent: image

However, those values are not reflective of the values when parented (refer to the 1st image) and as such you need to instantiate and apply transform values in a way that will result in the "same values" as the 1st screenshot but they might be applied prior to being parented...so if you did a direct 1:1 and made them look like this on the client side: image Where if you compare the values between the 1st and the above (3rd) screenshot the values are 1:1. However, if you parent them directly (i.e. world position stays) then you would get this result: image Which does not reflect the values of the authority side (1st screenshot).

So, this is a small "pseudo" example of a more complex series of combinations where it can get really tricky... a lot of these "seemingly quirky or weird" property values being checked help to determine what transform values should be set and a lot of this was due to scenarios where scale could change... and rotation... which rotation can introduce similar "skewed" results depending upon whether you are sending local rotation or world rotation values... and if rotation is not properly set and you have a local space offset from the parent,...the the local space position will be rotated to an "improper" position even if you set the correct local space values...so there is an order of operations that has to happen in regards to what parts of the transform should be updated with incoming synchronization values too. Of course, then add a series of nested objects into this and when/how things are parented becomes a requirement.

The issue exposed in this particular github issue where we are parenting a NetworkObject under a GameObject becomes a little trickier because you only really have 1 part of the equation when piecing things back together on the client side...which obviously is why you can't move a parent in-scene placed GameObject (with no NetworkObject) around on the authority side if you have NetworkObjects nested under it (unless you write specialized code to synchronize those changes, but then you might as well just drop a NetworkObject on it at that point).

I will let the fix run through the tests and then have plans to add some more to better test this scenario (even though we run 14k tests per platform we obviously have some gaps that need to be filled).

AsmPrgmC3 commented 1 month ago

That makes sense. I still don't fully understand the logic fully, but thankfully that's not my job :) It does all seem very complex so thanks for your work and all the time you took to explain things!

One last comment[^1] though: Aren't you setting networkState.InLocalSpace too early? It seems to work in your screenshots but it looks weird to me, as it seems to be reset to the InLocalSpace of the Network Transform in the very next if.

[^1]: hopefully. I really hope I'm not just annoying you with all my questions; please ignore me if I am.

NoelStephensUnity commented 1 month ago

@AsmPrgmC3 😆 Yeah, I have a lot of plates spinning and would have most likely run into that issue when adding tests. Good catch! 🏆

Migrated the InLocalSpace check above the parenting block. 👍

aidanhorton commented 1 month ago

Cheers for this thread and information guys, super informative! I was going to say - my network transforms initial sync means their position is incorrect, but after any update to position, their synchronization is correct - so I was still a little confused! But I can apply this code on my end now since I use a local version of NGO - cheers for the hard work guys, answers all my questions!

NoelStephensUnity commented 1 month ago

@aidanhorton Definitely make sure you get the changes contained in the PR as there were 2 places I ended up adjusting.