Closed SamTheBay closed 3 months ago
I had this exact same issue on 1.7.1, 1.8.1, 1.9.1, and 1.10.0. I just removed the entire networkshow/networkhide stuff from my codebase, however I would like to add it back eventually
@SamTheBay Yeah that does seem like a bug indeed...and we should have some form of protection against that. In order to better understand how you ended up with this bug, could you explain further why you would want to do this?
testNetItem.NetworkShow(testPlayer.OwnerClientId);
testNetItem.NetworkObject.Despawn(true);
My confusion with that section of code is:
Looking at the code base I think I can see how this could cause problems, but I am more interested in why you would want to do this/what is it you are trying to accomplish performing these specific actions in that specific order?
Also, could you provide the stack trace to the exception?
My first pass thought is something like:
internal void HandleNetworkObjectShow()
{
// Handle NetworkObjects to show
foreach (var client in ObjectsToShowToClient)
{
ulong clientId = client.Key;
foreach (var networkObject in client.Value)
{
if (networkObject != null && networkObject.IsSpawned)
{
try
{
SendSpawnCallForObject(clientId, networkObject);
}
catch
{ }
}
}
}
ObjectsToShowToClient.Clear();
}
Which would:
NetworkObject
was nullNetworkObject
was no longer spawnedOnSynchronize
or the like), it will (at a minimum) continue to process the list and eventually clear out the show object list to prevent the "forever showing" issue by catching any exception that could occur when creating the CreateObjectMessage
.This I could put into place pretty quickly... the test to validate will take way longer than it did to put some protection around that portion of the code. 😸
Thanks for the follow up! In game I am not doing exactly what is shown, that is just code to create a simple repro. The game is an MMO style architecture, so the server is hosting many "levels" and the players can move between them freely. I use hide and show to only make the level the player is currently in visible to their client. The levels also have timers on them at which point they expire and are destroyed. So, there is a race condition where a player could choose to join a level right at the moment the timer expires, and the server decides to destroy it. The server logic will eventually detect this and send the player back to "town" but by that time it is too late because it will enter this death loop. I have added special protection against this by keeping track of when the last Show call was and then hiding the object on despawn which looks to be working as a mitigation (but is wasteful with regards to performance).
Yes, I agree your solution looks good. Here is the exception callstack...
NullReferenceException
at (wrapper managed-to-native) UnityEngine.Component.get_transform(UnityEngine.Component)
at Unity.Netcode.NetworkObject.GetMessageSceneObject (System.UInt64 targetClientId) [0x00082] in .\Library\PackageCache\com.unity.netcode.gameobjects@1.8.1\Runtime\Core\NetworkObject.cs:1765
at Unity.Netcode.NetworkSpawnManager.SendSpawnCallForObject (System.UInt64 clientId, Unity.Netcode.NetworkObject networkObject) [0x0000d] in .\Library\PackageCache\com.unity.netcode.gameobjects@1.8.1\Runtime\Spawning\NetworkSpawnManager.cs:764
at Unity.Netcode.NetworkSpawnManager.HandleNetworkObjectShow () [0x0003b] in .\Library\PackageCache\com.unity.netcode.gameobjects@1.8.1\Runtime\Spawning\NetworkSpawnManager.cs:1124
at Unity.Netcode.NetworkBehaviourUpdater.NetworkBehaviourUpdater_Tick () [0x00008] in .\Library\PackageCache\com.unity.netcode.gameobjects@1.8.1\Runtime\Core\NetworkBehaviourUpdater.cs:128
at (wrapper delegate-invoke)
(Filename: ./Library/PackageCache/com.unity.netcode.gameobjects@1.8.1/Runtime/Core/NetworkObject.cs Line: 1765)
I use hide and show to only make the level the player is currently in visible to their client. The levels also have timers on them at which point they expire and are destroyed. So, there is a race condition where a player could choose to join a level right at the moment the timer expires, and the server decides to destroy it. The server logic will eventually detect this and send the player back to "town" but by that time it is too late because it will enter this death loop.
Ahh... I think I understand the scenario better. Based on the full call stack (thank you for that), it does look like the first pass fix should do the trick. I will get a PR up for this by tomorrow.
👍
Description
If you call NetworkShow() on a NetworkObject and then Despawn(true) before the network tick runs it gets the server into a livelock state that requires it to be restarted. This is because the NetworkObject is added to a queue in the SpawnManager which will be executed on the next network tick. However, the NetworkObject is destroyed by the time the next network tick occurs. When the SpawnManager tries to show the NetworkObject it attempts to reads the transforms parent, which throws and exception since the NetworkObject has been destroyed already. The SpawnManager will repeat attempting to show the destroyed NetworkObject on every network tick from then on since it is never removed from the queue, thus livelocking the server.
Reproduce Steps
Code like this will reproduce it...
Note: You must spread out the NetworkHide from the NetworkShow by at least one network tick. Otherwise, it will not repro because there is logic to protect against trying to hide and show in the same frame.
Actual Outcome
Server gets into a stuck state in which no new objects are able to spawn for clients.
Expected Outcome
Server should discard the destroyed item from its spawn queue since it is no longer a valid action.
Environment
Additional Context
I seem to be able to work around the issue by creating my own class SafeNetworkBehaviour and keeping track of the last time a NetworkShow() call was made. Then, in the OnNetworkDespawn() method I check if NetworkShow was called recently. If it was, I go through and call NetworkHide on the recently shown clientIds. This will remove it from the SpawnManager queues before the NetworkObject is destroyed. However, this seems super hacky and it would be better if the networking code handled this by default.