Rynchodon / ARMS

ARMS mod for Space Engineers
Creative Commons Zero v1.0 Universal

Server crashes when certain players join #60

Closed zrisher closed 9 years ago

zrisher commented 9 years ago

Ever since this morning's update, whenever either of two specific players joins my server, it crashes as soon as they reach the second loading screen. I believe both of them may have, or be near, autopiloted ships, unlike the people who are able to join just fine. This appears in the main log:

2015-06-21 14:08:20.080 - Thread:   6 ->  World request received: Raideur Ng
2015-06-21 14:08:20.080 - Thread:   6 ->  ...responding
2015-06-21 14:08:26.242 - Thread:   6 ->  GC Memory: 311,380,472 B
2015-06-21 14:08:56.257 - Thread:   6 ->  GC Memory: 313,428,544 B
2015-06-21 14:09:17.323 - Thread:   6 ->  Exception occured: System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
   at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
   at Sandbox.Engine.Multiplayer.MyTransportLayer.SendMessage[TMessage](TMessage& msg, List`1 recipients, MyTransportMessageEnum messageType, Boolean includeSelf)
   at Sandbox.Game.Multiplayer.MySyncLayer.SendMessageToRecipients[TMsg](TMsg& msg, MyTransportMessageEnum messageType, Boolean includeSelf)
   at Sandbox.Game.Multiplayer.MySyncGlobal.SendSimulationInfo()
   at Sandbox.Engine.Multiplayer.MyMultiplayerBase.Tick()
   at Sandbox.Engine.Multiplayer.MyDedicatedServer.Tick()
   at Sandbox.Game.World.MySession.Update(MyTimeSpan updateTime)
   at Sandbox.MySandboxGame.Update()
   at Sandbox.Engine.Platform.Game.UpdateInternal()
   at Sandbox.Engine.Platform.FixedRenderLoop.<>c__DisplayClass2.<Run>b__1()
   at Sandbox.Engine.Platform.GenericRenderLoop.Run(VoidAction tickCallback)
   at Sandbox.Engine.Platform.FixedRenderLoop.Run(VoidAction tickCallback)
   at Sandbox.MySandboxGame.Run(Boolean customRenderLoop, Action disposeSplashScreen)
   at VRage.Dedicated.DedicatedServer.RunInternal()
   at VRage.Dedicated.DedicatedServer.RunMain(String instanceName, String customPath, Boolean isService, Boolean showConsole)
   at VRage.Dedicated.WindowsService.MainThreadStart(Object obj)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart(Object obj)

If I disable Autopilot, they're able to join without issue.

Sorry for not grabbing the autopilot log, I thought this was an issue with another mod until I realized that. If this isn't an obvious fix I can have those players jump on a test world later tonight or tomorrow to suss this out.
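For readers unfamiliar with the "Collection was modified" error in the trace above: it is raised when a collection changes while something is still enumerating it. The sketch below is a Python analogue, not the game's .NET code, and every name in it (`broadcast_unsafe`, `broadcast_snapshot`, the recipient strings) is hypothetical. It only illustrates the general class of error and one common single-threaded workaround (iterating over a snapshot). It does not claim to show what the mod or the game actually does, and a snapshot alone would not be enough if the modification comes from another thread without locking.

```python
def broadcast_unsafe(recipients, on_send):
    """Iterate the live set. If on_send changes the set's size,
    Python raises RuntimeError on the next iteration step, which is
    roughly analogous to .NET's InvalidOperationException."""
    for recipient in recipients:
        on_send(recipient)


def broadcast_snapshot(recipients, on_send):
    """Iterate over a copy, so changes made to the live set during
    the loop cannot invalidate the iteration."""
    for recipient in list(recipients):
        on_send(recipient)


if __name__ == "__main__":
    # Hypothetical scenario: a player joins while a message is being sent.
    recipients = {"player_a", "player_b"}

    def send_and_join(recipient):
        recipients.add("player_c")  # modifies the set mid-iteration

    try:
        broadcast_unsafe(recipients, send_and_join)
    except RuntimeError as error:
        print("unsafe broadcast failed:", error)

    recipients = {"player_a", "player_b"}
    broadcast_snapshot(recipients, send_and_join)
    print("snapshot broadcast finished with", len(recipients), "recipients")
```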

Rynchodon commented 9 years ago

Unfortunately, when the Exception is thrown in the game thread, it doesn't provide much information. My first thought is to disable Weapon Control; if that works, it will at least give me an idea of where the issue is.

The regular Autopilot log probably wouldn't help, unless this issue is being caused by an Exception somewhere else. The dev version may or may not log enough to shed some light on this.

zrisher commented 9 years ago

So I ran the server with Autopilot and Weapon Control (no radar) enabled for a couple of hours today. It crashed once (I failed to get the logs in time, grrrr), and there was also an hour-long period where it didn't crash but server performance was very poor (13:30 - 14:50; Autopilot was also on during the period before that):

[image: capture]

As you can see, memory use increased by about 10 MB per minute. Normally our server memory use stays between 350 and 700 MB, but you can see here it got up to 2.7 GB in a very short period of time. CPU also seemed to be abnormally taxed.

I WAS able to grab the logs for that period, and I see a number of errors coming out of WeaponControl thread updates and CubeGridCache: http://pastebin.com/yjtW2gNk

For your reference, here's the same length of time, directly afterwards, same amount of activity, without autopilot mod:

[image]
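A growth rate like the one described above can also be estimated from the `GC Memory` lines the server already writes to its main log, instead of being read off a graph. The following is a minimal sketch that assumes the line format shown in the log excerpt earlier in this thread. The function names are hypothetical, and the GC figure covers only the managed heap, so it will not necessarily match the process memory shown in the screenshots.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2015-06-21 14:08:26.242 - Thread:   6 ->  GC Memory: 311,380,472 B
GC_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) - Thread:\s+\d+ ->\s+"
    r"GC Memory: ([\d,]+) B"
)


def parse_gc_samples(lines):
    """Return (timestamp, bytes) pairs for every GC Memory line found."""
    samples = []
    for line in lines:
        match = GC_LINE.match(line.strip())
        if match:
            timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            size_bytes = int(match.group(2).replace(",", ""))
            samples.append((timestamp, size_bytes))
    return samples


def growth_mb_per_minute(samples):
    """Average growth between the first and last sample, in MB per minute.
    Returns None if there are fewer than two samples or no elapsed time."""
    if len(samples) < 2:
        return None
    (start_time, start_bytes), (end_time, end_bytes) = samples[0], samples[-1]
    minutes = (end_time - start_time).total_seconds() / 60
    if minutes <= 0:
        return None
    return (end_bytes - start_bytes) / 1_000_000 / minutes
```

Feeding it the lines of a log covering the slow period (for example `growth_mb_per_minute(parse_gc_samples(open(path)))`, where `path` is wherever the server log lives) gives a single number that can be compared between runs with and without the mod.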

Rynchodon commented 9 years ago

> As you can see, memory use increased by about 10 MB per minute. Normally our server memory use stays between 350 and 700 MB, but you can see here it got up to 2.7 GB in a very short period of time.

Obviously there is a memory leak somewhere...

> CPU also seemed to be abnormally taxed.

The pathfinder is not efficient.

zrisher commented 9 years ago

Yes, but CPU use seemed to climb over time just like memory. I'm saying there may be a single problem responsible for the memory leak, the CPU "leak", and the error messages.