konsolas / AAC-Issues

AAC Issue Tracker.

Server crashes NoSuchElementException in UnsafeList$Itr.next #551

Closed MedicOP closed 7 years ago

MedicOP commented 7 years ago

I've started experiencing sporadic crashes on many servers after installing AAC. Some of these servers have not been touched in years besides installing AAC. Many of these servers have never crashed before, which leads me to strongly believe AAC is the cause of them.

There are two different crashes, but they are both caused by the same NoSuchElementException in UnsafeList.java, while iterating through Spigot's entityslice list.

Is AAC somehow messing with entities / entityslices asynchronously or modifying the list in the same tick?

---- Minecraft Crash Report ----
// I blame Dinnerbone.

Time: 4/2/17 12:47 AM
Description: Exception ticking world entities

java.util.NoSuchElementException
    at org.bukkit.craftbukkit.v1_8_R3.util.UnsafeList$Itr.next(UnsafeList.java:248)
    at org.spigotmc.ActivationRange.activateChunkEntities(ActivationRange.java:148)
    at org.spigotmc.ActivationRange.activateEntities(ActivationRange.java:131)
    at net.minecraft.server.v1_8_R3.World.tickEntities(World.java:1400)
    at net.minecraft.server.v1_8_R3.WorldServer.tickEntities(WorldServer.java:597)
    at net.minecraft.server.v1_8_R3.MinecraftServer.B(MinecraftServer.java:786)
    at net.minecraft.server.v1_8_R3.DedicatedServer.B(DedicatedServer.java:374)
    at net.minecraft.server.v1_8_R3.MinecraftServer.A(MinecraftServer.java:654)
    at net.minecraft.server.v1_8_R3.MinecraftServer.run(MinecraftServer.java:557)
    at java.lang.Thread.run(Thread.java:745)

---- Minecraft Crash Report ----
// I just don't know what went wrong :(

Time: 4/3/17 8:28 PM
Description: Ticking block entity

java.util.NoSuchElementException
    at org.bukkit.craftbukkit.v1_8_R3.util.UnsafeList$Itr.next(UnsafeList.java:248)
    at net.minecraft.server.v1_8_R3.Chunk.a(Chunk.java:930)
    at net.minecraft.server.v1_8_R3.World.a(World.java:2569)
    at net.minecraft.server.v1_8_R3.TileEntityHopper.a(TileEntityHopper.java:543)
    at net.minecraft.server.v1_8_R3.TileEntityHopper.a(TileEntityHopper.java:373)
    at net.minecraft.server.v1_8_R3.TileEntityHopper.m(TileEntityHopper.java:193)
    at net.minecraft.server.v1_8_R3.TileEntityHopper.c(TileEntityHopper.java:177)
    at net.minecraft.server.v1_8_R3.World.tickEntities(World.java:1488)
    at net.minecraft.server.v1_8_R3.WorldServer.tickEntities(WorldServer.java:597)
    at net.minecraft.server.v1_8_R3.MinecraftServer.B(MinecraftServer.java:786)
    at net.minecraft.server.v1_8_R3.DedicatedServer.B(DedicatedServer.java:374)
    at net.minecraft.server.v1_8_R3.MinecraftServer.A(MinecraftServer.java:654)
    at net.minecraft.server.v1_8_R3.MinecraftServer.run(MinecraftServer.java:557)
    at java.lang.Thread.run(Thread.java:745)

AAC version: Happens in both 3.0.x and 3.1.x.
Server version: Happens in both Spigot-1.8.8 and PaperSpigot-1.8.8.
ProtocolLib: 4.2.0
ViaVersion: 1.0.3

konsolas commented 7 years ago

An entity is spawned and then despawned in the same tick, but only on server startup - I'm not sure if this would cause a problem.

MedicOP commented 7 years ago

Could you give me some more details about this entity? Is it spawned at a specific location?

Because it could be possible that the crash happens when a player enters the specific chunk that this entity was located in.

Janmm14 commented 7 years ago

As far as I know, it's spawned at 0 0 0 to prevent a player's entity ID from being 0.

MedicOP commented 7 years ago

Okay, guess that's not the problem then.

Does AAC access Chunk.entitySlices anywhere in the code, or use any NMS methods that modify this list?

Normally this would throw a ConcurrentModificationException, but UnsafeList does not appear to support that (AbstractList.modCount is never incremented anywhere in the class).

This NoSuchElementException should therefore be treated as a ConcurrentModificationException.
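
To illustrate the point about the missing modCount check, here is a minimal sketch. The `UncheckedList` class is a hypothetical, simplified stand-in written for this example (it is not CraftBukkit's UnsafeList); it only assumes an iterator that compares its index against the live size and never checks a modification counter. The same shrink-during-iteration is run against `java.util.ArrayList` for comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class UncheckedListDemo {

    // Hypothetical stand-in: a list whose iterator never detects modification.
    static final class UncheckedList<E> implements Iterable<E> {
        private Object[] data = new Object[8];
        private int size;

        void add(E e) {
            if (size == data.length) data = Arrays.copyOf(data, size * 2);
            data[size++] = e;
        }

        // Removes the last element. No modification counter is updated.
        void removeLast() {
            if (size == 0) throw new IllegalStateException("list is empty");
            data[--size] = null;
        }

        @Override
        public Iterator<E> iterator() {
            return new Iterator<E>() {
                private int index;

                @Override
                public boolean hasNext() {
                    // Compared against the live size: a shrink below `index` still reports true.
                    return index != size;
                }

                @Override
                @SuppressWarnings("unchecked")
                public E next() {
                    if (index >= size) throw new NoSuchElementException();
                    return (E) data[index++];
                }
            };
        }
    }

    // Shrinks the unchecked list after its last element has been read.
    // Returns the simple name of the exception thrown, or "none".
    static String shrinkUncheckedList() {
        UncheckedList<String> list = new UncheckedList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        try {
            for (String s : list) {
                if (s.equals("c")) list.removeLast();
            }
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Same access pattern against ArrayList, which does track modCount.
    static String shrinkArrayList() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("c")) list.remove(list.size() - 1);
            }
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("unchecked list: " + shrinkUncheckedList());
        System.out.println("ArrayList:      " + shrinkArrayList());
    }
}
```

With the unchecked iterator the failure surfaces as a NoSuchElementException, while ArrayList reports the same mistake as a ConcurrentModificationException. In a real server the modification would come from another thread, so the timing would not be deterministic like it is here.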

konsolas commented 7 years ago

There is no reference to "Chunk" anywhere in code.

konsolas commented 7 years ago

Entity ent = world.spawnEntity(new Location(world, 0, 256, 0), entityType);
WrappedDataWatcher dataWatcher = WrappedDataWatcher.getEntityWatcher(ent).deepClone();
dataWatcher.setObject(0, (byte) ((byte) (dataWatcher.getObject(0)) | 0x20), false);
DATA_WATCHER_MAP.put(entityType, dataWatcher.deepClone());
ent.remove();

This is called onEnable for several different EntityTypes.

MedicOP commented 7 years ago

It's most likely not the entity spawning thing. I suspect it's caused by concurrent modification of the entityslices list.

Does AAC teleport players in any shape or form asynchronously?

This one seems more likely: Does AAC kick/disconnect players asynchronously in any way?

konsolas commented 7 years ago

I sent a debug build to @CraftRise with the killaura check removed (along with the entity spawning), and with async catchers on all direct interactions with NMS. Despite this, the crash still occurred.

I'll do a full search of my code for teleport() and kickPlayer().
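
For readers unfamiliar with the term, an "async catcher" is just a guard that fails fast when a main-thread-only operation is invoked from another thread. The sketch below is a plain-Java illustration of that idea under my own assumptions: `markMainThread`, `requireMainThread` and `removeEntity` are hypothetical names for this example, not AAC or Spigot code.

```java
import java.util.concurrent.atomic.AtomicReference;

public class AsyncCatcherSketch {
    // The thread that is allowed to touch world state (assumed to be registered at startup).
    private static volatile Thread mainThread;

    static void markMainThread() {
        mainThread = Thread.currentThread();
    }

    // Throws if the caller is not on the registered main thread.
    static void requireMainThread(String operation) {
        if (Thread.currentThread() != mainThread) {
            throw new IllegalStateException("Asynchronous " + operation + "!");
        }
    }

    // Hypothetical guarded operation standing in for a direct NMS interaction.
    static void removeEntity() {
        requireMainThread("entity remove");
        // ... the actual world modification would go here ...
    }

    // Calls the guarded operation from a worker thread and reports what happened.
    static String callFromWorkerThread() {
        AtomicReference<String> outcome = new AtomicReference<>("no exception");
        Thread worker = new Thread(() -> {
            try {
                removeEntity();
            } catch (IllegalStateException e) {
                outcome.set(e.getMessage());
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        markMainThread();
        removeEntity(); // fine: called on the main thread
        System.out.println(callFromWorkerThread());
    }
}
```

A guard like this only catches calls that pass through it. Code paths that reach the entity lists without a guard can still race, which would be consistent with the crash persisting in the debug build.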

konsolas commented 7 years ago

There are only two kickPlayer calls, and both are made from onCommand.

Teleportation is primarily done by setting the "to" location of PlayerMoveEvent. In the Fly check, vehicle flying is handled with player#teleport; this occurs sync, inside PlayerMoveEvent. The Velocity check also calls player#teleport, but schedules a sync task for this purpose. The admin GUI also uses player#teleport.

There are no other instances of teleportation.

MedicOP commented 7 years ago

kickPlayer has an async catcher anyway, but I still suspect PlayerList.disconnect() is being called somehow from an asynchronous thread.

Unlikely, but does AAC call Server.dispatchCommand / Player.performCommand / Player.chat asynchronously anywhere?

The only other thing I can think of is PlayerConnection.java:960, which seems to disconnect a player asynchronously if the chat packet contains an invalid character and does not start with "/".

konsolas commented 7 years ago

Bukkit.getServer().dispatchCommand(Bukkit.getConsoleSender(), "command from config file")

is called. I'll check whether it's async.

konsolas commented 7 years ago

It's only called via the scheduler, which is sync.

konsolas commented 7 years ago

Just found that the crash still occurs with only speed, fly, fastplace, fastbreak, fastuse, interact, phase enabled.

MedicOP commented 7 years ago

Have you found what triggers the crash, or if it happens without AAC installed?

Could this have anything to do with it?

The only other thing I can think of is PlayerConnection.java:960, which seems to disconnect a player asynchronously if the chat packet contains an invalid character and does not start with "/".

You could test it using a client mod or some kind of mobile chat app that does not block invalid characters.

MedicOP commented 7 years ago

This is the code used to check if a character is valid:

public static boolean isAllowedChatCharacter(char c0) {
    return c0 != 167 && c0 >= 32 && c0 != 127;
}

MedicOP commented 7 years ago

Although I don't see any records in the logs of players being kicked for "Illegal characters in chat", idk

MedicOP commented 7 years ago

Oh nvm, I had the search function in Notepad++ set to regex. I do see players being kicked for "Illegal characters in chat", so it is definitely possible.
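
For anyone who wants to try reproducing this, here is a small sketch showing which characters the check quoted above rejects. `isAllowedChatCharacter` is copied from that snippet; `firstIllegalIndex` is a hypothetical helper added only for this example.

```java
public class ChatCharacterCheck {

    // Copied from the snippet above: rejects the section sign (167),
    // control characters below 32, and DEL (127).
    public static boolean isAllowedChatCharacter(char c0) {
        return c0 != 167 && c0 >= 32 && c0 != 127;
    }

    // Hypothetical helper: index of the first rejected character, or -1 if none.
    public static int firstIllegalIndex(String message) {
        for (int i = 0; i < message.length(); i++) {
            if (!isAllowedChatCharacter(message.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("hello"));      // no rejected characters
        System.out.println(firstIllegalIndex("hi\u00A7c"));  // contains the section sign
        System.out.println(firstIllegalIndex("a\tb"));       // contains a tab (code 9)
    }
}
```

So a chat message containing a tab, a DEL, or a section sign would be enough to hit the "Illegal characters in chat" path, provided the client actually sends it.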

konsolas commented 7 years ago

I still don't know what triggers the crash. GommeHD has had no similar crash, as just confirmed by @geNAZt.

I think I need to try a different approach. @MedicOP since you are able to reproduce this, could you please tell me the first AAC version which started crashing? Has this crash occurred since 3.0.0, or did it only start happening with later versions?

Maybe I can look at git diffs to see what changed.

MedicOP commented 7 years ago

Unfortunately I didn't start using AAC until 3.0.5-b1, so I have no idea if it started occurring at a specific version. I am pretty sure it happened in both 3.0.5 and 3.1.x though.

konsolas commented 7 years ago

Are you completely sure this occurred with 3.0.5?

MedicOP commented 7 years ago

I'm not 100% sure, and I have no way to check because I deleted the crash-reports directory a week ago :(

MedicOP commented 7 years ago

Someone reported it on AAC's resource discussion on March 21st: https://www.spigotmc.org/threads/aac-advanced-anti-cheat-hack-kill-aura-blocker-paid.63195/page-394#post-2312823

The post right after it shows that he's using 3.0.5

konsolas commented 7 years ago

I'm currently looking into getNearbyEntities, as it's called async several times. I'm not sure if this would cause the sort of crash that we're looking for, though.

MedicOP commented 7 years ago

Here's a post I just found on the Spigot forums showing the same error. The plugin list he provided contains AAC, so it's definitely looking like an AAC bug.

https://www.spigotmc.org/threads/server-crash-error.232023/

MedicOP commented 7 years ago

Calling getNearbyEntities asynchronously is definitely not a good idea. A CME is bound to happen (a NoSuchElementException in this case, as UnsafeList does not support CME). For example, it happened in the Kingdoms plugin:

09.06 18:00:41 [Server] WARN Exception in thread "Craft Scheduler Thread - 21"
09.06 18:00:41 [Server] WARN org.apache.commons.lang.UnhandledException: Plugin Kingdoms v8.2 generated an exception while executing task 10005
09.06 18:00:41 [Server] INFO at org.bukkit.craftbukkit.v1_9_R2.scheduler.CraftAsyncTask.run(CraftAsyncTask.java:56)
09.06 18:00:41 [Server] INFO at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
09.06 18:00:41 [Server] INFO at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
09.06 18:00:41 [Server] INFO at java.lang.Thread.run(Thread.java:745)
09.06 18:00:41 [Server] INFO Caused by: java.util.NoSuchElementException
09.06 18:00:41 [Server] INFO at org.bukkit.craftbukkit.v1_9_R2.util.UnsafeList$Itr.next(UnsafeList.java:248)
09.06 18:00:41 [Server] INFO at net.minecraft.server.v1_9_R2.Chunk.a(Chunk.java:832)
09.06 18:00:41 [Server] INFO at net.minecraft.server.v1_9_R2.World.getEntities(World.java:2450)
09.06 18:00:41 [Server] INFO at org.bukkit.craftbukkit.v1_9_R2.CraftWorld.getNearbyEntities(CraftWorld.java:761)
09.06 18:00:41 [Server] INFO at org.kingdoms.constants.land.Turret$2.run(Turret.java:274)
09.06 18:00:41 [Server] INFO at org.bukkit.craftbukkit.v1_9_R2.scheduler.CraftTask.run(CraftTask.java:71)
09.06 18:00:41 [Server] INFO at org.bukkit.craftbukkit.v1_9_R2.scheduler.CraftAsyncTask.run(CraftAsyncTask.java:53)
09.06 18:00:41 [Server] INFO ... 3 more

No idea how that translates to the entire server crashing though, as opposed to just a "Task 69 for AAC generated an exception" message.

konsolas commented 7 years ago

That's what I was wondering.

konsolas commented 7 years ago

@MedicOP Are you able to do some debugging? Could you enable checks in AAC one by one on a consistently crashing server to find out which check causes the crash?

MedicOP commented 7 years ago

It doesn't crash often enough to do that. It's a different server every time, and about once a day on average.

konsolas commented 7 years ago

So it turns out that if getNearbyEntities results in a chunk being loaded, a crash could (theoretically) occur, since loading the chunk also loads its entity slices, and so on.

I'm not sure how, but would it be possible for you to confirm this is what happens?

MedicOP commented 7 years ago

Not sure how I would test that. Make a plugin that spawns an entity in an unloaded chunk and try calling getNearbyEntities on it asynchronously? lol

konsolas commented 7 years ago

Alternatively, I send you a pre-release of 3.1.4, you tell me if your server breaks?

konsolas commented 7 years ago

Currently synchronizing all the getNearbyEntities.
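
As a rough illustration of what "synchronizing" a query like this can look like, here is a plain-Java sketch. It is not AAC's code: a single-threaded executor stands in for the server main thread, a list of strings stands in for the world's entities, and all class and method names are hypothetical. In a real plugin the closest equivalent would presumably be Bukkit's `BukkitScheduler#callSyncMethod`, which also returns a `Future`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MainThreadQuerySketch {
    // Stand-in for the server main thread: tasks run one at a time, in submission order.
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();

    // Stand-in for the entity list. Only ever touched by tasks running on mainThread.
    private final List<String> entities = new ArrayList<>();

    // Queues an entity spawn on the main thread.
    Future<?> spawn(String name) {
        return mainThread.submit(() -> {
            entities.add(name);
        });
    }

    // Safe to call from any other thread: the snapshot copy is made on the main thread.
    Future<List<String>> getNearbyEntitiesSync() {
        Callable<List<String>> snapshot = () -> new ArrayList<>(entities);
        return mainThread.submit(snapshot);
    }

    void shutdown() {
        mainThread.shutdown();
    }

    // Spawns two entities, then queries them from a different (async) thread.
    static List<String> demo() {
        MainThreadQuerySketch server = new MainThreadQuerySketch();
        try {
            server.spawn("zombie");
            server.spawn("cow");
            return CompletableFuture.supplyAsync(() -> {
                try {
                    // The async thread blocks until the main thread has produced the copy.
                    return server.getNearbyEntitiesSync().get();
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }).join();
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The async caller only ever sees a copy, so it never iterates the live list. One caveat of this pattern: calling `get()` on the returned future from the main thread itself would block forever, so it is only suitable for code that is known to run off the main thread.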

MedicOP commented 7 years ago

What other things are new in 3.1.4? If it's stable then I don't mind testing it

andrewkm commented 7 years ago

@konsolas Looks like I'm getting this exact issue lately. Made another issue for it here, not sure how you want it documented: https://github.com/konsolas/AAC-Issues/issues/681

andrewkm commented 7 years ago

Linking this from md_5 as well if it's any help: http://i.imgur.com/vkQ6eV0.png

Janmm14 commented 7 years ago

Using the new issue for further discussion.