spoorn / myLoot

Instanced loot chests for Minecraft Fabric
Other
4 stars 6 forks source link

Deadlock on setBlockState? Seems to be rare #16

Closed spoorn closed 2 years ago

spoorn commented 2 years ago

Testing on my own multiplayer server and noticed a couple of freezes due to a deadlock within a few hours. Seems to be pretty rare, but making note of it here to fix. Might be exacerbated by the huge number of mods in my test modpack

Minecraft 1.18.2 Fabric 0.13.3 myLoot 2.1

Could be related to https://www.curseforge.com/minecraft/mc-mods/feature-nbt-deadlock-be-gone , https://bugs.mojang.com/browse/MC-246262.

Only seems to happen on a multiplayer server, not singleplayer

java.lang.Error: Watchdog
    at java.base@17.0.1/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@17.0.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
    at net.minecraft.class_1255.method_20813(class_1255.java:152)
    at net.minecraft.class_1255.method_18857(class_1255.java:142)
    at net.minecraft.class_3215.getChunkBlocking(class_3215.java:1276)
    at net.minecraft.class_3215.method_12121(class_3215.java:1172)
    at net.minecraft.class_3215.md7983d0$lambda$getChunkOffThread$0$0(class_3215.java:1184)
    at net.minecraft.class_3215$$Lambda$16430/0x0000000802ee3b80.get(Unknown Source)
    at java.base@17.0.1/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at net.minecraft.class_1255.method_18859(class_1255.java:157)
    at net.minecraft.class_3215$class_4212.method_18859(class_3215.java:552)
    at net.minecraft.class_1255.method_16075(class_1255.java:131)
    at net.minecraft.class_3215$class_4212.method_16075(class_3215.java:561)
    at net.minecraft.class_1255.method_18857(class_1255.java:140)
    at net.minecraft.class_3215.getChunkBlocking(class_3215.java:1276)
    at net.minecraft.class_3215.method_12121(class_3215.java:1172)
    at net.minecraft.class_1937.method_8402(class_1937.java:187)
    at net.minecraft.class_1937.method_8497(class_1937.java:5098)
    at net.minecraft.class_1937.method_8320(class_1937.java:5590)
    at net.minecraft.class_1937.method_8492(class_1937.java:353)
    at net.minecraft.class_1937.method_8452(class_1937.java:320)
    at net.minecraft.class_3218.method_8408(class_3218.java:1377)
    at net.minecraft.class_1937.method_30092(class_1937.java:235)
    at net.minecraft.class_1937.method_8652(class_1937.java:196)
    at net.minecraft.class_1937.method_8501(class_1937.java:311)
    at org.spoorn.myloot.core.LootableContainerReplacer.lambda$registerTickCallback$0(LootableContainerReplacer.java:98)
    at org.spoorn.myloot.core.LootableContainerReplacer$$Lambda$8846/0x000000080238d810.onEndTick(Unknown Source)
    at net.fabricmc.fabric.api.event.lifecycle.v1.ServerTickEvents.lambda$static$2(ServerTickEvents.java:62)
    at net.fabricmc.fabric.api.event.lifecycle.v1.ServerTickEvents$$Lambda$4129/0x00000008017d3a58.onEndTick(Unknown Source)
    at net.minecraft.server.MinecraftServer.handler$bpg002$onEndTick(MinecraftServer.java:4332)
    at net.minecraft.server.MinecraftServer.method_3748(MinecraftServer.java:889)
    at net.minecraft.server.MinecraftServer.method_29741(MinecraftServer.java:697)
    at net.minecraft.server.MinecraftServer.method_29739(MinecraftServer.java:273)
    at net.minecraft.server.MinecraftServer$$Lambda$13381/0x0000000802aec000.run(Unknown Source)
    at java.base@17.0.1/java.lang.Thread.run(Thread.java:833)

crash-2022-04-06_00.51.24-server.txt

spoorn commented 2 years ago

Confirmed that trying a getBlockState(), removeBlockEntity(), setBlockState() or similar will run into a deadlock if done during world chunk loading. Tested by trying those calls here: https://github.com/spoorn/myLoot/blob/7737b36caff30b4de3b26b364c870de480c6d103/src/main/java/org/spoorn/myloot/mixin/BlockEntityMixin.java#L31 which is called during loading a chunk.

spoorn commented 2 years ago

From the stacktrace

at net.minecraft.class_3218.method_8408(class_3218.java:1377)

This is the ServerWorld.updateNeighbors() method (or ServerLevel).

Code snippet from Fabric mapped:

    @Override
    public boolean setBlockState(BlockPos pos, BlockState state, int flags, int maxUpdateDepth) {
        if (this.isOutOfHeightLimit(pos)) {
            return false;
        }
        if (!this.isClient && this.isDebugWorld()) {
            return false;
        }
        WorldChunk worldChunk = this.getWorldChunk(pos);
        Block block = state.getBlock();
        BlockState blockState = worldChunk.setBlockState(pos, state, (flags & Block.MOVED) != 0);
        if (blockState != null) {
            BlockState blockState2 = this.getBlockState(pos);
            if ((flags & Block.SKIP_LIGHTING_UPDATES) == 0 && blockState2 != blockState && (blockState2.getOpacity(this, pos) != blockState.getOpacity(this, pos) || blockState2.getLuminance() != blockState.getLuminance() || blockState2.hasSidedTransparency() || blockState.hasSidedTransparency())) {
                this.getProfiler().push("queueCheckLight");
                this.getChunkManager().getLightingProvider().checkBlock(pos);
                this.getProfiler().pop();
            }
            if (blockState2 == state) {
                if (blockState != blockState2) {
                    this.scheduleBlockRerenderIfNeeded(pos, blockState, blockState2);
                }
                if ((flags & Block.NOTIFY_LISTENERS) != 0 && (!this.isClient || (flags & Block.NO_REDRAW) == 0) && (this.isClient || worldChunk.getLevelType() != null && worldChunk.getLevelType().isAfter(ChunkHolder.LevelType.TICKING))) {
                    this.updateListeners(pos, blockState, state, flags);
                }
                if ((flags & Block.NOTIFY_NEIGHBORS) != 0) {
                    this.updateNeighbors(pos, blockState.getBlock());
                    if (!this.isClient && state.hasComparatorOutput()) {
                        this.updateComparators(pos, block);
                    }
                }
                if ((flags & Block.FORCE_STATE) == 0 && maxUpdateDepth > 0) {
                    int i = flags & ~(Block.NOTIFY_NEIGHBORS | Block.SKIP_DROPS);
                    blockState.prepare(this, pos, i, maxUpdateDepth - 1);
                    state.updateNeighbors(this, pos, i, maxUpdateDepth - 1);
                    state.prepare(this, pos, i, maxUpdateDepth - 1);
                }
                this.onBlockChanged(pos, blockState, blockState2);
            }
            return true;
        }
        return false;
    }

updateNeighbors() is called after getWorldChunk() meaning for the current block we are trying to replace, the chunk has already successfully loaded, yet when it's updating neighbors (which triggers a getBlockState call, the neighbor's chunk still has to be loaded.

From the thread dump in crash report:

"Worker-Main-21" daemon prio=5 Id=74 WAITING on java.util.concurrent.CompletableFuture$Signaller@5ad6596
    at java.base@17.0.1/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.CompletableFuture$Signaller@5ad6596
    at java.base@17.0.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
    at java.base@17.0.1/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)
    at java.base@17.0.1/java.util.concurrent.ForkJoinPool.compensatedBlock(ForkJoinPool.java:3449)
    at java.base@17.0.1/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3432)
    at java.base@17.0.1/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898)
    at java.base@17.0.1/java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2117)
    at net.minecraft.class_3215.getChunkOffThread(class_3215.java:1184)
    ...

It seems some thread is WAITING on a CompletableFuture to load a chunk.

Again, to summarize from the top-level stacktrace where LootableContainerReplacer is trying to replace blocks, the thread is parked waiting for a neighbor block's chunk to load:

"Server thread" prio=8 Id=76 TIMED_WAITING on java.lang.String@73c8c694
    at java.base@17.0.1/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.lang.String@73c8c694
    at java.base@17.0.1/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
    at net.minecraft.class_1255.method_20813(class_1255.java:152)
    at net.minecraft.class_1255.method_18857(class_1255.java:142)
    at net.minecraft.class_3215.getChunkBlocking(class_3215.java:1276)
    at net.minecraft.class_3215.method_12121(class_3215.java:1172)
    at net.minecraft.class_3215.md7983d0$lambda$getChunkOffThread$0$0(class_3215.java:1184)
    at net.minecraft.class_3215$$Lambda$16430/0x0000000802ee3b80.get(Unknown Source)
    ...

The code snippet from ThreadExecutor where we are parking:

    public void runTasks(BooleanSupplier stopCondition) {
        ++this.executionsInProgress;

        try {
            while(!stopCondition.getAsBoolean()) {
                if (!this.runTask()) {
                    this.waitForTasks();
                }
            }
        } finally {
            --this.executionsInProgress;
        }

    }

    protected void waitForTasks() {
        Thread.yield();
        LockSupport.parkNanos("waiting for tasks", 100000L);
    }

The main server thread isn't stuck WAITING, it's stuck in a TIMED_WAITING infinite loop it seems like - because the neighbor's chunk it's trying to load is not loading before Watchdog times out the server at 60s. But there is a thread "Worker-Main-21" which may be that neighbor's chunk and deadlocked, causing the main server thread to become technically deadlocked via an infinite loop of timed parks.

If that's the case, then myLoot is only appearing on the stacktrace because it makes that call to setBlockEntity, showcasing and running into the deadlock on the main server thread - but not the root cause

spoorn commented 2 years ago

crash-2022-04-06_13.28.06-server.txt

I was able to reproduce it but this time with vanilla Minecraft's Server trying to run through the tasks left before finishing a tick:

    protected void runTasksTillTickEnd() {
        this.runTasks();
        this.runTasks(() -> !this.shouldKeepTicking());
    }

But obviously the tick never ends because the chunk that's trying to be loaded never finishes.

spoorn commented 2 years ago

Finally found the culprit!

This is the exact modpack link I've been testing on: https://www.curseforge.com/minecraft/modpacks/spoornpack-fabric/files/3736502

I was able to reproduce the issue consistently using world seed "2431689342422665529" and teleporting to coordinates 2730 129 -4609 then moving directly West until chunks just stopped loading and we hit the deadlock. Server view/simulation distance = 10.

I removed all structure mods to confirm this deadlock is caused by a structure mod (removing all structure mods fixed it), and re-added half at a time until I narrowed it down to this mod:

Unstructured 0.3.2+1.18.2: https://www.curseforge.com/minecraft/mc-mods/unstructured/files/3729844

Someone else also ran into the same issue: https://www.curseforge.com/minecraft/mc-mods/unstructured?comment=20

Wonder if there's a way to make this error easier to find, or at least not so vague in the crash report so players can know this is due to a deadlock with chunk loading. Might look into it and add a mod for that, maybe into https://www.curseforge.com/minecraft/mc-mods/quality-of-life-qol-fixes

For now, will close this as it's root caused to Unstructured, not a myLoot issue