PaperMC / Starlight

Rewrites the light engine to fix lighting performance and lighting errors
Other
1.27k stars 135 forks source link

IForgeBlockGetter#getExistingBlockEntity does not perform off-thread check, causing deadlock with certain mods #197

Open joisful opened 1 year ago

joisful commented 1 year ago

Better MC v16/ Forge

latest.log

Would crash the server when trying to join. When i removed the mod it worked fine.

Spottedleaf commented 1 year ago

There is nothing I can do to resolve this issue. The problem lies with Forge's implementation of IForgeBlockGetter#getExistingBlockEntity:

The Vanilla Level#getBlockEntity method will avoid loading a chunk unless on the main thread, specifically because it knows the light engine can indirectly make calls to this method. It's certainly an awful fix from Vanilla but it's the one that has worked since 1.14. Forge's method does no such thread check and as a result blows up here - causing a deadlock.

Spottedleaf commented 1 year ago

Just a more detailed look at this:

The Forge implementation:

    @Nullable
    default BlockEntity getExistingBlockEntity(BlockPos pos)
    {
        if (this instanceof Level level)
        {
            if (!level.hasChunk(SectionPos.blockToSectionCoord(pos.getX()), SectionPos.blockToSectionCoord(pos.getZ())))
            {
                return null;
            }

            return level.getChunk(pos).getExistingBlockEntity(pos);
        }
        // removed all cases except Level
    }

There are a few problems with this implementation:

  1. The hasChunk call is not effective, as hasChunk() on ServerChunkCache returns whether the ticket level is at full status, not whether the chunk is full.
  2. The same issue I mentioned above, where it is not correctly returning null for off-thread access like Vanilla would.

A note about getChunk() is that off-main calls are scheduled to the main thread and then the main thread blocks on the request until the chunk is brought to FULL. There is no scenario where it actually completes synchronously, so in general calling this off-main on worldgen threads is dangerous and prone to deadlock (like this case).

It may be possible that Starlight is bringing the ticket level up to a level lower than what is currently requested (it brings the center chunk to FULL to keep it from unloading), which could cause the hasChunk method to return true. However, I doubt this is a useful case to look at as standard chunk loading caused by players will bring chunks up to a lower level than what Starlight would request. I could make Starlight bring it to a higher level, but I don't think it would fix the issue.

Additionally, Starlight performs edge checks on chunks during loading to fix lighting errors in some niche cases (Vanilla should as well but it doesn't) and that's why it appears to happen on Starlight - these modded blocks aren't present in world generation, so it is unlikely to run this path while lighting a chunk (but it is possible).

Spottedleaf commented 1 year ago

Re-opened as I can't do anything about it but it's still an issue