EngineHub / WorldEdit

🗺️ Minecraft map editor and mod
https://enginehub.org/worldedit/

Fabric with worldedit hangs when stopping during shutdown #2459

Open petersv5 opened 9 months ago

petersv5 commented 9 months ago

WorldEdit Version

7.3.0-beta-03

Platform Version

Fabric Loader 0.15.3

Confirmations

Bug Description

The server hangs during shutdown in some cases. This was tracked down to the thread "WorldEdit Task Executor - 0" being left running, which in turn is due to WorldEdit.executorService not being stopped. This particular executor service does not use daemon threads and thus requires an explicit shutdown to terminate. If it is not stopped, it prevents the JVM from initiating the shutdown.
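A minimal sketch of the behaviour described above, using a plain single-thread executor from the JDK as a stand-in. NonDaemonExecutorDemo is a hypothetical class and is not WorldEdit code; how WorldEdit actually constructs its executor is not shown in this issue.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of WorldEdit.
class NonDaemonExecutorDemo {

    /** Reports whether the worker thread of a default JDK executor is a daemon thread. */
    static boolean defaultWorkerIsDaemon() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Ask the worker thread itself for its daemon flag.
            Callable<Boolean> check = () -> Thread.currentThread().isDaemon();
            return executor.submit(check).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            // Without this call the idle worker thread would keep the JVM alive.
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker is daemon: " + defaultWorkerIsDaemon());
    }
}
```

The default thread factory creates non-daemon threads, so an executor built this way has to be shut down explicitly before the JVM can exit.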

Expected Behavior

Sending /stop to the server should terminate the process in a reasonable time.

Reproduction Steps

  1. Make a selection
  2. //copy
  3. //schem save somefilename
  4. /stop

Observe that the Fabric server does not terminate. Checking the remaining server process with jstack shows that the thread "WorldEdit Task Executor - 0" is still running and is not a daemon thread.
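The same check that jstack provides from outside can be approximated from inside the process. The sketch below lists live non-daemon threads using only JDK calls; NonDaemonThreadLister is a hypothetical helper name and is not part of WorldEdit or Fabric.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, not part of WorldEdit.
class NonDaemonThreadLister {

    /** Returns the sorted names of all live threads that are not daemon threads. */
    static List<String> liveNonDaemonThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon())
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Any name printed here other than the exiting main thread can block JVM exit.
        liveNonDaemonThreadNames().forEach(System.out::println);
    }
}
```

Logging this list late in the shutdown sequence could show which threads are still holding the process open.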

Anything Else?

I am preparing a PR for this, hopefully later today.

petersv5 commented 9 months ago

It seems that, in addition, a timer is created during schematic saves that is not cancelled. This leads to a thread "Timer-1" from java.util.TimerThread that also remains live through the shutdown. I'm still tracking this down before filing the PR.

octylFractal commented 9 months ago

Realistically for safety I believe that we should be properly shutting down the executor service on mod/plugin unload, not just making these daemon threads.
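One common way to shut an executor service down on unload is the two-phase pattern from the ExecutorService documentation: stop accepting work, wait, then interrupt whatever is left. The sketch below assumes a hypothetical helper class ExecutorShutdown; where a platform would call it (for example a server-stopping callback) is not specified in this thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of WorldEdit.
class ExecutorShutdown {

    /**
     * Stops the executor in two phases and returns true if all of its
     * threads terminated within the given timeouts.
     */
    static boolean shutdownGracefully(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // reject new tasks, let queued tasks finish
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            executor.shutdownNow(); // interrupt tasks that are still running
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

The timeout bounds how long a stopping server waits for pending tasks before they are interrupted.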

The Timer should automatically be GC'd after the schematic save is complete. If a reference remains, that is what should be fixed. I would be interested in a heap dump to see where this is going wrong.

petersv5 commented 9 months ago

The PR should shut the executor service down properly; it was not made a daemon thread.

The Timer is a bit funny. The actual timer that resolved the hang when it was made a daemon thread is only ever used by the FutureProgressListener constructor. I see an equal number of constructor calls and calls to the run() method, which in turn cancels the timer. So in theory there should be no timer references anywhere. At the point of the hang all non-daemon threads are already dead except "Timer-1". Either one of the daemon threads holds the reference to the FutureProgressListener Timer, or it holds it itself.

petersv5 commented 9 months ago

The Timer timer in FutureProgressListener is a static field. It is not going to go away, I guess. Is it actually correct to use a single Timer shared by all instances of FutureProgressListener? That should make them interfere with each other, I think.
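For context on why a long-lived Timer keeps its thread around: in java.util.Timer, cancelling a TimerTask only removes that task, while the timer thread itself ends only after Timer.cancel() (or after the Timer becomes unreachable with no outstanding tasks). The sketch below illustrates that general JDK behaviour. TimerCancelDemo and the timer name are hypothetical, and which of the two cancel calls FutureProgressListener actually makes is not stated in this thread.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical demo class, not part of WorldEdit.
class TimerCancelDemo {

    /** Returns true if a live thread with the given name exists. */
    static boolean threadAlive(String name) {
        return Thread.getAllStackTraces().keySet().stream()
                .anyMatch(t -> t.isAlive() && t.getName().equals(name));
    }

    /** Polls until the named thread is gone or the timeout expires. */
    static boolean waitUntilGone(String name, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (threadAlive(name)) {
            if (System.currentTimeMillis() > deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Returns { thread alive after TimerTask.cancel(), thread gone after Timer.cancel() }. */
    static boolean[] run(String timerName) {
        Timer timer = new Timer(timerName); // non-daemon timer thread
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                // never reached in this demo
            }
        };
        timer.schedule(task, 60_000L);
        task.cancel(); // cancels the task only
        boolean aliveAfterTaskCancel = threadAlive(timerName);
        timer.cancel(); // ends the timer thread
        boolean goneAfterTimerCancel = waitUntilGone(timerName, 5_000L);
        return new boolean[] { aliveAfterTaskCancel, goneAfterTimerCancel };
    }
}
```

On the question of sharing: once Timer.cancel() has been called, further schedule() calls on that Timer throw IllegalStateException, so a shared static Timer cannot be cancelled by one user without affecting the others.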

There is another Timer in worldedit SessionManager that also keeps the TimerThread alive.

For the Timers it may actually be a better idea to make the threads daemon threads. It is unlikely that any timer callbacks can do much good once the server has shut down, which it will have done by the time the daemon-ness of the timer (or not) matters.
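The daemon option mentioned here is available directly on the java.util.Timer constructor. A minimal sketch, assuming a hypothetical class DaemonTimerDemo and a made-up timer name; it is not a patch against WorldEdit.

```java
import java.util.Timer;

// Hypothetical demo class, not part of WorldEdit.
class DaemonTimerDemo {

    /** Creates a named Timer and reports whether its thread is a daemon thread. */
    static boolean timerThreadIsDaemon(String name, boolean daemon) {
        // The second constructor argument selects a daemon timer thread.
        Timer timer = new Timer(name, daemon);
        try {
            return Thread.getAllStackTraces().keySet().stream()
                    .filter(t -> t.getName().equals(name))
                    .findFirst()
                    .map(Thread::isDaemon)
                    .orElseThrow(() -> new IllegalStateException("timer thread not found"));
        } finally {
            timer.cancel(); // clean up so the demo itself leaves no thread behind
        }
    }
}
```

A daemon timer thread no longer blocks JVM exit, at the cost that tasks still pending at exit are silently dropped.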

I am trying to make the timers go away by making some changes:

I still sometimes see the TimerThread being kept alive by Timer@ThreadReaper for a long while, but it eventually goes away. Still, this delay can be a problem for servers.

I will look a bit more at this tomorrow.

stale[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not been fully confirmed. It will be closed if no further activity occurs. Thank you for your contributions.

petersv5 commented 7 months ago

Will test latest and get back if it is still an issue.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not been fully confirmed. It will be closed if no further activity occurs. Thank you for your contributions.

Fallen-Breath commented 1 week ago

This issue still exists in the latest release. Please review and reopen the issue, and consider making the timer thread a daemon thread.

Environment

Steps to reproduce

  1. Start a Fabric server with only the WorldEdit mod installed
  2. Execute /searchitem stone in the console (for versions that require a player to run the command, run it in game)
  3. Execute /stop in the console
  4. Wait for the server to stop and observe

Expected behavior: the server stops. Actual behavior: the server never stops and hangs forever.

More information

Log: https://pastebin.com/TXvr65Ei

jstack output (look at the Timer-1 thread): https://pastebin.com/jWEh5FE1

Heap dump taken with jmap -dump:format=b,file=heapdump.hprof <pid>. It is split into two files to get around GitHub's attachment size limit. You need to decompress twice:

heapdump.hprof.xz.001.zip heapdump.hprof.xz.002.zip

wizjany commented 1 week ago

It doesn't still exist but rather exists again - the fix was reverted because it caused other issues. Will re-open though.

Fallen-Breath commented 1 week ago

It doesn't still exist but rather exists again

This issue is also reproducible with at least:

If you look into the related commit 5eb9b779d7467ba3c893421a8ed099c47685f01e, you will find that this issue was introduced no later than 7.0.0-beta-05, affecting all MC versions from 1.13 onward.

Fallen-Breath commented 1 week ago

the fix was reverted because it caused other issues

I doubt that the "fix" you refer to here actually fixes this issue.

Look at the FutureProgressListener class: no change has been made to this file since 2020. Obviously the changes before 2020 did not fix this issue either.

wizjany commented 1 week ago

#2460 / #2570 / #2592