Various errors due to concurrent usage of MinecraftServer.saveAll - TxB 3

Szum123321 / textile_backup

Backup Mod for Fabric

https://www.curseforge.com/minecraft/mc-mods/textile-backup

GNU General Public License v3.0

90 stars 30 forks source link

Various errors due to concurrent usage of MinecraftServer.saveAll - TxB 3 #123

Closed Szum123321 closed 1 year ago

Szum123321 commented 1 year ago

Since the release of Textile Backup 3.0.0, through various channels, I've recieved a number of bug reports regarding an IndexOutOfBounds exception during compression.

From my own testing, it appears to be both random and occurs in both minecraft 1.19 and 1.20.

So far, I've been unsucessful in determining the case.

If you experience similar issue please comment below. Try to provide as much info as possilble.

In the meantime i will do my best to diagnose the issue myself.

MishaEgorovCA commented 1 year ago

Has been happening a lot recently running a 1.20 and 1.20.1 server (about once a day). Here are attached logs leading up to it and crash report if it helps.

2023-06-21-1.log crash-2023-06-21_16.17.57-server.txt

Additionally the server freezes on "[Textile Backup] Starting backup" instead of shutting down after the error, forcing me to force stop it manually thus it is where the log ends. This also prevents my auto restart on crash script to not take affect since its stuck on Stating backup after throwing an error. Pause is caused by the crash handler not you mb

MishaEgorovCA commented 1 year ago

New crash error on backup this time, don't know if it is related: crash-2023-06-23_12.49.08-server.txt

1971844335 commented 1 year ago

新建文本文档 (2).txt 备份报错

tomozbot commented 1 year ago

From what I can tell the issues that @me100000000000 is experiencing is not an IndexOutOfBoundsException as @Szum123321 describes. Regardless I am having the same crashing issues at @me100000000000. Seems to occur rarely just after the backup starting message is sent.

Anmegum commented 1 year ago

I get the same error but only when Very Many Players (VMP) is running as well. I deactivated VMP after the last crash, never had an issue since then. I already reported this to VMP as well. crash-2023-06-19_23.29.51-server.txt

tomozbot commented 1 year ago

Not using VMP personally. Perhaps its a conflict with a shared dependency?

Anmegum commented 1 year ago

I get the same error but only when Very Many Players (VMP) is running as well. I deactivated VMP after the last crash, never had an issue since then. I already reported this to VMP as well. crash-2023-06-19_23.29.51-server.txt

Server crashed again after 11 days, This time without VMP active and just after Textile Backup started saving the world. crash-2023-06-30_19.17.50-server.txt

nmsl66687 commented 1 year ago

crash-2023-07-09_10.09.31-server.txt

borisshoes commented 1 year ago

Just to bump this issue again, it looks like the cause was found in #136. Calling the server save-all on the main thread before creating the compression threads should fix the issue?

Szum123321 commented 1 year ago

Yeah that was probably a dumb move on my side. My hope was to prevent #81. The change will now be reverted in 3.1.1.

MichailiK commented 1 year ago

Watchdog timeouts are still fairly easy to prevent. Disable auto-saving & save the worlds on the server thread, then perform the backup off-thread, then enable auto-saving again. I've done that in a somewhat hacky and maybe slightly flawed way by shuffling a few lines of code around, to ensure we're not using using any of Minecraft's functions or properties outside the server thread.

I've opted not to open a PR for this because, as far as I can tell, making a proper fix requires some light restructuring which can be done in many ways.

Szum123321 commented 1 year ago

That is exactly what i've had done before.

My decsiion was motivated by following reason. We should execute saveAll just before we actually begin the compression process to end up with the most up to date state. Futhermore during compression no changes can be written to the haddrive as that would corrupt the ongoing backup. Thats why i decided to move saveAll call in the first place.

Now i believe i have a better solution. https://github.com/Szum123321/textile_backup/blob/81_125_test/src/main/java/net/szum123321/textile_backup/core/create/ExecutableBackup.java

MichailiK commented 1 year ago

server.submitAndJoin

Oh wow, did not know that this exists 😅 That makes things much simpler

Szum123321 commented 1 year ago

fixed in 3.1.2