jammsen / docker-palworld-dedicated-server

Docker container to easily provision and manage Palworld Dedicated Server
https://hub.docker.com/r/jammsen/palworld-dedicated-server
MIT License

fix: replace gzip with zstd for faster backup packaging and to prevent backup errors #254

Closed - Xm798 closed this pull request 2 months ago

Xm798 commented 3 months ago

Background:

Our previous packaging process with gzip was not only time-consuming but also susceptible to generating warning messages if files were modified during the lengthy packaging period. Such warnings could be mistakenly interpreted by the backupmanager script as a failure in the backup process. For example:

palworld-dedicated-server  | > RCON: Broadcasted: Creating-backup
palworld-dedicated-server  | tar: Saved/SaveGames/0/xxxxx/backup/world: file changed as we read it
palworld-dedicated-server  | tar: Saved/SaveGames/0/xxxxx: file changed as we read it
palworld-dedicated-server  | [2024-04-04 14:00:12] [LOG] RCON executed the command. broadcast Backup-failed
palworld-dedicated-server  | > RCON: Broadcasted: Backup-failed
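
For context, GNU tar exits with a non-zero status when it hits "file changed as we read it", so a backup script that only checks the exit code will report a failure even though the archive was actually written. A simplified sketch of that kind of check (not the repo's actual backupmanager code; broadcast is a hypothetical stand-in for the RCON call):

# Simplified illustration only, not the actual backupmanager script.
# GNU tar returns exit status 1 when a file changed while being read,
# so the else branch runs even though backup.tar.gz was created.
if tar -czf backup.tar.gz Saved; then
    broadcast "Backup-done"    # hypothetical RCON broadcast helper
else
    broadcast "Backup-failed"  # hypothetical RCON broadcast helper
fi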

Solution:

Switching to zstd speeds up the packaging process considerably. That shrinks the window in which files can change while we are archiving them, which in turn cuts down on the warning messages that previously made the backup look like it had failed when it actually had not. On my machine, packing a 200MB world archive took about 12 seconds with gzip; with zstd it is down to roughly 0.8 seconds. That is a substantial gain in both the speed and the reliability of the backups.
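
A minimal sketch of what the change amounts to, assuming the backup script wraps a plain tar call (archive names and paths here are placeholders, not the repo's actual ones; -I zstd requires the zstd binary in the image):

# Before: gzip compression (roughly 12-15 s for a ~200 MB Saved folder in my tests)
tar -czf backup.tar.gz Saved

# After: zstd compression (well under a second for the same data)
tar -I zstd -cf backup.tar.zst Saved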

Xm798 commented 3 months ago

If there are specific reasons why we must stick with gz, then I suggest we add the --warning=no-file-changed and --warning=no-file-removed options to avoid the common warning messages we're likely to encounter. For more details, check out: https://www.gnu.org/software/tar/manual/html_section/warnings.html
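
For illustration, the gzip fallback with those warnings suppressed could look like this (archive name is a placeholder; the --warning keywords are GNU tar specific):

# Keep gzip, but silence the two warnings a live save directory typically triggers
tar --warning=no-file-changed --warning=no-file-removed -czf backup.tar.gz Saved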

Callum027 commented 3 months ago

If --warning=no-file-changed is an algorithm-agnostic parameter, perhaps we should set it in addition to switching to zstd?

Xm798 commented 3 months ago

> If --warning=no-file-changed is an algorithm-agnostic parameter, perhaps we should set it in addition to switching to zstd?

That's a good point. In my tests, zstd was so fast that I didn't encounter any file change warnings. But adding --warning=no-file-changed might indeed be a more robust approach. Thanks for the suggestion!

jammsen commented 3 months ago

Hey @Xm798 - Thanks for the contribution. Are there some benchmark results that provide proof and facts that zstd is far superior to gzip, like you say it is? If so, can you please share your sources with me?

Callum027 commented 3 months ago

Hey @Xm798, I have another suggestion for this.

In the new version, Palworld now has its own built-in backup mechanism, which saves to the backup directory. This folder is currently included in the container backup, greatly inflating the size of the backups. As double backups are not really useful, can you also add --exclude=backup to make sure those files are not included in the tarball?
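
As a sketch, the exclusion could be added to the same tar call (--exclude=backup matches any path component named backup; adjust the pattern if something more precise is wanted):

# Skip Palworld's built-in rotating backups so they are not archived a second time
tar -I zstd --exclude=backup -cf backup.tar.zst Saved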

@jammsen - Quick googling brings up a few pages demonstrating that zstd compression, at level 3 or so, is both faster than gzip and produces smaller files:

jammsen commented 3 months ago

> @jammsen - Quick googling brings up a few pages demonstrating that zstd compression, at level 3 or so, is both faster than gzip and produces smaller files:

Yeah, I knew I could google for it, but I wanted to see his data. That's why I asked. He says it's very much faster, but never shows it.

Xm798 commented 3 months ago

> @jammsen - Quick googling brings up a few pages demonstrating that zstd compression, at level 3 or so, is both faster than gzip and produces smaller files:
>
> Yeah, I knew I could google for it, but I wanted to see his data. That's why I asked. He says it's very much faster, but never shows it.

I conducted another test on my Palworld server, and the data is approximately as follows:

╰─❯ time tar cfz backup_test.tgz Saved
tar: Saved/SaveGames/0/2EFA86D24BF543E7AFFD4C18D5FCE34E/backup/world: file changed as we read it
tar: Saved/SaveGames/0/2EFA86D24BF543E7AFFD4C18D5FCE34E/Level.sav: file changed as we read it
tar: Saved/SaveGames/0/2EFA86D24BF543E7AFFD4C18D5FCE34E: file changed as we read it
tar cfz backup_test.tgz Saved  14.26s user 0.93s system 101% cpu 14.939 total

╰─❯ time tar -I zstd -cf backup_test.tar.zst Saved
tar -I zstd -cf backup_test.tar.zst Saved  0.58s user 0.90s system 187% cpu 0.793 total

A 285MB Saved folder took about 15 seconds to compress with gzip, and tar warned about files being modified mid-read. In contrast, zstd completed the compression in roughly 0.8 seconds and finished the packaging without errors.


Xm798 commented 3 months ago

> Hey @Xm798, I have another suggestion for this.
>
> In the new version, Palworld now has its own built-in backup mechanism, which saves to the backup directory. This folder is currently included in the container backup, greatly inflating the size of the backups. As double backups are not really useful, can you also add --exclude=backup to make sure those files are not included in the tarball?
>
> @jammsen - Quick googling brings up a few pages demonstrating that zstd compression, at level 3 or so, is both faster than gzip and produces smaller files:

  1. Based on GNU tar's documentation, I added the --no-file-changed option, as the "file-changed keyword applies only if used together with the '--ignore-failed-read' option".

  2. Upon reevaluating the current setup after your feedback, I noticed that, following the recent update two days ago, Palworld does indeed automatically save world backups under the /Pal/Saved/SaveGames/0/xxxx/ path. This new feature correlates with the increased frequency of tar backup errors, likely because the backup folder is over 200M in size (on my Pal server), which slows down the tar process while Palworld simultaneously modifies the files.

  3. Consequently, considering these insights and the observed redundancy wasting space, it might be wise to heed @Callum027's advice and exclude the Pal backup folder from our backups (a combined command sketch follows after this list). The dual backup system is evidently consuming excess storage, which became painfully apparent when my hard drive quickly filled up today. Truly a frustrating situation.
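
Putting these pieces together, the resulting tar invocation might look roughly like the following sketch (archive name and paths are placeholders; it assumes GNU tar plus the zstd binary inside the image):

# zstd for speed, exclude Palworld's own backup folder, and silence the two
# warnings a live save directory can still trigger
tar -I zstd \
    --exclude=backup \
    --warning=no-file-changed \
    --warning=no-file-removed \
    -cf "saved-$(date +%Y%m%d-%H%M%S).tar.zst" Saved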

Callum027 commented 3 months ago

> 1. Based on GNU tar's documentation, I added the --no-file-changed option, as the "file-changed keyword applies only if used together with the '--ignore-failed-read' option".

This didn't make sense, so I checked man tar on my machines.

[screenshot: excerpt from man tar showing the --warning keywords]

Turns out that only applies to --warning=failed-read (the website appears to have rendered the manual incorrectly). You don't need --ignore-failed-read set for the warnings you've disabled in this PR.
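
For anyone who wants to double-check locally, something like this prints the relevant part of the man page (assuming GNU tar with its man page installed):

# Show the context around the file-changed and failed-read warning keywords
man tar | grep -E -A 2 'file-changed|failed-read'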

Xm798 commented 2 months ago

I noticed that PR#259 submitted a modification to exclude the backup directory, which is exactly what I intended to do later. After excluding the backups managed by Palworld, the packaging time is back down to an acceptable level (about 900ms for me with gz), and there were no packaging errors. Therefore, this PR can be closed now. Thank you all for your help.

jammsen commented 2 months ago

@Xm798 - No, thank you for your help and your contribution!