itzg / docker-mc-backup

Provides a side-car container to backup itzg/minecraft-server world data
https://hub.docker.com/r/itzg/mc-backup
MIT License
334 stars 52 forks source link

Backup fails after several iterations using rclone #137

Closed landonpierce closed 1 year ago

landonpierce commented 1 year ago

I am currently using the docker-mc-backup image with docker-compose to backup my minecraft server (also using the itzg/minecraft-server docker image). After several successful backups, the backup job will randomly start failing with the error message: tar: ./config/spark/tmp/spark-2928b2c36397e-profile-data.jfr.tmp: Cannot stat: No such file or directory

In order to get the backups working again, I have to destroy and re-create the backup container. The restart that occurs as part of the failure does not fix the issue. The restart must be manual for it to fix the issue.

I'm utilizing the rclone backup method, and rclone is configured to save the backup to an Azure Blob Storage account. Here are my logs and my docker-compose.yml file (scrubbed of all sensitive data). I've done some basic poking into the issue, but can't find any reference to the file in the error message in the source code. Let me know if I can provide any other info to help with debugging.

Here are the logs. Note the successful backup at 2023-04-24T01:59:13, followed by a failure at 2023-04-24T03:59:13, followed by a manual restart of the container at 2023-04-24T04:02:17 and then a successful series of backups until several hours later at 2023-04-24T16:31:22.

2023-04-24T01:45:45+0000 INFO waiting initial delay of 10m...
2023-04-24T01:55:45+0000 INFO waiting for rcon readiness...
2023-04-24T01:55:45+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T01:55:45+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T01:55:57+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T01:55:57+0000 INFO Command executed successfully sync
2023-04-24T01:55:57+0000 INFO Backing up content in /data to /backups/world-20230424-015557.tgz
2023-04-24T01:59:13+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T01:59:13+0000 INFO sleeping 2h...
2023-04-24T03:59:13+0000 INFO waiting for rcon readiness...
2023-04-24T03:59:13+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T03:59:13+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T03:59:20+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T03:59:20+0000 INFO Command executed successfully sync
2023-04-24T03:59:20+0000 INFO Backing up content in /data to /backups/world-20230424-035920.tgz
tar: ./config/spark/tmp/spark-2928b2c36397e-profile-data.jfr.tmp: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
2023-04-24T04:02:16+0000 ERROR tar exited with code 2! Aborting
2023-04-24T04:02:16+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T04:02:17+0000 INFO waiting initial delay of 10m...
2023-04-24T04:12:17+0000 INFO waiting for rcon readiness...
2023-04-24T04:12:17+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T04:12:17+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T04:12:24+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T04:12:24+0000 INFO Command executed successfully sync
2023-04-24T04:12:24+0000 INFO Backing up content in /data to /backups/world-20230424-041224.tgz
2023-04-24T04:15:36+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T04:15:36+0000 INFO sleeping 2h...
2023-04-24T06:15:36+0000 INFO waiting for rcon readiness...
2023-04-24T06:15:36+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T06:15:36+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T06:15:38+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T06:15:38+0000 INFO Command executed successfully sync
2023-04-24T06:15:38+0000 INFO Backing up content in /data to /backups/world-20230424-061538.tgz
2023-04-24T06:18:45+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T06:18:45+0000 INFO sleeping 2h...
2023-04-24T08:18:45+0000 INFO waiting for rcon readiness...
2023-04-24T08:18:45+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T08:18:45+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T08:18:47+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T08:18:47+0000 INFO Command executed successfully sync
2023-04-24T08:18:47+0000 INFO Backing up content in /data to /backups/world-20230424-081847.tgz
2023-04-24T08:21:53+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T08:21:53+0000 INFO sleeping 2h...
2023-04-24T10:21:53+0000 INFO waiting for rcon readiness...
2023-04-24T10:21:53+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T10:21:53+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T10:21:55+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T10:21:55+0000 INFO Command executed successfully sync
2023-04-24T10:21:55+0000 INFO Backing up content in /data to /backups/world-20230424-102155.tgz
2023-04-24T10:25:02+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T10:25:02+0000 INFO sleeping 2h...
2023-04-24T12:25:02+0000 INFO waiting for rcon readiness...
2023-04-24T12:25:02+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T12:25:02+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T12:25:04+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T12:25:04+0000 INFO Command executed successfully sync
2023-04-24T12:25:04+0000 INFO Backing up content in /data to /backups/world-20230424-122504.tgz
2023-04-24T12:28:09+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T12:28:09+0000 INFO sleeping 2h...
2023-04-24T14:28:09+0000 INFO waiting for rcon readiness...
2023-04-24T14:28:09+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T14:28:09+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T14:28:11+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T14:28:11+0000 INFO Command executed successfully sync
2023-04-24T14:28:11+0000 INFO Backing up content in /data to /backups/world-20230424-142811.tgz
2023-04-24T14:31:17+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T14:31:18+0000 INFO sleeping 2h...
2023-04-24T16:31:18+0000 INFO waiting for rcon readiness...
2023-04-24T16:31:18+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T16:31:18+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T16:31:22+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T16:31:22+0000 INFO Command executed successfully sync
2023-04-24T16:31:22+0000 INFO Backing up content in /data to /backups/world-20230424-163122.tgz
tar: ./config/spark/tmp/spark-2bb947d72bb30-profile-data.jfr.tmp: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
2023-04-24T16:34:13+0000 ERROR tar exited with code 2! Aborting
2023-04-24T16:34:13+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T16:34:15+0000 INFO waiting initial delay of 10m...
2023-04-24T16:44:15+0000 INFO waiting for rcon readiness...
2023-04-24T16:44:15+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T16:44:15+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T16:44:17+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T16:44:17+0000 INFO Command executed successfully sync
2023-04-24T16:44:17+0000 INFO Backing up content in /data to /backups/world-20230424-164417.tgz
tar: ./config/spark/tmp/spark-2bc4a18f81149-profile-data.jfr.tmp: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
2023-04-24T16:47:06+0000 ERROR tar exited with code 2! Aborting
2023-04-24T16:47:06+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T16:47:08+0000 INFO waiting initial delay of 10m...
2023-04-24T16:57:08+0000 INFO waiting for rcon readiness...
2023-04-24T16:57:08+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T16:57:08+0000 INFO Command executed successfully rcon-cli save-off
2023-04-24T16:57:19+0000 INFO Command executed successfully rcon-cli save-all flush
2023-04-24T16:57:19+0000 INFO Command executed successfully sync
2023-04-24T16:57:19+0000 INFO Backing up content in /data to /backups/world-20230424-165719.tgz
tar: ./config/spark/tmp/spark-2bcffb4baa774-profile-data.jfr.tmp: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
2023-04-24T17:00:08+0000 ERROR tar exited with code 2! Aborting
2023-04-24T17:00:08+0000 INFO Command executed successfully rcon-cli save-on
2023-04-24T17:00:10+0000 INFO waiting initial delay of 10m...

And here is my docker-compose.yml that runs this container alongside my server:

version: "3"

services:
  mc:
    image: itzg/minecraft-server
    container_name: mc
    ports:
      - 25565:25565
    environment:
      EULA: "TRUE"
      RCON_PASSWORD: "<redacted>"
      DIFFICULTY: "NORMAL"
      TYPE: "FTBA"
      MEMORY: "12G"
      FTB_MODPACK_ID: "101"
      FTB_MODPACK_VERSION_ID: "6430"
      ALLOW_FLIGHT: "true"
      SPAWN_PROTECTION: "0"
    tty: true
    stdin_open: true
    restart: always
    volumes:
      # attach a directory relative to the directory containing this compose file
      - ./worlddata:/data
  backups:
    image: itzg/mc-backup
    container_name: backups
    hostname: "ftbbackups"
    environment:
      RCON_PASSWORD: "<redacted>"
      RCON_HOST: mc
      BACKUP_INTERVAL: "2h"
      BACKUP_METHOD: "rclone"
      PAUSE_IF_NO_PLAYERS: "true"
      INITIAL_DELAY: "10m"
      EXCLUDES: "*.jar,cache,logs"
      RCLONE_REMOTE: "<redacted>"
      RCLONE_COMPRESS_METHOD: "gzip"
      DEST_DIR: "/backups"
      RCLONE_DEST_DIR: "<redacted>"
    restart: always
    volumes:
      - ./worlddata:/data:ro
      - ./rclone.conf:/config/rclone/rclone.conf:ro
itzg commented 1 year ago

At first the relative path it reported was confusing to me, but then I remembered it'll be relative to /data. The "jfr" part of the filename refers to Java flight recorder, so it's as if a mod, etc left behind a profiler recording session's temp file.

I'd be curious if that file actually exists over on the Minecraft data side and if so, does removing the file manually fix it?

I'm also wondering if tar might have an option to ignore such file state?

landonpierce commented 1 year ago

@itzg So I do think you're on the right track about a mod or something moving files around. I noticed over the weekend that the backup started complaining about a file that I know I happened to accidentally move while it was executing a backup. It seems as though if a mod deletes or otherwise moves a file during the backup execution that tar was supposed to be archiving, it will generate an error and the backup will fail.

I guess this is probably an issue that is semi-unique to modded minecraft, since vanilla doesn't change/move/delete files in the same way.

itzg commented 1 year ago

I was just thinking: if it's consistently failing on tmp and/or jfr files, then you could try adding that to the EXCLUDES, like

EXCLUDES: "*.jar,cache,logs,*.tmp"
imstevenxyz commented 1 year ago

I was just thinking: if it's consistently failing on tmp and/or jfr files, then you could try adding that to the EXCLUDES, like

EXCLUDES: "*.jar,cache,logs,*.tmp"

I had a similar problem with Purpur and the temporary files of its spark plugin. Excluding *.tmp fixed the backup. It might be a good idea to have this as default since such files rarely need to be backed up.

itzg commented 1 year ago

It might be a good idea to have this as default since such files rarely need to be backed up.

Great idea 😀