borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

cache_sync parse error: Unexpected object: integer #8185

Closed ginkel closed 2 months ago

ginkel commented 2 months ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

BUG

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

1.2.7

Operating system (distribution) and version.

Ubuntu 20.04

Hardware / network configuration, and filesystems used.

Hetzner AX51 / Storage Box, btrfs on client, unknown on StorageBox

How much data is handled by borg?

1 TB

Full borg commandline that led to the problem (leave out excludes and passwords)

/usr/bin/borg info --glob-archives {fqdn}-* --json ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus

Describe the problem you're observing.

During a running backup, the host being backed up ran out of free disk space. After remedying the situation and aborting the running borg process (by rebooting the host), borg now aborts while syncing caches:

# borgmatic info
cache_sync parse error: Unexpected object: integer
Local Exception
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5343, in main
    exit_code = archiver.run(args)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5263, in run
    return set_ec(func(args))
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 182, in wrapper
    with Cache(repository, kwargs['key'], kwargs['manifest'],
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 387, in __new__
    return local()
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 378, in local
    return LocalCache(repository=repository, key=key, manifest=manifest, path=path, sync=sync,
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 497, in __init__
    self.sync()
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 904, in sync
    self.chunks = create_master_idx(self.chunks)
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 858, in create_master_idx
    fetch_and_build_idx(archive_id, decrypted_repository, archive_chunk_idx)
  File "/usr/lib/python3/dist-packages/borg/cache.py", line 767, in fetch_and_build_idx
    sync.feed(data)
  File "src/borg/hashindex.pyx", line 532, in borg.hashindex.CacheSynchronizer.feed
ValueError: cache_sync_feed failed: Unexpected object: integer

Platform: Linux pegasus.XXXXXX.net 5.4.0-174-generic #193-Ubuntu SMP Thu Mar 7 14:29:28 UTC 2024 x86_64
Linux: Unknown Linux  
Borg: 1.2.7  Python: CPython 3.8.10 msgpack: 0.6.2 fuse: llfuse 1.3.6 [pyfuse3,llfuse]
PID: 2269223  CWD: /root/.cache
sys.argv: ['/usr/bin/borg', 'info', '--glob-archives', '{fqdn}-*', '--json', 'ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus']
SSH_ORIGINAL_COMMAND: None

[the command invocation is via borgmatic, but the exact borg command line shows up in the output]

This is possibly a duplicate of #3454, which was apparently fixed in 1.1.4.

I tried whether removing ~/.cache/borg solves the issue, but it does not (so this issue is probably not due to a corrupt local cache).
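For reference, a minimal sketch of how the cache removal can be done (default cache location and the repo URL from above are assumed; borg delete --cache-only is the per-repository variant):

# remove only this repo's local cache; it is rebuilt on the next access
borg delete --cache-only ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus
# or wipe the whole client-side cache (default location assumed)
rm -rf ~/.cache/borg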

I am also backing up the host to another borg repo, where the issue does not occur.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

The issue is reproducible on each invocation, but I'm not sure if (and how) I'd be able to bring an arbitrary repo into that state.

Include any warning/errors/backtraces from the system logs

See above for the full backtrace.

ThomasWaldmann commented 2 months ago

Hmm, that code reads the archive metadata streams from the repo and rebuilds the local chunk index from them.

Looks like one archive (which can also be a "hidden" .checkpoint archive) might be corrupt.

Guess I would do the following (sketched below):

If you used borg < 1.2.5 for that repo, first follow the CVE-related steps from the top of the changelog (that is important in case you want to use borg check --repair later).

Then try deleting the suspect archive(s); the .checkpoint archive is a likely candidate.

Does that fix it?

If not, run borg check --repair REPO.
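A rough sketch of those steps as shell commands (repo URL as above; the archive name is a placeholder to be taken from the list output, and the CVE-related commands themselves are the ones documented in the changelog, not shown here):

# list archives; in 1.2.x, checkpoint archives are hidden unless requested
borg list --consider-checkpoints ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus
# delete the suspect archive (placeholder name)
borg delete ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus::pegasus-2024-03-30T02:00:00.checkpoint
# only if the error persists after that:
borg check --repair ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus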

ginkel commented 2 months ago

Thanks for the speedy response! :-)

The oldest archive in the repo is from Jan 2023, so the repo has been touched by a pre-1.2.5 version. The PPA I am using to install borg does not yet provide 1.2.8, though. :-/

There is one .checkpoint archive. Is it safe to execute the steps you mentioned as long as I do not run borg check --repair before applying the steps from the changelog?

ThomasWaldmann commented 2 months ago

Yes, as long as you don't run borg check --repair, it won't kill any archives with TAM issues.

For the CVE-related upgrade steps, make sure to read the docs for your borg version (e.g. the steps in the 1.2.8 docs won't work when using borg 1.2.7).
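In case it helps, a quick way to confirm which client version is actually in use before picking the matching changelog section (the stable docs URL is shown; a version-pinned URL may or may not exist for your release):

borg --version
# changelog with the CVE-related upgrade notes near the top:
# https://borgbackup.readthedocs.io/en/stable/changes.html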

ThomasWaldmann commented 2 months ago

I checked the current (1.2.7) source: yes, this seems to be the same issue (a corrupt archive) as described there:

https://github.com/borgbackup/borg/issues/3454#issuecomment-352294792

ginkel commented 2 months ago

Thanks! I applied the upgrade steps from the 1.2.7 changelog - the repo wasn't affected. Deleting the candidate archives did not fix it; I am currently running borg check --repair.

ginkel commented 2 months ago

It seems that borg check --repair fixed the issue:

borg check --repair ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus
/usr/bin/borg:6: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import load_entry_point
This is a potentially dangerous function.
check --repair might lead to data loss (for kinds of corruption it is not
capable of dealing with). BE VERY CAREFUL!

Type 'YES' if you understand this and want to continue: YES
Enter passphrase for key ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus: 
Unpacker crashed while unpacking item metadata, trying to resync... [chunk: 003515_7cddc8c390eb7ab3643fc106e4bf98efad5f3c26fbc4e0fc534efa0e8604d715]
5042 orphaned objects found!
Archive consistency check complete, problems found.

New backups are no longer failing.
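For completeness, a read-only re-check is one way to confirm the repair took (commands assume a 1.2.x client; --verify-data additionally verifies the chunk contents and is much slower):

borg check ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus
# optional, much slower:
borg check --verify-data ssh://uXXXXXX@uXXXXXX.your-storagebox.de:23/./borg/pegasus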

Thanks for your help!