borgbackup / borg

Deduplicating archiver with compression and authenticated encryption.
https://www.borgbackup.org/

Data integrity error: Segment entry data short read [segment 2, offset 36478608]: expected 676, got 359 bytes #8189

Closed Massedil closed 5 months ago

Massedil commented 5 months ago

Have you checked borgbackup docs, FAQ, and open GitHub issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

BUG

System information. For client/server mode post info for both machines.

My config:

Your borg version (borg -V).

From Debian repo:

Operating system (distribution) and version.

Hardware / network configuration, and filesystems used.

How much data is handled by borg?

# borg info
[...]
                       Original size      Compressed size    Deduplicated size
All archives:              107.91 GB             36.56 GB              9.77 GB
[...]

Full borg command line that led to the problem (leave out excludes and passwords)

From local.example:

# borg check /mnt
Data integrity error: Segment entry data short read [segment 2, offset 36478608]: expected 676, got 359 bytes

Describe the problem you're observing.

No backup can run, and I can't repair the repository.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes.

Include any warning/errors/backtraces from the system logs

On local.example:

# borg check /mnt
Data integrity error: Segment entry data short read [segment 2, offset 36478608]: expected 676, got 359 bytes
^CKeyboard interrupt
# borg check --repair /mnt
This is a potentially dangerous function.
check --repair might lead to data loss (for kinds of corruption it is not
capable of dealing with). BE VERY CAREFUL!

Type 'YES' if you understand this and want to continue: yes
Invalid answer, aborting.
# borg check --repair /mnt
This is a potentially dangerous function.
check --repair might lead to data loss (for kinds of corruption it is not
capable of dealing with). BE VERY CAREFUL!

Type 'YES' if you understand this and want to continue: YES
Data integrity error: Segment entry data short read [segment 2, offset 36478608]: expected 676, got 359 bytes
Fatal Python error: Bus error

Current thread 0x00007fecece1e740 (most recent call first):
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 1476 in recover_segment
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 984 in check
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 343 in do_check
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 177 in wrapper
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4622 in run
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4690 in main
  File "/usr/bin/borg", line 33 in <module>
Bus error
# borg check --repair /mnt
Killed stale lock XXXXXXXXXX@95529533698.816865-0.
Removed stale exclusive roster lock for host XXXXXXXXXX@95529533698 pid 816865 thread 0.
Removed stale exclusive roster lock for host XXXXXXXXXX@95529533698 pid 816865 thread 0.
This is a potentially dangerous function.
check --repair might lead to data loss (for kinds of corruption it is not
capable of dealing with). BE VERY CAREFUL!

Type 'YES' if you understand this and want to continue: YES
Data integrity error: Segment entry data short read [segment 2, offset 36478608]: expected 676, got 359 bytes
Fatal Python error: Bus error

Current thread 0x00007f2059f8f740 (most recent call first):
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 1476 in recover_segment
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 984 in check
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 343 in do_check
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 177 in wrapper
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4622 in run
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4690 in main
  File "/usr/bin/borg", line 33 in <module>
Bus error

On remote.example:

remote.example# borg check /XXXXXXXXXX/backup
Failed to create/acquire the lock /XXXXXXXXXX/backup/lock.exclusive (timeout).
remote.example# borg break-lock /XXXXXXXXXX/backup
remote.example# borg check /XXXXXXXXXX/backup
Local Exception
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5213, in main
    exit_code = archiver.run(args)
                ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5144, in run
    return set_ec(func(args))
                  ^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 183, in wrapper
    return method(self, args, repository=repository, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 343, in do_check
    if not repository.check(repair=args.repair, save_space=args.save_space, max_duration=args.max_duration):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 1039, in check
    objects = list(self.io.iter_objects(segment))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 1512, in iter_objects
    size, tag, key, data = self._read(fd, self.header_fmt, header, segment, offset,
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 1606, in _read
    data = fd.read(length)
           ^^^^^^^^^^^^^^^
OSError: [Errno 5] Input/output error

Platform: Linux a20 6.1.0-18-armmp-lpae #1 SMP Debian 6.1.76-1 (2024-02-01) armv7l
Linux: Unknown Linux  
Borg: 1.2.4  Python: CPython 3.11.2 msgpack: 1.0.3 fuse: pyfuse3 3.2.1 [pyfuse3,llfuse]
PID: 3396  CWD: /root
sys.argv: ['/usr/bin/borg', 'check', '/XXXXXXXXXX/backup']
SSH_ORIGINAL_COMMAND: None

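Note for readers hitting the same traceback: the OSError: [Errno 5] above is raised by a plain fd.read() call, so the failure can be checked without borg. The following is a minimal sketch, assuming Python 3; the function name first_unreadable_offset and the 1 MiB chunk size are illustrative choices and not part of borg.

```python
import errno


def first_unreadable_offset(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and return the starting offset of the first
    chunk whose read fails with EIO, or None if the whole file is readable.

    Hypothetical helper for illustration; it is not a borg API.
    """
    offset = 0
    # buffering=0 gives a raw file object, so each read() call is a single
    # OS-level read that starts at the offset tracked here.
    with open(path, "rb", buffering=0) as fd:
        while True:
            try:
                chunk = fd.read(chunk_size)
            except OSError as exc:
                if exc.errno == errno.EIO:
                    return offset
                raise  # any other error is unexpected, so let it propagate
            if not chunk:
                return None  # reached end of file without an I/O error
            offset += len(chunk)
```

Calling first_unreadable_offset(path) on a suspect segment file returns None when every byte can be read. A returned offset means the kernel refused a read somewhere in the chunk starting at that offset, which points at the filesystem or the disk rather than at borg.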
remote.example# btrfs check --force --readonly /dev/sda1
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/sda1
UUID: 5da7c9af-8af2-48cf-aa9d-368744e735ed
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 13055619072 bytes used, no error found
total csum bytes: 12733372
total tree bytes: 16646144
total fs tree bytes: 1146880
total extent tree bytes: 229376
btree space waste bytes: 2516436
file data blocks allocated: 13038972928
 referenced 13037830144

# /usr/sbin/smartctl -q errorsonly -t short /dev/sda

# /usr/sbin/smartctl -t short /dev/sda
[...]
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      4510         -
# 2  Short offline       Completed without error       00%      4265         -
[...]

Massedil commented 5 months ago

Hmph!

Although btrfs check /dev/sda1 reports that everything is OK (even when run offline), the kernel log says otherwise:

2024-04-15T16:18:22.316672+02:00 a20 kernel: [876806.988444] BTRFS warning (device sda1): csum failed root 5 ino 831 off 4096 csum 0xca6697a5 expected csum 0x7a9c7f00 mirror 1
2024-04-15T16:18:22.316814+02:00 a20 kernel: [876806.988518] BTRFS error (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 865, gen 0
2024-04-15T16:18:22.319210+02:00 a20 kernel: [876807.000089] BTRFS warning (device sda1): csum failed root 5 ino 831 off 4096 csum 0xca6697a5 expected csum 0x7a9c7f00 mirror 1
2024-04-15T16:18:22.319357+02:00 a20 kernel: [876807.000159] BTRFS error (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 866, gen 0
2024-04-15T16:18:22.339604+02:00 a20 kernel: [876807.011380] BTRFS warning (device sda1): csum failed root 5 ino 831 off 4096 csum 0xca6697a5 expected csum 0x7a9c7f00 mirror 1
2024-04-15T16:18:22.339730+02:00 a20 kernel: [876807.011441] BTRFS error (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 867, gen 0

So the corruption comes from the underlying storage, not from borg, and I am closing the issue.
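For anyone triaging similar kernel messages, here is a minimal sketch that summarizes BTRFS checksum failures per inode and tracks the highest "corrupt" counter per device. It assumes the log lines have the same shape as the ones quoted in this comment; the regular expressions and the function name summarize are illustrative, not an official parser.

```python
import re
from collections import Counter

# Matches lines like:
#   BTRFS warning (device sda1): csum failed root 5 ino 831 off 4096 ...
CSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<dev>[^)]+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)

# Matches lines like:
#   BTRFS error (device sda1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 865, gen 0
STATS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def summarize(lines):
    """Return (failures, max_corrupt).

    failures counts csum warnings keyed by (device, root, inode).
    max_corrupt maps each device to the highest corrupt counter seen.
    """
    failures = Counter()
    max_corrupt = {}
    for line in lines:
        m = CSUM_RE.search(line)
        if m:
            failures[(m["dev"], int(m["root"]), int(m["ino"]))] += 1
            continue
        m = STATS_RE.search(line)
        if m:
            dev = m["dev"]
            max_corrupt[dev] = max(max_corrupt.get(dev, 0), int(m["corrupt"]))
    return failures, max_corrupt
```

Fed the six lines above, this groups all warnings under a single key for device sda1, root 5, inode 831, which suggests one damaged file rather than widespread corruption. Note also that the btrfs check run earlier printed "checking only csums items (without verifying data)", so it never read the data blocks themselves; that is consistent with it reporting no error while reads of the file still fail.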