CrysK opened this issue 2 months ago
It seems that you have issues in very old segment files (5, 7, 28, ...) and also in a possibly recent segment file (18080). That looks quite bad and strange.
Ideas:
- either you had problems for a long time (did you ever run borg check before?)
This could be the case, but I hope not. Other old files on the same partition, outside of borg, still seem to be fine.
Unfortunately, I have never run the check before.
- there is still an unfixed hw or sw issue that corrupts hdd data on its way to borg.
Since the error, I have not changed anything on the backup and have not added a new one. Where could new errors come from?
The best I can do now is to start a borg check --repair REPO?
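So, concretely, something like this (a sketch; the environment variable to skip the interactive warning is documented, but I am not sure skipping it is wise here):

```
# check --repair warns that it is a potentially dangerous operation;
# BORG_CHECK_I_KNOW_WHAT_I_AM_DOING=YES would skip the interactive prompt
sudo borg check --repair /media/sicherungen/uNAS/
```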
edit2:
Another question: I have the same backup (a few months older, but also 4.2 TB) on another server with an HDD and an Intel Atom D510. There, borg check REPO (without further parameters) has been running for almost 24 hours (one CPU at 100%). No error or message has been printed yet. Can I see how long this will take? Is it normal that it takes this long?
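Is there at least a way to watch the progress? I would try something like this (assuming check accepts the usual -p/--progress flag and that the repository and archive parts can be split; path as an example):

```
# show a progress indicator while checking (assumption: -p/--progress works for check)
borg check --progress /media/sicherungen/uNAS/

# or split the work: segment/CRC check first, archive metadata check afterwards
borg check --repository-only /media/sicherungen/uNAS/
borg check --archives-only /media/sicherungen/uNAS/
```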
edit3: My wish was granted; after just over 24 hours it finished.
```
$ sudo borg check /media/sicherungen/uNAS/
371 orphaned objects found!
Archive consistency check complete, problems found.
```
I have never run a borg compact. So do the orphaned objects come from this? And are these also the "problems" it found?
Errors can also come from HW and SW malfunctions, like RAM or CPU issues, maybe the storage controller or even cables. The kernel and filesystem could also be the culprit.
As long as there is such an error source present (e.g. bad RAM), borg check --repair could do more harm than good.
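If such a source is suspected, it can be worth testing the hardware before any repair (a rough sketch; tool names and device paths are examples and depend on your system):

```
# disk health via SMART (smartmontools)
sudo smartctl -a /dev/sdX

# quick userspace RAM test (memtester); a bootable memtest86+ run is more thorough
sudo memtester 1024M 3
```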
Orphaned objects are harmless (== chunks that are not used by any archive) and nothing to worry about.
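If you want to reclaim the space they occupy later: after a successful borg check --repair, a borg compact should free it (minimal sketch, assuming borg >= 1.2):

```
# compact frees segment space occupied by deleted/unreferenced data (borg >= 1.2)
borg compact /media/sicherungen/uNAS/
```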
@ThomasWaldmann
I copied the complete backup to another server (onto another HDD). There I started a borg check --repair, this time with version 1.2.8. It has now been running for over 50 hours, but only ~70% of the segments have been checked, and since segment 11367 all segments are mismatching (this was not the case last time, on the old server, with check without --repair).
Does it make sense to let it run? Or am I wasting my time?
There is quite a lot of corruption in these segment files. That checksum (CRC32) should always be correct.
So either you had a systematic read problem when copying the files to the new location, or the on-disk content at the old location is already badly corrupted.
It's hard to say whether it makes sense to continue, because one cannot easily know how your backups are distributed over the segments or what percentage of the data is damaged. But there is a lot of damage.
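If the old copy still exists, one way to tell these two cases apart is to checksum the segment files on both sides and compare (a sketch; paths are placeholders, run from inside each repository directory):

```
# run inside each repository directory, then copy one list over and compare
find data -type f -exec sha256sum {} + | sort -k 2 > /tmp/segments.sha256
diff /tmp/segments-old.sha256 /tmp/segments-new.sha256
```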
BTW, when upgrading to borg > 1.2.4 you must read and follow the important note at the top of the changelog. In your case, I guess you should not try to upgrade (1.2.4 would be fine though) and fix your repo issues at the same time.
The checksum mismatches are not related to the upgrade though.
Thank you for everything!
BTW, when upgrading to borg > 1.2.4
https://github.com/borgbackup/borg/blob/1.2.4/docs/changes.rst#version-124-2023-03-24
By this, do you mean the upgrade from version 1.1.x to a version > 1.2.4? I'm pretty sure I was already using version 1.2.0 before. But I will run check, compact --cleanup-commits, check and then info again!
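If I understand the note correctly, that boils down to something like this (REPO as a placeholder for the repository path):

```
# sequence from the 1.2.4 changelog note (REPO is a placeholder)
borg check REPO
borg compact --cleanup-commits REPO
borg check REPO
borg info REPO
```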
You also need to follow these notes when upgrading from 1.2.x (x <= 4) to 1.2.y (y > 4).
Hi folks,
I recently had a hardware failure in one of my servers. The HDDs probably suffered power failures from time to time, presumably because there was too high a load on one rail of the power supply. Unfortunately, this went on for 2 days.
Afterwards I was able to restore the RAID5 partition with 3 of 4 disks (plus a spare); so far, OK. I then mounted the other two HDDs again and waited until the resync was finished. Unfortunately, fsck still showed many errors that could not be completely fixed (it doesn't matter whether I answer yes or no; the errors remain):
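The command I used was roughly this (a sketch; assuming ext4, with the md device name as a placeholder; the actual error output is omitted here):

```
# force a full check (-f) and answer yes to all repair questions (-y)
sudo fsck.ext4 -fy /dev/md0
```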
The borg check is also not very successful:
Is there a way to find all archives (backup points) that are still without problems and delete the rest?
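What I have in mind is something like this (untested sketch; I am assuming that a failing check returns a non-zero exit code and that --glob-archives accepts a literal archive name):

```
# check every archive individually and report the broken ones (untested sketch)
for a in $(borg list --short REPO); do
    if ! borg check --archives-only --glob-archives "$a" REPO; then
        echo "broken: $a"   # review first, then maybe: borg delete REPO::"$a"
    fi
done
```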
System:
Thank you very much!