jborg / attic

Deduplicating backup program

attic check --repository-only reports "Error reading segment ###" at different locations #176

Open tgharold opened 9 years ago

tgharold commented 9 years ago

This is with Attic 0.13. When checking a repository that is accessed and backed up via SSH, the "Error reading segment" location differs from run to run.

attic check --repository-only ssh://example.com/backup/beta/athens/attics/smb-software.attic

Starting repository check...
Error reading segment 6322
Index object count mismatch. 967476 != 967402
attic: Exiting with failure status due to previous errors

Running it again produces the output:

Starting repository check...
Error reading segment 2531
Error reading segment 4117
Index object count mismatch. 967476 != 967319
attic: Exiting with failure status due to previous errors

On the third run, it reports a different segment number as the culprit:

Starting repository check...
Error reading segment 7980
Index object count mismatch. 967476 != 967407
attic: Exiting with failure status due to previous errors

Running it a fourth time gives yet another result:

Starting repository check...
Error reading segment 300
Error reading segment 1263
Index object count mismatch. 967476 != 967291
attic: Exiting with failure status due to previous errors

Running the check locally on the remote system gives yet another result:

attic check --repository-only /backup/beta/athens/attics/smb-software.attic

Starting repository check...
Error reading segment 4985
Index object count mismatch. 967476 != 967404
attic: Exiting with failure status due to previous errors
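To compare runs like the ones above side by side, the check output can be reduced to the reported segment numbers and the gap between the two index counts. The following is only a minimal sketch: `summarize_check_output` is a hypothetical helper, not part of Attic, and it assumes the message wording is exactly as pasted in this report. It does not assume which of the two counts is the stored index and which is the rebuilt one; it only reports their difference.

```python
import re

# Patterns match the messages exactly as they appear in the pasted output.
SEGMENT_RE = re.compile(r"Error reading segment (\d+)")
MISMATCH_RE = re.compile(r"Index object count mismatch\. (\d+) != (\d+)")


def summarize_check_output(text):
    """Return (segment_numbers, count_difference) for one check run.

    count_difference is first count minus second count, or None if no
    mismatch line is present. Hypothetical helper, not an Attic API.
    """
    segments = [int(n) for n in SEGMENT_RE.findall(text)]
    match = MISMATCH_RE.search(text)
    difference = None
    if match:
        first, second = int(match.group(1)), int(match.group(2))
        difference = first - second
    return segments, difference


# Output of the first two runs from this report.
runs = [
    "Starting repository check...\n"
    "Error reading segment 6322\n"
    "Index object count mismatch. 967476 != 967402\n",
    "Starting repository check...\n"
    "Error reading segment 2531\n"
    "Error reading segment 4117\n"
    "Index object count mismatch. 967476 != 967319\n",
]

for number, output in enumerate(runs, start=1):
    segments, difference = summarize_check_output(output)
    print(f"run {number}: segments={segments} difference={difference}")
```

Feeding each run's captured output through the helper makes it easier to see that both the failing segments and the size of the mismatch change between runs, while the first count stays at 967476.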

Approximate details about this particular repository are:

Initializing cache...

Archive name: daily-20150113-1800
Archive fingerprint: c80f6ede0b77be9e63f5cfb78bf5031d50d671e333b4ce265c0705a4d45d7061
Start time: Tue Jan 13 18:00:05 2015
End time: Tue Jan 13 22:25:17 2015
Duration: 4 hours 25 minutes 12.22 seconds
Number of files: 157115

                       Original size      Compressed size    Deduplicated size
This archive:               72.53 GB             65.00 GB             52.44 GB
All archives:               72.53 GB             65.00 GB             52.44 GB

So it's only a modest-sized backup. The "index.#####" file is only 81 MB. There are about 9900 segment files under the data/0 directory in the repository.

ThomasWaldmann commented 9 years ago

@tgharold can you reproduce with 0.14 also?

tgharold commented 9 years ago

I'll have to give it a try. What I found was that re-running "attic check --repository-only" would consistently give errors at different locations, as shown. It wasn't until I ran "attic check" without the "--repository-only" flag that attic was able to find and fix the broken chunk. Once the chunk was fixed, things worked fine.

narkisr commented 9 years ago

This happens to me in 0.14:

Starting repository check...
Error reading segment 13478
Error reading segment 14429
Index object count mismatch. 1279463 != 1279310
attic: Exiting with failure status due to previous errors

ThomasWaldmann commented 9 years ago

Could the corruption be related to msgpack/msgpack-python#124?

@tgharold @narkisr : Please state the msgpack version you used in your test and whether it used the compiled C version of msgpack or the pure-python version of msgpack.
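One way to gather the requested information is to ask the installed package itself. This is a rough sketch, not an official diagnostic: `describe_msgpack` is a hypothetical helper, and it assumes that the pure-Python implementation lives in a module whose name ends in `fallback` (as in `msgpack.fallback`), so any other module for `Packer` is treated as the compiled extension. Run it with the same Python interpreter that Attic uses.

```python
import importlib


def describe_msgpack():
    """Report the msgpack version and which Packer implementation is active.

    Hypothetical helper. Assumption: the pure-Python Packer is defined in a
    module named like "msgpack.fallback"; anything else is reported as the
    compiled extension.
    """
    try:
        msgpack = importlib.import_module("msgpack")
    except ImportError:
        return {"installed": False, "version": None,
                "packer_module": None, "implementation": None}

    packer_module = msgpack.Packer.__module__
    if packer_module.endswith("fallback"):
        implementation = "pure-python"
    else:
        implementation = "compiled"
    return {
        "installed": True,
        # msgpack exposes its version as a tuple, e.g. (0, 4, 6).
        "version": getattr(msgpack, "version", None),
        "packer_module": packer_module,
        "implementation": implementation,
    }


print(describe_msgpack())
```

The `packer_module` value is included in the result so that the classification can be checked by eye if the naming assumption does not hold for a given msgpack release.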

whiteadam commented 9 years ago

I think I might be having this issue too. If it helps, I'm using the pre-compiled version 0.15 with msgpack._packer.so

Sorry if that's not the right info, I'm not a big C or Python guy :)

Starting repository check...
Error reading segment 0
Error reading segment 1
Error reading segment 2
Error reading segment 3
Error reading segment 4
Error reading segment 5
Error reading segment 6
Error reading segment 7
Error reading segment 8
Error reading segment 9
Index object count mismatch. 614319 != 613381
attic: Exiting with failure status due to previous errors

whiteadam commented 9 years ago

I've worked out that, in my case, this happens pretty much daily. I have 13 servers backing up over SSH to a central server, with a disk on iSCSI.

I'm not yet sure how to prevent or remedy these daily errors, though.

tgharold commented 9 years ago

This still happens in 0.16. Running "attic check" against the repository results in different segment numbers being reported.

MsgPack version on the origin machine (the one being backed up):

Installed Packages
Name        : python-msgpack
Arch        : x86_64
Version     : 0.4.6
Release     : 1.el6
Size        : 243 k
Repo        : installed
From repo   : epel
Summary     : A Python MessagePack (de)serializer
URL         : http://pypi.python.org/pypi/msgpack-python/
License     : ASL 2.0
Description : MessagePack is a binary-based efficient data interchange format that is
            : focused on high performance. It is like JSON, but very fast and small.
            : This is a Python (de)serializer for MessagePack.

And on the destination machine (housing the repository):

Installed Packages
Name        : python-msgpack
Arch        : x86_64
Version     : 0.4.6
Release     : 1.el6
Size        : 243 k
Repo        : installed
From repo   : epel
Summary     : A Python MessagePack (de)serializer
URL         : http://pypi.python.org/pypi/msgpack-python/
License     : ASL 2.0
Description : MessagePack is a binary-based efficient data interchange format that is
            : focused on high performance. It is like JSON, but very fast and small.
            : This is a Python (de)serializer for MessagePack.

Now, it's possible that corruption crept in before the upgrade to 0.16. I'm running a --repair now.