jborg / attic

Deduplicating backup program

OverflowError: value too large to convert to int #326

Open tgharold opened 9 years ago

tgharold commented 9 years ago

Attic 0.16 (Python 3.4)

This occurred during "attic check" (0.16 on both ends).

This particular backup has 172 archives, 174GB, original size for all of the archives is stated as 18.14TB.

Traceback (most recent call last):
  File "/root/build/prefix/lib/python3.4/site-packages/attic/remote.py", line 301, in get_many
  File "attic/hashindex.pyx", line 96, in attic.hashindex.NSIndex.__getitem__ (attic/hashindex.c:2171)
KeyError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/build/prefix/lib/python3.4/site-packages/cx_Freeze/initscripts/Console.py", line 27, in <module>
  File "prefix/bin/attic", line 3, in <module>
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 730, in main
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 720, in run
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archiver.py", line 85, in do_check
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archive.py", line 547, in check
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archive.py", line 696, in rebuild_refcounts
  File "/root/build/prefix/lib/python3.4/site-packages/attic/archive.py", line 673, in robust_iterator
  File "/root/build/prefix/lib/python3.4/site-packages/attic/remote.py", line 305, in get_many
  File "/root/build/prefix/lib/python3.4/site-packages/attic/remote.py", line 291, in store_object
  File "attic/hashindex.pyx", line 102, in attic.hashindex.NSIndex.__setitem__ (attic/hashindex.c:2280)
OverflowError: value too large to convert to int

ThomasWaldmann commented 9 years ago

https://github.com/jborg/attic/blob/master/attic/remote.py#L291 this is the line crashing.

so, the file offset goes beyond 2^31-1 (max. signed 32bit int) here.
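To make the failure mode concrete, here is a minimal sketch in pure Python. It assumes, as the comment above suggests, that the index stores each value as a signed 32-bit integer. `Int32Index` is a hypothetical stand-in written for illustration and is not Attic's `NSIndex`; it only mimics the range check that a C `int` conversion would perform.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class Int32Index(dict):
    """Hypothetical stand-in for an index whose values must fit in a C int."""

    def __setitem__(self, key, value):
        offset, size = value
        for field in (offset, size):
            # Mimic the conversion of a Python int to a signed 32-bit C int.
            if not INT32_MIN <= field <= INT32_MAX:
                raise OverflowError("value too large to convert to int")
        super().__setitem__(key, (offset, size))


index = Int32Index()
index[b"chunk-a"] = (INT32_MAX, 4096)  # last offset that still fits

try:
    # One byte further into the cache file and the offset no longer fits.
    index[b"chunk-b"] = (INT32_MAX + 1, 4096)
except OverflowError as exc:
    error = str(exc)
```

Under that assumption, any chunk written past the 2 GiB mark of the cache file cannot be recorded in the index, which matches the traceback above.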

ThomasWaldmann commented 9 years ago

some more thoughts:

it looks like the RepoCache temp file (where it caches chunks received from the remote repo server, so it does not have to transfer them again in case they are referenced again) grows beyond 2GB.

that could be solved by just using multiple, numbered files (as done in the repo) and storing number + offset in the index. or by using a 64-bit offset.
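The "multiple, numbered files" idea could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name `SegmentedCache`, the `max_segment_size` parameter, and the `(segment, offset, size)` index layout are hypothetical and not taken from Attic's code. The point is only that each stored offset stays below the segment size limit.

```python
import os


class SegmentedCache:
    """Hypothetical chunk cache spread over numbered files in one directory."""

    def __init__(self, basedir, max_segment_size=2**31 - 1):
        self.basedir = basedir
        self.max_segment_size = max_segment_size
        self.segment = 0   # number of the file currently being appended to
        self.offset = 0    # next write position inside that file
        self.index = {}    # key -> (segment, offset, size)

    def _path(self, segment):
        return os.path.join(self.basedir, str(segment))

    def store(self, key, data):
        # Start a new numbered file rather than let a chunk cross the limit.
        if self.offset and self.offset + len(data) > self.max_segment_size:
            self.segment += 1
            self.offset = 0
        with open(self._path(self.segment), "ab") as fd:
            fd.write(data)
        self.index[key] = (self.segment, self.offset, len(data))
        self.offset += len(data)

    def load(self, key):
        segment, offset, size = self.index[key]
        with open(self._path(segment), "rb") as fd:
            fd.seek(offset)
            return fd.read(size)
```

With a directory from `tempfile.TemporaryDirectory()` and a deliberately tiny `max_segment_size`, storing a few chunks shows the rollover: later chunks land in file `1` at offset `0`. This does nothing about total disk usage, which is the concern raised next.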

but even then, this could still cause trouble, as people may run out of /tmp space (many Linux distributions put /tmp on a tmpfs in RAM, and even on disk, /tmp space might be limited).

so, use some LRU approach? make it possible to switch off that cache? check the chunks index for the most-used chunks and only cache them?
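As a rough illustration of the LRU idea, the sketch below bounds the cache by total bytes and evicts the least recently used chunks first. It keeps chunks in memory for simplicity, whereas the cache discussed here is a temp file; `LRUChunkCache` and `max_bytes` are hypothetical names, not part of Attic.

```python
from collections import OrderedDict


class LRUChunkCache:
    """Hypothetical byte-bounded cache; oldest-used chunks are evicted first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.chunks = OrderedDict()  # key -> data, least recently used first

    def get(self, key):
        data = self.chunks.get(key)
        if data is not None:
            self.chunks.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        if len(data) > self.max_bytes:
            return  # a chunk larger than the whole budget is never cached
        old = self.chunks.pop(key, None)
        if old is not None:
            self.used -= len(old)
        self.chunks[key] = data
        self.used += len(data)
        # Evict from the least recently used end until within budget.
        while self.used > self.max_bytes:
            _, evicted = self.chunks.popitem(last=False)
            self.used -= len(evicted)
```

A cache miss returns `None`, so the caller would fetch the chunk from the remote repository again. Setting `max_bytes` to `0` effectively switches the cache off, which covers the second suggestion as well.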

tgharold commented 9 years ago

Definitely seems to be related to the size of the directory created in /tmp. Once the file in it exceeds 2GB, attic check crashes with the overflow error.

ThomasWaldmann commented 9 years ago

Seems like this affects:

chrj commented 8 years ago

I'm also hit by this. This particular repository has 222GB of data in ~300 archives with 130TB of total, uncompressed, duplicated data. Attic is currently tracking roughly 3M files in the source data.

I recently moved the attic client to a new server and I guess the initial cache sync broke it.

Let me know if you need me to test anything. I'll keep the old repository around for a couple of months.

My immediate fix was to start over on a new repository.

I'm running Debian Jessie on both ends with Attic 0.16 and Python 3.4.2.

ghost commented 8 years ago

I'm also affected by this. I get this error when verifying an archive with FUSE and rsync --numeric-ids -aHAXScvx --delete --dry-run. I'm running Attic 0.16 on Ubuntu 14.04 amd64.

$ uname -a
Linux haramaki 4.2.0-34-generic #39~14.04.1-Ubuntu SMP Fri Mar 11 11:38:02 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.4 LTS
Release:    14.04
Codename:   trusty

$ python --version
Python 3.4.3

$ pip list
Attic (0.16)
llfuse (0.41.1)
msgpack-python (0.4.7)
pip (1.5.4)
setuptools (2.2)

Traceback (most recent call last):
  File "/opt/attic/lib/python3.4/site-packages/attic/remote.py", line 301, in get_many
    yield self.load_object(*self.index[key])
  File "attic/hashindex.pyx", line 96, in attic.hashindex.NSIndex.__getitem__ (attic/hashindex.c:2171)
KeyError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/attic/bin/attic", line 3, in <module>
    main()
  File "/opt/attic/lib/python3.4/site-packages/attic/archiver.py", line 730, in main
    exit_code = archiver.run(sys.argv[1:])
  File "/opt/attic/lib/python3.4/site-packages/attic/archiver.py", line 720, in run
    return args.func(args)
  File "/opt/attic/lib/python3.4/site-packages/attic/archiver.py", line 265, in do_mount
    operations.mount(args.mountpoint, args.options, args.foreground)
  File "/opt/attic/lib/python3.4/site-packages/attic/fuse.py", line 230, in mount
    llfuse.main(single=True)
  File "llfuse/fuse_api.pxi", line 319, in llfuse.capi.main (src/llfuse/capi_linux.c:25074)
  File "llfuse/handlers.pxi", line 323, in llfuse.capi.fuse_read (src/llfuse/capi_linux.c:9392)
  File "llfuse/handlers.pxi", line 324, in llfuse.capi.fuse_read (src/llfuse/capi_linux.c:9342)
  File "/opt/attic/lib/python3.4/site-packages/attic/fuse.py", line 204, in read
    chunk = self.key.decrypt(id, self.repository.get(id))
  File "/opt/attic/lib/python3.4/site-packages/attic/remote.py", line 294, in get
    return next(self.get_many([key]))
  File "/opt/attic/lib/python3.4/site-packages/attic/remote.py", line 305, in get_many
    self.store_object(key, data)
  File "/opt/attic/lib/python3.4/site-packages/attic/remote.py", line 291, in store_object
    self.index[key] = offset - len(data), len(data)
  File "attic/hashindex.pyx", line 102, in attic.hashindex.NSIndex.__setitem__ (attic/hashindex.c:2280)
OverflowError: value too large to convert to int

madssj commented 8 years ago

@superbacker you should probably use borg backup - a fork of attic that has fixed many, many bugs such as this one.

Attic doesn't seem to be maintained anymore.

ghost commented 8 years ago

@madssj Thx, I'll try.