Could be an error in the ExitStack backport I use on Python 2.7. Let me know if it also happens on Python 3.
Rebuilt with the python3.3 that is included in Ubuntu. The build worked, but it couldn't run because of issue 27.
~/dev/bedup$ sudo python3.3 ~/.local/bin/bedup dedup /data
Traceback (most recent call last):
File "/home/weirdtalk/.local/bin/bedup", line 9, in <module>
load_entry_point('bedup==0.9.0', 'console_scripts', 'bedup')()
File "/home/weirdtalk/.local/lib/python3.3/site-packages/bedup-0.9.0-py3.3-linux-x86_64.egg/bedup/__main__.py", line 487, in script_main
sys.exit(main(sys.argv))
File "/home/weirdtalk/.local/lib/python3.3/site-packages/bedup-0.9.0-py3.3-linux-x86_64.egg/bedup/__main__.py", line 476, in main
return args.action(args)
File "/home/weirdtalk/.local/lib/python3.3/site-packages/bedup-0.9.0-py3.3-linux-x86_64.egg/bedup/__main__.py", line 147, in vol_cmd
[volpath], tt, recurse=True)
File "/home/weirdtalk/.local/lib/python3.3/site-packages/bedup-0.9.0-py3.3-linux-x86_64.egg/bedup/filesystem.py", line 590, in load_vols
lo, sta = vol._fs._load_visible_vols([volpath], nest_desc=True)
File "/home/weirdtalk/.local/lib/python3.3/site-packages/bedup-0.9.0-py3.3-linux-x86_64.egg/bedup/filesystem.py", line 274, in _load_visible_vols
os.path.join(start_desc.description, relpath),
File "/usr/lib/python3.3/posixpath.py", line 92, in join
"components.") from None
TypeError: Can't mix strings and bytes in path components.
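For what it's worth, the failing join mixes a str (start_desc.description) with what is presumably a bytes relpath, which Python 3 refuses. A minimal sketch of the failure and the conventional os.fsdecode fix; the names mirror the traceback, but the values here are invented:

import os

description = '/data'       # str, like start_desc.description
relpath = b'subvol/snap'    # bytes, e.g. as returned by a kernel ioctl
try:
    os.path.join(description, relpath)
except TypeError as exc:
    print(exc)              # Can't mix strings and bytes in path components

# Decoding to str with the filesystem encoding avoids the mismatch:
print(os.path.join(description, os.fsdecode(relpath)))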
I'm having a similar error with Python 3.4:
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
File "/usr/lib64/python3.4/site-packages/contextlib2.py", line 244, in _invoke_next_callback
suppress_exc = _invoke_next_callback(exc_details)
RuntimeError: maximum recursion depth exceeded
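The repeated _invoke_next_callback frames show what goes wrong: contextlib2's ExitStack unwinds its exit callbacks recursively, one Python frame per registered callback, and bedup apparently registers a cleanup per candidate file, so any group of more than roughly a thousand same-size files blows the default recursion limit. A minimal sketch that should reproduce it against an affected contextlib2 release:

from contextlib2 import ExitStack  # the backport, not the stdlib module

stack = ExitStack()
for _ in range(2000):       # comfortably past the default limit of 1000
    stack.callback(lambda: None)
stack.close()               # RuntimeError: maximum recursion depth exceeded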
Ping. Any chance of a fix? Otherwise it's impossible to use bedup when you have too many files of the same size (say, with PostgreSQL installed).
Yep, same here, deduplicating some SVN workspaces :(
+1
Same for me... maximum recursion depth exceeded... on Debian and Ubuntu 14.04. What can we do?
I contacted the developer of contextlib2. He is aware of the problem; there is a fix, but he hasn't found the time to package a new release so far.
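Until a release with that fix lands, one stopgap (a workaround, not a fix) is to raise Python's recursion limit before handing control to bedup; a hypothetical wrapper using the script_main entry point visible in the tracebacks above. Be careful: an excessive limit can turn the RuntimeError into a hard interpreter crash.

import sys

sys.setrecursionlimit(50000)   # default is 1000; one frame per exit callback

from bedup.__main__ import script_main
script_main()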
Do you mean https://bitbucket.org/ncoghlan/contextlib2/commits/170d5144455767dc39065f804d18df2104df1b0c? Sadly, that seems to be the only possibly relevant commit after the last release, and it was committed around 1.5 years ago. contextlib2 seems pretty much dead to me.
So is bedup development dead? Where is the light at the end of the tunnel...?
I switched over to duperemove. It's not perfect on larger data sets, but you can run it on individual folders, so I just do smaller chunks. At least I never had immutable files left over or OOM issues, as I had with bedup again and again.
Good, that is what I do now. I hope it works for my 80 TB store.
I am deduplicating source-code branches, so I use a block size of 8K, while duperemove defaults to 128K; for really large storage you may want to increase that. With 8K, even my 200 GB (deduplicated down to 20 GB) generates well above 2 GB of hashes, and duperemove uses memory accordingly.
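If it helps anyone, here is a minimal sketch of that chunked approach, assuming duperemove is on PATH and takes -d (dedupe), -r (recurse) and -b (block size in bytes) as in its man page; the obvious trade-off is that duplicates spanning two chunks are missed:

import os
import subprocess
import sys

root = sys.argv[1] if len(sys.argv) > 1 else '.'
for entry in sorted(os.scandir(root), key=lambda e: e.path):
    if entry.is_dir(follow_symlinks=False):
        print('deduplicating', entry.path)
        # One bounded run per top-level folder keeps the hash table small.
        subprocess.run(['duperemove', '-dr', '-b', '8192', entry.path],
                       check=False)   # keep going if one chunk fails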
Where are the hashes stored? Only in memory? If the hashes are generated once, are they reused on later runs, with delta support? I checked out duperemove and it is at version 0.10-dev... I think I remember some OOMs with duperemove 0.08 on my 80 TB store.
I did not check the code, but it looks like the hashes are stored in memory. You can write them to a file, but they will still be loaded back into memory as a whole; at least that is what it looks like. From the issues over there, it sounds like they want to move that to SQLite, which I would guess could cut the memory consumption a lot. It's probably better to ask there directly.
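If your build has it, duperemove's --hashfile option keeps the hash database on disk between runs, so a re-scan only rehashes files whose size or mtime changed; a one-liner in the same style as above, assuming the flag is available in your 0.10-dev build:

import subprocess

subprocess.run(['duperemove', '-dr', '--hashfile=/var/tmp/dupes.db', '/data'])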
OK, I have tested it now... my 80 TB store consumed 128 GB of RAM with a read-only duperemove run... my server has 256 GB of RAM, so no problem for me :) The really great thing about duperemove is the multithreaded hashing, so with 24 cores and 256 GB of RAM it works.
bedup is now Python3-only (way overdue), and doesn't use contextlib2 anymore.
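For anyone still on an old install: the standard library's contextlib.ExitStack unwinds its callbacks in a loop rather than by recursion, so the stress test from earlier in this thread completes cleanly on Python 3:

from contextlib import ExitStack   # stdlib, iterative unwinding

with ExitStack() as stack:
    for _ in range(100000):
        stack.callback(lambda: None)
# exits without ever touching the recursion limit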
On Ubuntu 13.10 with a one-week-old btrfs RAID10 with lzo compression on 4x2 TB drives, ~2 TB of data, and ~10 snapshots. I was curious to see what the savings from deduplication would be, so I cloned master and ran:
sudo ./dedup /data
for about 24 hours and reclaimed ~80 GB before getting this error (omitting duplicate lines; sorry, I don't have the top of the trace):
File "~/.local/lib/python2.7/site-packages/contextlib2.py", line 244, in _invoke_next_callback suppress_exc = _invoke_next_callback(exc_details) File "~/.local/lib/python2.7/site-packages/contextlib2.py", line 244, in _invoke_next_callback suppress_exc = _invoke_next_callback(exc_details) File "~/.local/lib/python2.7/site-packages/contextlib2.py", line 246, in _invoke_next_callback suppress_exc = cb(sys.exc_info()) File "~/.local/lib/python2.7/site-packages/contextlib2.py", line 171, in _exit_wrapper return cm_exit(cm, exc_details) RuntimeError: maximum recursion depth exceeded
My curiosity is satiated for now, but let me know if you could use any more details or would like me to try and reproduce the error again.