schacon / hg-git

mercurial to git bridge, pushed to directly from the hg-git plugin in Hg
GNU General Public License v2.0

UnicodeDecodeError exception in clone command #331

Open johannesgajdosik opened 5 years ago

johannesgajdosik commented 5 years ago

I am trying to clone a large Git repository with hg-git. After running for 28 hours, the clone fails with a UnicodeDecodeError. I would like the clone command to finish successfully. In my opinion, cloning should not be impossible just because some file in the repo contains a non-ASCII character. There is not even a hint as to which file contains the non-ASCII character.

I am using hg-git 0.8.11 and Mercurial 4.5.2 on Gentoo Linux.

Please have a look at this and tell me how to proceed. Thanks in advance!

gaj@gajdosik /sda3 $ time hg clone git+ssh://xxxxx.xxx.xxx.xx:22/xxx/DefaultCollection/_git/XXXXXX
destination directory: XXXXXX
importing git objects into hg
unknown exception encountered, please report by visiting
https://mercurial-scm.org/wiki/BugTracker
Python 2.7.14 (default, Jun 8 2018, 19:03:03) [GCC 6.4.0]
Mercurial Distributed SCM (version 4.5.2)
Extensions loaded: hgk, hggit
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/hg", line 41, in <module>
    dispatch.run()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 88, in run
    status = (dispatch(req) or 0) & 255
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 183, in dispatch
    ret = _runcatch(req)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 324, in _runcatch
    return _callcatch(ui, _runcatchfunc)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 332, in _callcatch
    return scmutil.callcatch(ui, func)
  File "/usr/lib64/python2.7/site-packages/mercurial/scmutil.py", line 154, in callcatch
    return func()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 314, in _runcatchfunc
    return _dispatch(req)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 918, in _dispatch
    cmdpats, cmdoptions)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 673, in runcommand
    ret = _runcommand(ui, options, cmd, d)
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 926, in _runcommand
    return cmdfunc()
  File "/usr/lib64/python2.7/site-packages/mercurial/dispatch.py", line 915, in <lambda>
    d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
  File "/usr/lib64/python2.7/site-packages/mercurial/util.py", line 1195, in check
    return func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/mercurial/commands.py", line 1449, in clone
    shareopts=opts.get('shareopts'))
  File "/usr/lib64/python2.7/site-packages/mercurial/hg.py", line 661, in clone
    streamclonerequested=stream)
  File "/usr/lib64/python2.7/site-packages/hggit/util.py", line 56, in inner
    return f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/hggit/__init__.py", line 354, in exchangepull
    pullop.cgresult = repo.githandler.fetch(remote.path, heads)
  File "/usr/lib64/python2.7/site-packages/hggit/git_handler.py", line 300, in fetch
    self.update_remote_branches(remote_name, result.refs)
  File "/usr/lib64/python2.7/site-packages/hggit/git_handler.py", line 1467, in update_remote_branches
    self.git.refs[ref_name] = sha
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 290, in __setitem__
    self.set_if_equals(name, None, ref)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 625, in set_if_equals
    realnames, _ = self.follow(name)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 226, in follow
    contents = self.read_ref(refname)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 200, in read_ref
    contents = self.read_loose_ref(refname)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 556, in read_loose_ref
    filename = self.refpath(name)
  File "/usr/lib64/python2.7/site-packages/dulwich/refs.py", line 482, in refpath
    name = name.decode(sys.getfilesystemencoding())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 20: ordinal not in range(128)

real 1731m21.487s
user 1043m52.445s
sys 91m17.568s
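
For readers following the traceback: the last frame decodes a ref name (a byte string) with whatever `sys.getfilesystemencoding()` reports, and the error message shows that this was ASCII. The following is a minimal sketch of that one step in isolation, not hg-git or dulwich code. The ref name is made up (the real one does not appear in the report) and was chosen only so that the offending byte `0xc3` sits at position 20, as in the error message; `try_decode` is a hypothetical helper.

```python
# Hypothetical ref name; the real one is not shown in the report.
# "\xc3\xa4" is the UTF-8 encoding of the character "ä".
ref_name = b"refs/remotes/origin/\xc3\xa4nderung"


def try_decode(name, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return name.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding as ASCII fails on the first byte >= 0x80.
ascii_text, ascii_error = try_decode(ref_name, "ascii")

# Decoding the same bytes as UTF-8 succeeds.
utf8_text, utf8_error = try_decode(ref_name, "utf-8")

print(ascii_error)
print(utf8_text)
```

The byte `0xc3` is a common lead byte of two-byte UTF-8 sequences, which is consistent with a UTF-8 encoded ref name being decoded with an ASCII codec.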

tamsky commented 5 years ago

I've hit this error when tags or named refs contain Unicode. One fix, which avoids cloning any tags or refs, would be to:

which should let you succeed in cloning.

Good luck, hopefully this helps.

johannesgajdosik commented 5 years ago

Thanks for your reply and suggestion! The repository is huge and constantly used by many coworkers. My plan was not only to clone the repo with hg-git but to actually work with it every day. Removing the git tags and refs all over again each time I fetch or pull is not an option. Can you instead point me to some piece of code that I can tweak so that the cloning can continue despite Unicode tags and refs? Thanks in advance!
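
As an illustration of the kind of tweak being asked about, here is a rough sketch of a tolerant decode step. It is an assumption, not the actual dulwich or hg-git implementation and not a tested patch: `decode_ref_name` and its `encodings` parameter are hypothetical names. The idea is to try the filesystem encoding first, fall back to UTF-8, and only then decode lossily so that the operation does not abort.

```python
import sys


def decode_ref_name(name, encodings=None):
    """Decode a ref name given as bytes, trying several encodings in order.

    Hypothetical helper: falls back to a lossy UTF-8 decode if every
    candidate encoding fails, instead of raising UnicodeDecodeError.
    """
    if encodings is None:
        # Filesystem encoding first (what the failing line used), then UTF-8.
        encodings = [sys.getfilesystemencoding() or "ascii", "utf-8"]
    for encoding in encodings:
        try:
            return name.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: undecodable bytes become U+FFFD replacement characters.
    return name.decode("utf-8", "replace")


# Made-up ref names, with ASCII forced as the first candidate encoding.
tag = decode_ref_name(b"refs/tags/v1.0-\xc3\xbc", ["ascii", "utf-8"])
broken = decode_ref_name(b"refs/tags/\xff", ["ascii", "utf-8"])
```

One caveat with this design: the lossy fallback can map two different byte strings to the same text, so distinct refs could collide. Since the error names the 'ascii' codec, it is also likely that the process was running under a non-UTF-8 locale; on Python 2 on Linux the value of `sys.getfilesystemencoding()` normally follows the locale, so checking the locale settings may be worth trying before patching any code.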