Open Ellerbrok opened 8 months ago
I also tried the most recent Mercurial version:
** Mercurial version (6.5.1). TortoiseHg version (6.5.1)
** Command:
** CWD: C:\Program Files\TortoiseHg
** Encoding: cp1252
** Extensions loaded: mercurial_keyring unknown, rebase, strip, tortoisehg.util.configitems, win32lfn
** Python version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
** Windows version: sys.getwindowsversion(major=6, minor=2, build=9200, platform=2, service_pack='')
** Processor architecture: x64
** Qt-5.15.2 PyQt-5.15.7 QScintilla-2.13.3
Traceback (most recent call last):
File "tortoisehg\hgqt\cmdui.pyc", line 649, in runCommand
File "tortoisehg\hgqt\update.pyc", line 398, in runCommand
File "tortoisehg\hgqt\update.pyc", line 342, in isclean
File "mercurial\context.pyc", line 1460, in modified
File "mercurial\util.pyc", line 1760, in __get__
File "mercurial\context.pyc", line 1425, in _status
File "mercurial\localrepo.pyc", line 3408, in status
File "mercurial\context.pyc", line 432, in status
File "mercurial\context.pyc", line 2001, in _buildstatus
File "mercurial\context.pyc", line 1906, in _dirstatestatus
File "mercurial\dirstate.pyc", line 1681, in status
File "mercurial\dirstate.pyc", line 1505, in walk
File "mercurial\windows.pyc", line 599, in statfiles
File "C:/Program Files/TortoiseHg/win32lfn.py", line 116, in fn
path = stringtobytes(uncabspath(args[0]))
File "C:/Program Files/TortoiseHg/win32lfn.py", line 97, in uncabspath
path = bytestostring(path)
File "C:/Program Files/TortoiseHg/win32lfn.py", line 377, in bytestostring
string = string.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdc in position 43: invalid continuation byte
Hmm. I guess something isn't encoded as utf-8 in your repo that was in mine 😕 I wonder if it could be related to being on Windows 8 🤔 What happens if you change line 377 from string = string.decode('utf-8') to string = string.decode('latin-1')? It's been a while since I worked on this and I never fully understood it to begin with, so I can't say whether that's likely to work, but it's worth a shot.
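To illustrate what that one-line change does, here is a minimal sketch. The path is hypothetical (the real file name never appears in the log); it only assumes a single 0xDC byte followed by ASCII, which is what the "invalid continuation byte" message suggests.

```python
# Hypothetical path bytes; the actual name in the repository is unknown.
raw = b"C:\\repo\\\xdcbersicht.txt"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    # 0xDC starts a two-byte UTF-8 sequence, but the next byte is plain ASCII.
    utf8_ok = False
    utf8_error = exc.reason

# latin-1 maps every byte to a code point, so this call cannot raise.
latin1_text = raw.decode("latin-1")
print(utf8_ok, latin1_text)
```

Note that latin-1 never fails, so it would also silently mis-decode names that really are UTF-8. That is why a UTF-8-first fallback is probably safer than replacing the codec outright.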
If that works, or if that at least changes the error, we might need to change that part to try decoding as utf-8 and if that fails decode as something else. Or maybe Python has a way to properly detect the encoding of a string, if such a thing is possible. I'm not actually sure what data is being passed to that function, so it's a bit tricky to know what it should be doing exactly.
Or, if you have a Windows 10 box, it might be worth testing whether your repo and this extension work there. My suspicion is that Windows 10 may be handling things as Unicode where Windows 8 still returned directory listings in older encodings, or something along those lines.
Hi, in fact this is Windows 11. Maybe something in the repository is UTF-16?
I found something that might help, but I have not tested it in the .py file because I have no experience with Python.
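    import os

    def force_decode(string, codecs=['utf8', 'cp1252', 'latin-1', 'utf16']):
        # Try each codec in turn and return the first successful decode.
        for i in codecs:
            try:
                return string.decode(i)
            except UnicodeDecodeError:
                pass

    # rootPath is the directory to list; it is defined elsewhere.
    for item in os.listdir(rootPath):
        if isinstance(item, bytes):
            item = force_decode(item)
        print(item)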
How strange! The log is reporting Windows 8 (version 6.2 is Windows 8, as is build 9200).
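One possible explanation, offered as an assumption rather than a diagnosis: the Python documentation for sys.getwindowsversion() notes that major, minor and build can reflect the version being emulated for the process, while its platform_version attribute reports the actual operating system. The helper below is a hypothetical sketch that maps a version triple to a desktop release name, so the logged values can be compared with the real ones.

```python
def windows_name(major, minor, build):
    """Map a (major, minor, build) triple to a desktop release name.

    Hypothetical helper for illustration; it covers only a few releases.
    """
    if (major, minor) == (6, 2):
        return "Windows 8"
    if (major, minor) == (6, 3):
        return "Windows 8.1"
    if (major, minor) == (10, 0):
        # Windows 11 kept version 10.0; its builds start at 22000.
        return "Windows 11" if build >= 22000 else "Windows 10"
    return "unknown"

# The triple from the bug report above.
reported = windows_name(6, 2, 9200)
print(reported)
```

On the affected machine, passing sys.getwindowsversion().platform_version to the same helper should show whether the log is reporting an emulated version.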
Ok, try changing this bit at line 375:
    def bytestostring(string):
        if isinstance(string, bytes):
            string = string.decode('utf-8')
        return string
...to this:
    def bytestostring(string):
        if isinstance(string, bytes):
            string = force_decode(string)
        return string

    def force_decode(string, codecs=['utf8', 'cp1252', 'latin-1', 'utf16']):
        for i in codecs:
            try:
                return string.decode(i)
            except UnicodeDecodeError:
                pass
...and see if that helps. I have no real experience in Python except for the occasional Blender script so I can't say if this is right, but I think it should work.
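One caveat about the codec order, shown as a sketch rather than a tested patch for win32lfn: latin-1 accepts every byte sequence, so any codec listed after it (here 'utf16') is never reached, and cp1252 rejects only the five byte values it leaves undefined. The function name decode_with_fallback is hypothetical; it also returns the codec that succeeded so the behaviour is easy to inspect.

```python
def decode_with_fallback(data, codecs=("utf-8", "cp1252", "latin-1")):
    """Return (text, codec) for the first codec that decodes data strictly."""
    for codec in codecs:
        try:
            return data.decode(codec), codec
        except UnicodeDecodeError:
            continue
    # Unreachable while latin-1 is in the list, but kept for other codec lists.
    raise ValueError("none of the codecs could decode the data")

results = [
    decode_with_fallback(b"plain.txt"),             # ASCII is valid UTF-8
    decode_with_fallback("Ü.txt".encode("utf-8")),  # genuine UTF-8 name
    decode_with_fallback(b"\xdc.txt"),              # the byte from the traceback
    decode_with_fallback(b"\x81.txt"),              # undefined in cp1252
]
for text, codec in results:
    print(codec, repr(text))
```

Trying UTF-8 first keeps correctly encoded names intact, and only names that fail strict UTF-8 decoding fall through to the legacy code pages.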
Hi there,
there seems to be an issue with utf-8 in here. After installing the extension to Mercurial, I get the following error message when I try to "Update" to a newer revision of my repository.