Open garyvdm opened 10 years ago
This is what msysgit does: https://github.com/msysgit/git/commit/19d1e75d58d772329372d453ead964c813bbc6b6
@garyvdm Has this been resolved?
No, not yet.
I also encountered this problem. see https://github.com/FriendCode/gittle/issues/72
UnicodeDecodeError when the filename is "article/python2编码问题.md" or otherwise contains non-ASCII characters
dulwich/index.py(423) build_index_from_tree()
-> full_path = os.path.join(prefix, entry.path)
(Pdb) pp prefix
u'E:/work/py/kkblog/article_repo/\u54c8\u54c8\\guyskk\\webhooks_test'
(Pdb) pp entry.path
'article/python2\xe7\xbc\x96\xe7\xa0\x81\xe9\x97\xae\xe9\xa2\x98.md'
(Pdb) os.path.join(prefix, entry.path)
*** UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 16: ordinal not in range(128)
My script:
# coding:utf-8
def pull_or_clone(dest, repo_url):
    from giturlparse import parse
    from gittle import Gittle
    import os
    p = parse(repo_url)
    user_repo_path = os.path.join(dest, p.owner, p.repo)
    if os.path.exists(user_repo_path):
        repo = Gittle(user_repo_path, origin_uri=repo_url)
        repo.pull()
    else:
        repo = Gittle.clone(repo_url, user_repo_path)

if __name__ == '__main__':
    dest = u"E:/work/py/kkblog/article_repo/哈哈"
    repo_url = u"https://github.com/guyskk/webhooks_test.git"
    pull_or_clone(dest, repo_url)
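The pdb session above shows a unicode prefix being joined with a byte-string entry path, which makes Python 2 try an implicit ASCII decode and fail on the first non-ASCII byte. A minimal sketch of one way around it, assuming the tree paths are UTF-8 encoded (as the bytes in the traceback appear to be); `join_tree_path` is a hypothetical helper, not part of Dulwich or Gittle:

```python
import os


def join_tree_path(prefix, tree_path, encoding="utf-8"):
    """Join a text prefix with a tree path that may be stored as bytes.

    Hypothetical helper for illustration; assumes tree paths are UTF-8.
    """
    if isinstance(tree_path, bytes):
        # Decode explicitly instead of relying on the implicit ASCII decode.
        tree_path = tree_path.decode(encoding)
    return os.path.join(prefix, tree_path)


# Values taken from the pdb session above (prefix shortened).
prefix = u"E:/work/py/kkblog/article_repo/\u54c8\u54c8"
entry_path = b"article/python2\xe7\xbc\x96\xe7\xa0\x81\xe9\x97\xae\xe9\xa2\x98.md"
full_path = join_tree_path(prefix, entry_path)
```

Decoding at the boundary keeps everything that reaches `os.path.join` as text, so the result no longer depends on the default codec.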
It would be great if somebody could verify whether this still happens with Dulwich 0.20.3. The test suite now passes on Windows, so if it still happens we can probably add a test and a fix for it.
Steps to reproduce:
expected: 1 file named À (which is u'\u00c0')
actual: the file is named Ã€ (which is u'\u00c3\u20ac')

The file name is what you get if you do u'\u00c0'.encode('utf8').decode('mbcs'). mbcs is the default filesystem character encoding used on Windows.

The git client handles this correctly. I'll take a look at their source code in the future to try to figure out how they handle this.
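The mis-encoding described above can be reproduced off Windows as a rough sketch. The `mbcs` codec only exists on Windows, so this uses `cp1252` as a stand-in, which is an assumption that the machine's ANSI code page is Western European:

```python
# The name stored in the repository, as text.
expected = u"\u00c0"  # À

# UTF-8 bytes misread as cp1252 (stand-in for 'mbcs', an assumption).
actual = expected.encode("utf-8").decode("cp1252")

# Reversing the same two steps recovers the intended name.
repaired = actual.encode("cp1252").decode("utf-8")
```

Here `actual` comes out as the two-character name u'\u00c3\u20ac' reported in the issue, which suggests the UTF-8 bytes from the tree are being passed to the filesystem API without being decoded first.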