Drekin / win-unicode-console

A Python package to enable Unicode support when running Python from Windows console.
MIT License

Cannot handle the output from a Python built-in library (when using argparse) #30

Closed eromoe closed 8 years ago

eromoe commented 8 years ago

I have the code below:

from __future__ import unicode_literals, absolute_import

......

def main():
    global ROOT_PATH
    parser = argparse.ArgumentParser()
    parser.add_argument('-p','--path', default='__here__')
    print 1
    args = parser.parse_args()
    print 2
    p = unicode(args.path, 'gbk')
    if p == '__here__':
        ROOT_PATH = os.getcwd()
    else:
        ROOT_PATH = p

    for sf in get_sub_folders(ROOT_PATH):
        process_folder(sf)

My Windows 7 cmd encoding is GBK (a Chinese encoding).

Chinese works fine:

E:\[Sync]\project\auto_shift>python exe.py -p E:\[Sync]\【垃圾箱】
1
2

But Japanese goes wrong:

E:\[Sync]\project\auto_shift>python exe.py -p E:\同人本\シュート・ザ・ムーン (フエタキシ)
1
usage: exe.py [-h] [-p PATH]
Traceback (most recent call last):
  File "exe.py", line 196, in <module>
    main()
  File "exe.py", line 183, in main
    args = parser.parse_args()
  File "D:\Python27\lib\argparse.py", line 1704, in parse_args
    self.error(msg % ' '.join(argv))
  File "D:\Python27\lib\argparse.py", line 2374, in error
    self.exit(2, _('%s: error: %s\n') % (self.prog, message))
  File "D:\Python27\lib\argparse.py", line 2361, in exit
    self._print_message(message, _sys.stderr)
  File "D:\Python27\lib\argparse.py", line 2354, in _print_message
    file.write(message)
  File "D:\Python27\lib\site-packages\win_unicode_console-0.4-py2.7.egg\win_unicode_console\streams.py", line 217, in write
    s = s.decode(self.encoding)
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 40: invalid start byte
eromoe commented 8 years ago

I think I found the problem:

After I removed win_unicode_console.enable() from usercustomize.py, argparse printed its error normally:

E:\[Sync]\project\auto_shift>python exe.py -p E:\同人本\シュート・ザ・ムーン (フエタキシ)
1
usage: exe.py [-h] [-p PATH]
exe.py: error: unrecognized arguments: (フエタキシ) 

argparse reports an error here, and with win_unicode_console enabled that error message cannot be decoded.

Then I restored usercustomize.py and used the correct input (with the path quoted):

E:\[Sync]\project\auto_shift>python exe.py -p "E:\同人本\シュート・ザ・ムーン (フエタキシ)"
1
2

It seems win-unicode-console cannot handle the output from Python's built-in libraries?
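
For reference, the decode failure in the traceback can be reproduced in isolation. This is only a sketch, written for Python 3, and it assumes two things that the traceback suggests but does not prove: that the argparse message reached the stream as GBK-encoded bytes, and that the stream tried to decode those bytes as UTF-8.

```python
# -*- coding: utf-8 -*-
# Assumption: the error text argparse built contained the raw GBK bytes
# taken from sys.argv, so we model it by encoding the message as GBK.
message = u"exe.py: error: unrecognized arguments: (フエタキシ)\n"
raw = message.encode("gbk")

# Assumption: the console stream decodes incoming byte strings as UTF-8.
try:
    raw.decode("utf-8")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc

# The same bytes decode cleanly with the encoding they were written in.
roundtrip = raw.decode("gbk")

print(failure)
print(roundtrip == message)
```

The katakana characters are encoded by GBK with the lead byte 0xa5, which is not a valid UTF-8 start byte, so the failure comes from an encoding mismatch rather than from argparse itself.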

Drekin commented 8 years ago

Generally, there may be a problem with passing Unicode arguments to a Python 2 script; the bytes in sys.argv may not even be a faithful representation of the original string. See #20. Maybe I'll add code like that to win_unicode_console.
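
The contents of #20 are not shown in this thread, so the following is not that code. It is a hedged sketch of one commonly used approach to the same problem: re-reading the original command line through the Windows API functions GetCommandLineW and CommandLineToArgvW via ctypes. The helper name get_unicode_argv is hypothetical, and the trimming step assumes the script was started as `python script.py ...` (it would not be right for `python -c` or `python -m`).

```python
import sys


def get_unicode_argv():
    """Return the script's arguments as text, re-read from the Windows command line."""
    if sys.platform != "win32":
        # Fallback so the sketch can run anywhere; the problem is Windows-specific.
        return list(sys.argv)

    import ctypes
    from ctypes import wintypes

    GetCommandLineW = ctypes.windll.kernel32.GetCommandLineW
    GetCommandLineW.argtypes = []
    GetCommandLineW.restype = wintypes.LPCWSTR

    CommandLineToArgvW = ctypes.windll.shell32.CommandLineToArgvW
    CommandLineToArgvW.argtypes = [wintypes.LPCWSTR, ctypes.POINTER(ctypes.c_int)]
    CommandLineToArgvW.restype = ctypes.POINTER(wintypes.LPWSTR)

    argc = ctypes.c_int(0)
    argv = CommandLineToArgvW(GetCommandLineW(), ctypes.byref(argc))
    if not argv:
        # The API call failed; keep whatever Python already provided.
        return list(sys.argv)

    # Assumption: the last len(sys.argv) entries are the script and its
    # arguments; anything before them is the interpreter and its options.
    # The array returned by the API is not freed in this short sketch.
    start = max(argc.value - len(sys.argv), 0)
    return [argv[i] for i in range(start, argc.value)]


unicode_argv = get_unicode_argv()
```

On Python 3, sys.argv already holds text, so a helper like this only matters for Python 2 scripts such as the one in this issue.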

anthrotype commented 8 years ago

Yeah, that would be great!

eromoe commented 8 years ago

+1

Drekin commented 8 years ago

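There is some progress with #20. But for your particular code, you can convert sys.argv to Unicode before parsing rather than after:

# Decode the arguments before parsing; skip sys.argv[0] (the script name),
# which parse_args() would otherwise treat as an argument.
argv = [arg if isinstance(arg, unicode) else unicode(arg, "gbk") for arg in sys.argv[1:]]
args = parser.parse_args(argv)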
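
The suggestion above can be sketched as a small self-contained example. It is written so that it also runs on Python 3, where byte-string arguments are simulated with GBK-encoded bytes; the helper names decode_argv and parse are hypothetical, and the GBK default mirrors the console encoding reported in this issue. Only the arguments after the script name are passed to parse_args, since that is what argparse expects when it is given an explicit list.

```python
# -*- coding: utf-8 -*-
import argparse


def decode_argv(argv, encoding="gbk"):
    """Decode byte-string arguments to text; items that are already text pass through."""
    return [a.decode(encoding) if isinstance(a, bytes) else a for a in argv]


def parse(argv):
    """Parse an explicit argument list (script name excluded) after decoding it."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--path", default="__here__")
    return parser.parse_args(decode_argv(argv))


# Simulated Python 2 style arguments: raw GBK bytes as the console would supply them.
raw_argv = [b"-p", u"E:\\[Sync]\\【垃圾箱】".encode("gbk")]
args = parse(raw_argv)
default_args = parse([])
```

With the arguments decoded up front, args.path is already text, so a later unicode(args.path, 'gbk') call like the one in the original script is no longer needed.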