Closed hunkim closed 8 years ago
Ok, I'll check it
@hunkim I found the problem. Yes, it's a difference in text handling between the two versions. In Python 3, 'str' is Unicode text, so the 'len' function counts each character as one. In Python 2, 'str' holds bytes, so 'len' counts the byte length. For example, '한' counts as 1 in Python 3 but as 3 in Python 2, because its UTF-8 encoding is 3 bytes long.
Yes, the Python 3 behavior is what we really want! So you should modify sample.md, which has the wrong count numbers.
And the solution is this:
s = myfile.read()  # returns a 'str' value; on Python 2 that is bytes
if isinstance(s, bytes):
    s = s.decode('utf8')  # for Python 2, where str is bytes
Note that in Python 3, str != bytes; in Python 2, str == bytes.
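To make the counting difference concrete, here is a minimal sketch. The helper name `to_text` is hypothetical and not part of this repository; it only illustrates the "decode if bytes" idea from the snippet above.

```python
# -*- coding: utf-8 -*-
# Sketch only: shows why '한' is counted as 1 character or 3 bytes.

text = u"한"                  # one Unicode character (text type)
raw = text.encode("utf-8")    # the same character as UTF-8 bytes


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return text whether given bytes or text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


char_count = len(to_text(raw))  # length after decoding to text
byte_count = len(raw)           # length of the raw UTF-8 bytes
print(char_count, byte_count)   # 1 3
```

Another option that may be worth considering is opening the file with `io.open(path, encoding='utf-8')`, which returns decoded text on both Python 2 and Python 3, so no type check is needed afterwards.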
Why .. ?
I see. It's complicated. I just made this test a bit weak for now.
Can I fix that?
Why? s.decode('utf8') may throw exceptions, right?
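On the exception question: `decode` does raise `UnicodeDecodeError` when the bytes are not valid UTF-8. One possible way to guard against that, as a sketch only (the function name `decode_or_replace` is made up for illustration, and falling back to replacement characters is an assumption about what the tool would want):

```python
# -*- coding: utf-8 -*-
# Sketch only: strict decoding first, lenient decoding as a fallback.


def decode_or_replace(raw, encoding="utf-8"):
    """Hypothetical helper: decode bytes, tolerating invalid input."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Invalid sequences become U+FFFD instead of raising.
        return raw.decode(encoding, errors="replace")


good = u"한".encode("utf-8")   # valid 3-byte sequence
bad = good[:2]                 # truncated, so no longer valid UTF-8

ok_result = decode_or_replace(good)
bad_result = decode_or_replace(bad)
```

With the fallback, a damaged file still produces a count, though the replacement characters mean the count may not match the intended text.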
Can I fix that? Please! I have merged this to make it easy for you to fix.
Ye, I'll fix it.
Thanks.
@mingrammer it seems the output test fails on python3.x due to some differences in the text handling? Can you look into it?