hunkim / translation_coverage

Automatically check the rates between alpha VS other (unicode)

Output test #14

Closed hunkim closed 8 years ago

hunkim commented 8 years ago

@mingrammer it seems the output test fails on python3.x due to some differences in the text handling? Can you look into it?

mingrammer commented 8 years ago

Ok, I'll check it

mingrammer commented 8 years ago

@hunkim I found the problem. Yes, it's a text handling difference between the two versions. In Python 3, 'str' is a Unicode string, so the 'len' function counts actual characters (code points). In Python 2, 'str' is a byte string, so 'len' counts bytes. For example, '한' counts as 1 in Python 3, but as 3 in Python 2 (its UTF-8 encoding is 3 bytes).

Yes, the Python 3 behavior is what we really want! So you should fix sample.md, which has the wrong counts.

And the solution is this:

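s = myfile.read()  # returns a 'str' value; on Python 2 that is a byte string

if isinstance(s, bytes):  # on Python 2, str is bytes, so this branch runs
    s = s.decode('utf8')  # convert to unicode so len() counts characters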

Note: in Python 3, str != bytes; in Python 2, str == bytes.
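To make the counting difference concrete, here is a minimal sketch (not code from this repository). The helper name `to_text` is hypothetical, and the sketch assumes the input is UTF-8 encoded, as in the comment above.

```python
# -*- coding: utf-8 -*-
# Sketch only: `to_text` is a hypothetical helper, not part of translation_coverage.

text = u'한'                     # one Unicode character
encoded = text.encode('utf-8')   # the same character as UTF-8 bytes

char_count = len(text)           # counts code points
byte_count = len(encoded)        # counts bytes; this is what Python 2's str reports


def to_text(data, encoding='utf-8'):
    """Return data as Unicode text, decoding it first if it is a byte string."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


normalized = to_text(encoded)    # text again, so len() counts characters
```

Calling `to_text` on whatever `myfile.read()` returns gives the same character counts on both Python versions, as long as the file really is UTF-8.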

mingrammer commented 8 years ago

Why .. ?

hunkim commented 8 years ago

I see. It's complicated. I just made this test a bit weak for now.

mingrammer commented 8 years ago

Can I fix that?

hunkim commented 8 years ago

Why? s.decode('utf8') may throw exceptions, right?
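For reference, a small sketch of what that failure looks like and two possible ways to handle it. The names `decode_or_none`, `valid`, and `invalid` are made up for illustration; whether the project should skip or repair badly encoded input is an open choice, not something decided in this thread.

```python
# -*- coding: utf-8 -*-
# Sketch only: names here are hypothetical, not taken from the repository.

def decode_or_none(data, encoding='utf-8'):
    """Decode a byte string, returning None if the bytes are not valid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


valid = u'한'.encode('utf-8')
invalid = b'\xff'                # 0xFF can never appear in valid UTF-8

strict_ok = decode_or_none(valid)
strict_bad = decode_or_none(invalid)

# Alternative: never raise, substitute U+FFFD for undecodable bytes.
lossy = invalid.decode('utf-8', errors='replace')
```

The strict version keeps the counts honest by refusing bad input, while `errors='replace'` keeps going at the cost of counting replacement characters.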

hunkim commented 8 years ago

Can I fix that? Please! I have merged this to make it easy for you to fix.

mingrammer commented 8 years ago

Ye, I'll fix it.

hunkim commented 8 years ago

Thanks. On Wed, 12 Oct 2016 at 12:44 PM, ming notifications@github.com wrote:

Ye, I'll fix it.
