Podshot / MCEdit-Unified

Combined MCEdit & Pymclevel repository.
http://podshot.github.io/MCEdit-Unified/
ISC License

Latest MCEdit crashes when finishing a selection in non-ASCII languages #418

Closed fhfuih closed 9 years ago

fhfuih commented 9 years ago

"Latest version" includes the released 1.3.3.0 and the unreleased version run in Python; 1.3.2.0 and older don't have this bug. Only tested on Windows.

In non-ASCII languages, specifically Chinese and Korean, MCEdit crashes the moment I finish a selection, that is, when I click the second selection corner or release the mouse after dragging out a selection. Error log in the console: 1

Error log in mcedit.log: https://gist.github.com/fhfuih/985f84514794f837d818

P.S. I also get some lines that appear even when running in English. Maybe they're about another problem. console: 2 mcedit.log: included in the link above.

codewarrior0 commented 9 years ago

_() should not be allowed to return a str type. Only unicode.

When you use string.format(), it will always return a string of the same type - "foo{0}".format(u"bar") always results in a str, while u"foo{0}".format("bar") always results in a unicode. In the first case, it will implicitly encode the argument using ASCII, and this is where this error is coming from.

On line 31, _() is called several times - once for the format string, and a few more times for the arguments to format into it. The format string is being returned as a str while one of the arguments is a unicode containing non-ASCII characters, which triggers the implicit encoding in string.format() and raises the error.
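The failing step can be sketched explicitly. This is only an illustration, not code from MCEdit: the translated value and the helper name implicit_ascii_encode are hypothetical, and the helper spells out by hand the ASCII encoding that Python 2's str.format() applies implicitly to a unicode argument.

```python
# -*- coding: utf-8 -*-

# Hypothetical translated argument: a unicode value with non-ASCII characters.
translated = u"\u9009\u62e9"


def implicit_ascii_encode(value):
    """Spell out the encoding Python 2 applies when a str format string
    receives a unicode argument (assumption: default codec is ASCII)."""
    return value.encode("ascii")


try:
    implicit_ascii_encode(translated)
    failed = False
except UnicodeEncodeError:
    # The same error type that appears in the reported crash.
    failed = True

# With a unicode format string no encoding step is needed, so this succeeds.
message = u"Selection: {0}".format(translated)
```

The fix that follows from this is to make sure the format string itself is unicode before format() is called on it.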

naor2013 commented 9 years ago

Will be fixed for the next release. Thanks for the report, and thanks for the info, CodeWarrior.

codewarrior0 commented 9 years ago

No problem.

I still think _() should be changed further to strictly return only unicode types. Whenever it returns string, it's returning a str type which is the type of any string literals in the source code where _() was called. (That is, unless you do from __future__ import unicode_literals...)

Also, I'm curious why you didn't just use the gettext library for translations?

LaChal commented 9 years ago

This will be investigated further.

As for from __future__ import unicode_literals, it also caused some problems.

The gettext library uses a compiled resource format (.mo), which can't be edited directly. The editable, more complex .po resources have to be compiled with the gettext binaries before they can be tested. Moreover, gettext has difficulty handling formatted strings. Also, some translatable strings in MCEdit-Unified aren't wrapped in the _() function in the source. This avoids having to call the function on every string we want to translate in the program. Instead, the GUI (albow) sends them to _() as variables. The gettext resource building tool will catch _("Some string"), but not _(some_string). Some strings are sent to _() outside the GUI, like the formatted ones.
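The literal-versus-variable limitation can be illustrated with a toy scanner. This is not how the gettext tools are implemented; it is a small sketch using Python 3's ast module (3.8 or later), and extract_literal_messages and SOURCE are hypothetical names.

```python
import ast

# Hypothetical source: one literal call to _() and one call with a variable.
SOURCE = '''
label = "Some string"
a = _("Some string")
b = _(label)
'''


def extract_literal_messages(source):
    """Collect the first argument of every _() call that is a string literal."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "_"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append(node.args[0].value)
    return found


messages = extract_literal_messages(SOURCE)
```

Only the literal call is collected. The _(label) call is skipped because a static scan cannot know what the variable will hold at run time.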