tweecode / twine

UI for creating hypertext stories
http://twinery.org

Twine (and Twee) need to be URGENTLY transitioned to be unicode compliant #19

Open Philip-Sutton opened 12 years ago

Philip-Sutton commented 12 years ago

Chris Klimas says that Twine (and Twee?) have to be built with the ANSI version of wxPython. However the wxPython people say that the ANSI version of wxPython is being phased out. Twine (and Twee?) need to have their code upgraded to handle unicode. Chris Klimas says that building Twine (& Twee?) with unicode compliant wxPython caused problems in the past. A key strategic question: is it a good use of time to make the current code of Twine (& Twee) unicode compliant or would it be a better use of time to move to build Twine2 (with a substantially changed architecture)?

christopherliu commented 12 years ago

Some developer notes:

Twine can be run with a Unicode wxPython, but it then hits the problem described in https://github.com/tweecode/twine/issues/20. This means that someone with Python installed can run Twine with Unicode support, but it cannot be packaged into an .exe on Windows (there might be a similar packaging problem on Mac as well). Fixing #20 should help with #19.

factorypreset commented 12 years ago

good find -- yes, let's try to do this as soon as possible.

Stormrose commented 12 years ago

I'm currently running my Twine on Python 2.7 Unicode and that is what I'm using to produce the test .zip and installers that I leave in my dropbox. There seem to be few issues in the changeover - most issues appear to come from the version upgrade from 2.5 to 2.7 instead. We do need this thoroughly tested though.

Philip-Sutton commented 12 years ago

Emmanuel,

Can we close this issue? Has anybody else got my buildexe.py to produce a dist/win32 folder containing all the goodies (i.e. a runnable twine.exe)?

and

I'm currently running my Twine on Python 2.7 Unicode and that is what I'm using to produce the test .zip and installers that I leave in my dropbox. There seem to be few issues in the changeover - most issues appear to come from the version upgrade from 2.5 to 2.7 instead. We do need this thoroughly tested though.

I could switch to Python 2.7 Unicode and add the related wxPython and py2exe and then give the build a go? Is that the right thing to do?

And if I succeed then anyone could!

Philip


christopherliu commented 12 years ago

Stormrose: Is this the version of 1.3.5 you sent on April 14th to the CodeDev list? I downloaded it from Dropbox, and it doesn't appear to have support for Unicode source export. I'm guessing I missed something, which I'll check in the morning.

If you have a fork where py2exe works with Unicode on 2.7 that'd be fantastic for issue 20: https://github.com/tweecode/twine/issues/20

Stormrose commented 12 years ago

Ok. I guess I should say that my current builds are working with Unicode versions of Python and wxPython. Actually, Unicode support in the files themselves will need more investigation. Internally the debugger is telling me all my string data is unicode (not ansi), so there must be something else going on. My next question then: is Unicode file support a 1.3.6 issue or not?

--Et


Stormrose commented 12 years ago

Can somebody please send me an example .tws that demonstrates the Unicode problem? I'm writing characters in a few languages I know (and a few from Google Translate). Sure - the passage editor is a mess, but the export to .html sugarcane shows all my text exactly as I have written it. I guess I'll be looking at the passage editor to improve that ... in the meantime, a demo .tws?

christopherliu commented 12 years ago

I think any Unicode tws's you have will work, it's just a matter of triggering the bug.

Here's what I've gotten so far:

1) When I load up the project as pure Python in Eclipse on 2.6 or 2.7, everything works fine. .tws files with Unicode characters can be properly exported to source code.

2) It's also possible to run buildexe.py py2exe, using the libraries in requirements.txt and wxPython-unicode-2.8. That generates a Twine.exe file fine. However, and this is the weird part, that version now has the bug previously found in issue 16 with Unicode export.

Strangest thing, really. Perhaps we need to trace what libraries are being loaded in each version, or perhaps something is being cached in py2exe that is not visible.

Update: I added the Python version to the about dialog. I don't think py2exe is getting the wrong version of that, but perhaps we could start there.
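One way to start that comparison is to dump the same runtime facts from the source run and from the py2exe build and diff them. The following is only a rough sketch: `describe_runtime` is a hypothetical helper, not something in the Twine code base, and it assumes that `wx.PlatformInfo` lists `'unicode'` for Unicode builds of wxPython and that py2exe sets `sys.frozen` in the generated executable.

```python
import sys


def describe_runtime():
    """Collect facts that may differ between a source run and a frozen .exe.

    Hypothetical helper for illustration; not part of Twine.
    """
    info = {
        "python": sys.version.split()[0],
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        # py2exe sets sys.frozen in the executable; it is absent in a source run.
        "frozen": getattr(sys, "frozen", False),
    }
    try:
        import wx
        info["wx_version"] = wx.version()
        # Unicode builds of wxPython report 'unicode' in wx.PlatformInfo.
        info["wx_unicode"] = "unicode" in wx.PlatformInfo
    except ImportError:
        # wxPython is not installed in this interpreter.
        info["wx_version"] = None
        info["wx_unicode"] = None
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_runtime().items()):
        print("%s: %s" % (key, value))
```

Running this once from Eclipse and once from inside the built Twine.exe (for example from the about dialog) should show whether the two are really using the same Python, default encoding, and wxPython build.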

Stormrose commented 12 years ago

Would you be able to help me? Can you either send me a .tws file that gives the problem you describe, or see if you have the same problem when you run this test build?

http://dl.dropbox.com/u/3655732/TwineTestBuildExe_1.3.5.201204200030.zip

Thanks.


christopherliu commented 12 years ago

Hey Stormrose,

I'm having the same problem with that build as the others. I just checked in a test file that I can't export from my machine. Can you?

Stormrose commented 12 years ago

Thanks for your help with this issue.

I opened up your file with no problems, though it contained just one passage with two characters: open-double-quote and close-double-quote (\u201c\u201d). I have been successfully working with slightly longer phrases. The latest build exported this into sugarcane with no problems. The web browser displayed it correctly too, and the .html source code contained the two correct characters. This is correct behaviour.

Did we want the .html file to have &#x201c;&#x201d; notation instead (or &#decimal;)? Technically the HTML spec does not require this, and it makes cut'n'paste from HTML code difficult. But it might be more robust to do the &# encoding: it prevents a non-Unicode-aware text editor from crapping all over your special characters.
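For reference, Python's built-in `xmlcharrefreplace` error handler already produces the decimal form, so the &# encoding could be a very small change. This is a sketch only; `to_numeric_entities` is a made-up name and this is not how Twine's exporter currently works.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal character reference.

    Hypothetical helper: ASCII passes through untouched, so markup
    characters such as & and < are NOT escaped here.
    """
    encoded = text.encode("ascii", "xmlcharrefreplace")
    return encoded.decode("ascii")


# The two characters from the test file: U+201C and U+201D.
print(to_numeric_entities(u"\u201cHello\u201d"))  # prints &#8220;Hello&#8221;
```

The result is pure ASCII, so it survives editors that do not understand Unicode, at the cost of being harder to read and paste.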


christopherliu commented 12 years ago

I used File -> Export Source Code... in Twine (based on issue 16). The Sugarcane and HTML export work.

And yeah, the HTML export should eventually support entities. We can treat that as another issue.

Stormrose commented 12 years ago

Thanks - for some reason I wasn't aware of that feature :) Ha! So new to this program. I'll take a look.

--Et


Stormrose commented 12 years ago

My latest pull requests go some way towards solving Unicode issues. Code Import/Export works pretty well.

There are some outstanding issues with the PassageEditor when viewing text not supported by the current OS locale (windows-cp1251 for me), but it is non-destructive: you get placeholder characters and can safely edit around them without destroying those chars.

The Proofing copy (RTF) export works for cp1251. It seems RTF is not exactly the most internationalisation friendly format because it needs a codepage set to suit the target language. Currently that is cp1251 (hard coded) but we could make this an option for the story author and/or provide some "smart" default if there is a call.
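A "smart" default could be as simple as deriving the code page from the author's locale instead of hard coding it. The sketch below is an assumption about how that might look: `rtf_codepage` and `rtf_header` are hypothetical names, and falling back to 1252 when the locale encoding is not a Windows code page is an arbitrary choice made for illustration.

```python
import codecs
import locale


def rtf_codepage(encoding_name=None, fallback=1252):
    """Return a Windows code page number for an RTF \\ansicpg header.

    Hypothetical helper. The fallback of 1252 is an assumption.
    """
    if encoding_name is None:
        encoding_name = locale.getpreferredencoding() or ""
    try:
        # codecs.lookup normalises aliases, e.g. "windows-1251" -> "cp1251".
        name = codecs.lookup(encoding_name).name
    except LookupError:
        return fallback
    if name.startswith("cp") and name[2:].isdigit():
        return int(name[2:])
    return fallback


def rtf_header(encoding_name=None):
    """Build the opening of an RTF document for the chosen code page."""
    return "{\\rtf1\\ansi\\ansicpg%d" % rtf_codepage(encoding_name)


print(rtf_header("windows-1251"))  # prints {\rtf1\ansi\ansicpg1251
```

A story-level setting could then override the locale default by passing an explicit encoding name.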

Philip-Sutton commented 12 years ago

Currently that is cp1251 (hard coded) but we could make this an option for the story author and/or provide some "smart" default if there is a call.

Do we need a new call? I might have got the wrong end of the stick, but I think the general call for Unicode support / non-standard character support is the mandate we need?

It seems RTF is not exactly the most internationalisation friendly format because it needs a codepage set to suit the target language.

Given this, I think your suggestion that "we could make this an option for the story author and/or provide some "smart" default" seems like a really good way to go.

People sometimes mix character sets, for example writing with maths characters alongside Thai and English. Is it possible to activate several character sets at the same time?

Stormrose commented 12 years ago

We have the mandate but it becomes a matter of prioritising scarce time resource.

The RTF support for Unicode definitely does need more investigation. I only gave it a couple of hours before concluding hastily that it was impossible. I'm willing to revise this opinion if somebody can point me to an explanation of the RTF file format that specifically talks about Unicode support beyond a single code page. That is, a good start would be to tell me what to write instead of: {\rtf1\ansi\ansicpg1251

We can't be the only Python project facing this issue.
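One possible direction: RTF 1.5 and later define a `\uN` control word, where N is a signed 16-bit decimal UTF-16 code unit followed by a fallback character for readers that do not understand it (the number of fallback characters is set with `\ucN`). With that, the header can stay as it is and each non-ASCII character is escaped individually, which would also cover mixed scripts in one document. The following is a minimal sketch under those assumptions; `rtf_escape` is a hypothetical name, it uses `?` as the fallback, and it does not handle line breaks (`\par`).

```python
import struct


def rtf_escape(text):
    """Escape text for an RTF body using \\uN control words.

    Hypothetical helper: ASCII passes through, RTF special characters
    are backslash-escaped, everything else becomes \\uN? per UTF-16 unit.
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Encode to UTF-16 so characters beyond U+FFFF become surrogate pairs.
            data = ch.encode("utf-16-le")
            # "h" unpacks signed 16-bit values, so units above 32767 come out negative.
            for unit in struct.unpack("<%dh" % (len(data) // 2), data):
                out.append("\\u%d?" % unit)
    return "".join(out)


body = rtf_escape(u"\u201cHi\u201d")
print("{\\rtf1\\ansi\\ansicpg1251\\uc1 " + body + "}")
```

Whether every RTF reader that authors use for proofing honours `\u` would still need testing; older readers would show the `?` fallback instead.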