google-code-export / tovid

Automatically exported from code.google.com/p/tovid

RFE: proper support for i18n / accented characters #133

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,
using tovid in a European locale, I've been having problems using accented 
characters. Accented characters are basically those outside the 7-bit ASCII 
specification. In Germany, umlauts are quite commonly used: äÄöÖüÜß

This works from the command line:
> tovid todisc -files input.mpg -titles "This has Umläüts" -out DVD -dvd -pal -ffmpeg

tovid titlesets also accepts accented characters in its input fields, but
processing them fails along the way. On IRC, I had already proposed one or two
"unicode()" workarounds, which have found their way into SVN, but they don't
cover everything.

Solution:
- wrap relevant code sections with some kind of to_unicode()/to_utf8(), as 
outlined here: http://farmdev.com/talks/unicode/
- take similar code from yum
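
A minimal sketch of what such wrappers could look like. It is written for Python 3 purely for illustration; the names to_unicode() and to_utf8() follow the suggestion above, but the bodies are my assumption and are not taken from the attached patch or from yum:

```python
def to_unicode(value, encoding="utf-8", errors="replace"):
    """Return value as text (str), decoding bytes if necessary.

    Hypothetical helper: the default encoding is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    return str(value)


def to_utf8(value, errors="replace"):
    """Return value as UTF-8 encoded bytes.

    Bytes are passed through unchanged and assumed to be UTF-8 already.
    """
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8", errors)


# Round trip: text -> UTF-8 bytes -> text
title = "This has Umläüts"
assert to_unicode(to_utf8(title)) == title
```

The idea is to decode once where data enters the program and encode once where it leaves (for example when building a command line or writing a file), instead of converting at every failing call.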

Problems:
- You can't just "switch the app" to behave properly (except with Python 3,
where, according to the docs, it should "just work").
- You have to find affected code sections.

I have muddled my way through "tovid titlesets", using accented characters in 
titleset and thumbnail menus. I inserted appropriate conversions at every 
failing point. With the resulting attached patch, I got all the way through the 
wizard without failure and with the correct result on DVD.

I was using an en_US.UTF-8 locale. I don't actually know whether Python or X11
care about it, or what effect it has on the input fields in X (or tkinter). I
didn't try non-UTF-8 locales (such as latin1 or Japanese), though my guess is
that they should work transparently.

Here are the caveats of this patch I can think of:
- Other locales: See above.
- Test coverage: I have only gone through this scenario with two titlesets of 
two videos each.
- File names: I have not tried file names with accented characters yet. I have
no idea whether they are transparent to Python (they had better be, otherwise
it's asking for trouble!) or need special handling.
- There are probably function calls in other places which I haven't hit, which 
require the same wrapping. This needs code review!
- setup_locale() may actually be redundant or a no-op; omission of it is yet to 
be tested.
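
To judge the locale and file name caveats on a given system, it may help to see which encodings Python itself reports. The following is only a sketch under the assumption of Python 3; describe_encodings() is a hypothetical helper and does not reproduce setup_locale() from the patch:

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports for text, file names and stdout."""
    return {
        # Encoding Python prefers for text data according to the locale
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to convert between str file names and OS file names
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of the terminal or pipe on stdout (may be None)
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in sorted(describe_encodings().items()):
    print("%-10s %s" % (name, value))
```

Running this under a UTF-8 locale and again under a non-UTF-8 one should show whether the values differ, which is a starting point for testing the untested cases listed above.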

Please do give the attached patch (against tovid SVN r3301) a go and at least 
check whether it breaks anything for you. If not, it might be a candidate for 
inclusion despite the caveats. :-)

Apart from that, this should serve as inspiration for a proper implementation.
I have tried my best to follow the docs.

Thanks.
P.S.: I'm gibson/gibson_ on IRC's #tovid.

Original issue reported on code.google.com by mbst...@googlemail.com on 29 Sep 2010 at 3:13

Attachments: