ravin353 / android-daisy-epub-reader

Automatically exported from code.google.com/p/android-daisy-epub-reader

Accented characters displayed with question mark characters for DAISY content #25

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Open a DAISY 2.02 audiobook that contains accented characters (encoded 
using the ISO 8859-1 or Windows-1252 XML file encodings), e.g. Norwegian 
Library books.

What is the expected output? 
Accented characters, e.g. æ, ø, å, should be displayed correctly.

What do you see instead?
A diamond containing a question-mark character in place of each accented character.

Original issue reported on code.google.com by julianharty on 26 Dec 2010 at 8:08

GoogleCodeExporter commented 8 years ago
The fix took between 5 and 10 hours, as I chased my tail trying first to 
understand what the problem was and then to work out how to convince the SAX 
parser to apply the encoding. It transpired that the way we called the SAX 
parser was subtly different in DaisyParser.java (which didn't seem to apply 
the encoding) and SmilParser.java (which worked as desired). The hint (which I 
noticed in the final hour of debugging) was that ncc.html files in unsupported 
encodings (e.g. windows-1252) did NOT throw a SAX parser exception, while 
SMIL files with the same encoding did. Essentially I modified the code in 
DaisyParser (and the code that calls parseNccContents(...)) to match the way 
we create the InputSource in SmilParser.java. That did the trick.

The code is still ugly and needs cleaning up.

Original comment by julianharty on 26 Dec 2010 at 8:18
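
For readers who land here with a similar symptom, the following is a minimal sketch of the general technique the comment describes: handing the SAX parser an InputSource built on the raw byte stream, with the encoding set explicitly, so that the parser (rather than an earlier Reader) decodes the bytes. It is not the project's actual code. The class name EncodingSketch, the helper parseText, and the TextCollector handler are hypothetical names made up for illustration, and the real DaisyParser.java and SmilParser.java may differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not taken from the project's source.
final class EncodingSketch {

    /** Collects all character data reported by the SAX parser. */
    static final class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /**
     * Parses XML from raw bytes. The InputSource wraps the byte stream
     * (not a Reader), and the encoding is set explicitly on it, so the
     * parser itself decodes the bytes using that encoding.
     */
    static String parseText(byte[] xmlBytes, String encoding) {
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        source.setEncoding(encoding);
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML as " + encoding, e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        // "\u00e6\u00f8\u00e5" is the three letters from the report, written as escapes.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<h1>\u00e6\u00f8\u00e5</h1>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(bytes, "ISO-8859-1"));
    }
}
```

One pattern that can produce replacement characters, and which this sketch avoids, is decoding the file through a Reader that uses the platform default charset before the parser ever sees the bytes. Whether that was the exact cause here is not stated in the comment above.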