soundasleep / iaml

Automatically exported from code.google.com/p/iaml

htmltext Patch: Special characters prepended with A-Grave character #283

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
FIX: HTML Special Characters (those with a leading ampersand) and being 
prepended with the A-Grave character. This fix removes all A-Grave characters 
that are directly in front of another HTML special character. 
Other suggestion: add the table and tr tags to the "add newline after" switch, 
and add the table tag to the "add newline before" switch. 

Original issue reported on code.google.com by seth.amm...@sendgrid.com on 15 Jan 2012 at 6:31

Attachments:

GoogleCodeExporter commented 9 years ago
// typo: ..."and being prepended"... => ..."are being prepended"...

Original comment by seth.amm...@sendgrid.com on 15 Jan 2012 at 6:33

GoogleCodeExporter commented 9 years ago
Are the Â characters there due to incorrect encoding of the HTML file? For 
example, you are trying to load a file originally encoded as UTF-8, but you 
have now opened it using ISO-8859-1.

See: 
http://stackoverflow.com/questions/1461907/html-encoding-issues-character-showing-up-instead-of-nbsp, 
http://ask.metafilter.com/71246/What-the-%C3%82

html2text doesn't deal with character encoding at all (it assumes the input 
source text is already in the correct encoding).
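
To illustrate the kind of mismatch I mean, here is a minimal sketch in Python (not html2text code; the variable names are only for illustration). It assumes the stray character comes from a non-breaking space that was written as UTF-8 and then read back as ISO-8859-1, which may or may not be what is happening in your file.

```python
# A non-breaking space (U+00A0), the character that &nbsp; stands for.
nbsp = "\u00a0"

# UTF-8 encodes U+00A0 as the two bytes 0xC2 0xA0.
utf8_bytes = nbsp.encode("utf-8")

# ISO-8859-1 maps every byte to exactly one character, so the same two
# bytes are read as U+00C2 followed by U+00A0: a stray accented "A"
# in front of the non-breaking space.
misread = utf8_bytes.decode("iso-8859-1")

# Decoding with the encoding the bytes were written in recovers the text.
correct = utf8_bytes.decode("utf-8")

print(repr(misread))
print(repr(correct))
```

If this is the cause, the more robust fix is probably to decode the input with its real encoding before conversion, rather than stripping the extra character afterwards.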

Alternatively, can you post an example of some text that fails to load 
correctly, to go with the patch (and to use in a test)?

Original comment by soundasleep on 24 Jan 2012 at 4:55

GoogleCodeExporter commented 9 years ago
I think this is due to character encoding.

Original comment by soundasleep on 30 May 2014 at 12:38