ShephedProject / shepherd

Shepherd delivers reliable, high-quality Australian TV guide data (EPG).

UTF-8 #3

Open thiakil opened 6 years ago

thiakil commented 6 years ago

I think the grabbers should be modified to output UTF-8 instead of ISO-8859-1, because some characters cannot be encoded in ISO-8859-1.

It's rare, but as an example, some SBS texts contain typographical quotes (U+2018, U+2019), which have no representation in ISO-8859-1.
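
To illustrate the point, here is a minimal Python sketch (not Shepherd code, which is not shown in this thread). The programme title is a made-up sample, and `can_encode` is a hypothetical helper. It checks whether a string containing U+2018 and U+2019 survives each encoding.

```python
# Hypothetical sample text containing typographical quotes (U+2018, U+2019).
text = "SBS \u2018World News\u2019"


def can_encode(s: str, encoding: str) -> bool:
    """Return True if every character in s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


latin1_ok = can_encode(text, "iso-8859-1")  # False: no code point for U+2018/U+2019
utf8_ok = can_encode(text, "utf-8")         # True: UTF-8 covers all of Unicode

# Forcing ISO-8859-1 output loses information: each quote becomes "?".
lossy = text.encode("iso-8859-1", errors="replace")
```

With `errors="replace"`, the quotes are silently replaced, which is the kind of data loss that UTF-8 output would avoid.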

perkins1724 commented 6 years ago

I agree with this. One caution: character-encoding handling throughout the shepherd program chain presumes iso-8859-1, so the whole chain needs updating, not just the grabbers, or weirdness occurs. But this should be a moderate priority.
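
As a rough illustration of that "weirdness", the Python sketch below simulates a grabber that emits UTF-8 while a later stage still assumes iso-8859-1. It is only an assumption about how such a mismatch would look, not Shepherd's actual code, and the title is a made-up sample.

```python
# Hypothetical title as a UTF-8 grabber would emit it.
original = "\u2018Insight\u2019"
utf8_bytes = original.encode("utf-8")

# A downstream stage that still presumes iso-8859-1 misreads each
# three-byte quote (E2 80 98 / E2 80 99) as three separate characters.
misread = utf8_bytes.decode("iso-8859-1")

# The damage is reversible only if nothing re-encodes the text in between.
recovered = misread.encode("iso-8859-1").decode("utf-8")
```

Here `misread` begins with "â" followed by two control characters rather than a single opening quote, which is why every stage in the chain needs to agree on the encoding.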