Open thiakil opened 6 years ago
I agree with this. One caution: the handling of character encoding throughout the shepherd program chain presumes iso-8859-1, so the whole chain needs updating, not just the grabbers, or weirdness occurs. But this should be a moderate priority.
I think that the grabbers should be modified to output utf-8 instead of iso-8859-1, as there are some characters that can't be encoded in iso-8859-1.
It's rare, but as an example, some SBS texts contain typographical quotes (U+2018, U+2019), which don't exist directly in the iso encoding.
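To illustrate the problem, here is a minimal Python sketch (not shepherd code; the sample title and the `can_encode` helper are made up for the example). It shows that the two quote characters can't be encoded as iso-8859-1, what a lossy fallback does to them, and what happens if one stage of a chain emits utf-8 while a later stage still reads it as iso-8859-1.

```python
# Hypothetical sample title; the curly quotes are U+2018 and U+2019.
title = "\u2018Insight\u2019 on SBS"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(title, "iso-8859-1"))  # False
print(can_encode(title, "utf-8"))       # True

# A lossy fallback silently replaces the unencodable quotes with '?'.
lossy = title.encode("iso-8859-1", errors="replace")
print(lossy)  # b'?Insight? on SBS'

# If only the grabber switches to utf-8 and a later stage still assumes
# iso-8859-1, decoding succeeds but yields garbage instead of the quotes.
utf8_bytes = title.encode("utf-8")
misread = utf8_bytes.decode("iso-8859-1")
print(misread == title)  # False
```

The last step is why changing only the grabbers wouldn't be enough: the mismatch doesn't raise an error, it just corrupts the text further down the chain.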