Open logological opened 7 years ago
On reviewing the code and documentation, the code seems to be working (or not working, as the case may be) as intended. The post-x
"encoding" assumes ASCII input, not UTF-8, and is clearly documented as such. So what we are seeing here is a case of GIGO (garbage in, garbage out).
That said, there is a potential use case for being able to convert from UTF-8-encoded text that uses {pre,post}-{x,h,caret} transliteration, or HTML entities. The problem is that eoconv conflates transliteration schemes with computer character encodings. The proper solution is to allow the user to separately specify the input and output transliteration schemes, and the input and output character encodings.
Dmitry Bogatov reports that eoconv corrupts UTF-8-encoded text: