Convert non-UTF8 encodings to UTF8

OpenVicProject / OpenVic-Dataloader

Dataloader submodule for OpenVic that is responsible for parsing both Paradox Victoria 2 data files and custom OpenVic data files.

MIT License


Loading v2script files currently does not convert their contents to a common standard encoding, so text can display incorrectly on some systems. v2script files need to come out of loading in one standard encoding so that the file loading pipeline produces consistent results. At the same time we must support input encodings like ASCII, Windows-1252, Windows-1251, and UTF8, so it makes the most sense to convert non-UTF8 encodings to UTF8.

That means we need to detect each particular encoding and then provide a kind of conversion database to UTF8. Since for now we only plan to support Windows-1252 and Windows-1251 among the non-UTF8-compatible encodings, it is likely easier and cheaper for us to manage the conversion on our own than to look for a third-party library. However, if the expectation is to support many more encodings instead of flatly rejecting them on detection, then it may become necessary to pull more from hsivonen/chardetng instead. ~~will become necessary instead to integrate unicode/icu. (See icu4c)~~
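As a rough illustration of the "manage conversion on our own" option, here is a minimal C++17 sketch that checks whether a buffer is already valid UTF8 and, if not, converts it through a small Windows-1252 table. Every name in it (`cp1252_high`, `append_utf8`, `is_valid_utf8`, `cp1252_to_utf8`, `to_utf8`) is hypothetical and not part of OpenVic-Dataloader. It also makes two simplifying assumptions: any input that is not valid UTF8 is treated as Windows-1252, and the five bytes Windows-1252 leaves undefined become U+FFFD. Windows-1251 would need its own table plus some heuristic to tell it apart from Windows-1252, which is not shown.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Code points for Windows-1252 bytes 0x80..0x9F. The five bytes that
// Windows-1252 leaves undefined map to U+FFFD here (an assumption).
constexpr std::array<std::uint16_t, 32> cp1252_high = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178,
};

// Append one code point as UTF8. Only the Basic Multilingual Plane is
// handled, which covers everything Windows-1252 can produce.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Strict check: rejects truncated sequences, stray continuation bytes,
// overlong forms, surrogates, and values above U+10FFFF.
bool is_valid_utf8(std::string_view s) {
    static constexpr std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < s.size()) {
        const auto b = static_cast<unsigned char>(s[i]);
        if (b < 0x80) { ++i; continue; }  // plain ASCII byte
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; }
        else return false;
        if (i + len > s.size()) return false;
        for (std::size_t k = 1; k < len; ++k) {
            const auto c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len] || cp > 0x10FFFF) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        i += len;
    }
    return true;
}

// Convert Windows-1252 bytes to UTF8. Outside 0x80..0x9F the byte value
// equals the Unicode code point, so only that range needs the table.
std::string cp1252_to_utf8(std::string_view s) {
    std::string out;
    out.reserve(s.size());
    for (const char ch : s) {
        const auto b = static_cast<unsigned char>(ch);
        const std::uint32_t cp =
            (b >= 0x80 && b <= 0x9F) ? cp1252_high[b - 0x80] : b;
        append_utf8(out, cp);
    }
    return out;
}

// Pass valid UTF8 (including pure ASCII) through unchanged; otherwise
// assume Windows-1252, which is a simplification for this sketch.
std::string to_utf8(std::string_view s) {
    return is_valid_utf8(s) ? std::string(s) : cp1252_to_utf8(s);
}
```

Checking for valid UTF8 first works reasonably well because text in a single-byte encoding rarely happens to form valid multi-byte UTF8 sequences, and ASCII-only files pass through untouched. A full solution along the lines described above would still need a detection step before choosing which table to apply.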

See OpenVicProject/OpenVic-Dataloader#43 (Convert non-UTF8 encodings to UTF8).
