Closed zy6p closed 3 months ago
Thanks very much for this contribution! I made a few changes before merging to master.
I just published v0.6.75, which includes GB18030 detection. I noticed that the original function gave false positives on some other encodings, e.g. Windows 1256 (Arabic), so I added an additional requirement that a certain percentage of characters in the source data should be common Hanzi. If you find a GB18030 dataset that mapshaper fails to detect, please file a bug report!
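For readers curious what a check of this shape might look like, here is a minimal sketch in Python. It is not mapshaper's code (mapshaper is written in JavaScript). The function names, the 0.8 threshold, and the use of GB2312 level-1 characters (lead bytes 0xB0 to 0xD7) as a stand-in for "common Hanzi" are all assumptions made for illustration. The sketch is not claimed to eliminate the Windows-1256 false positives described above.

```python
def is_common_hanzi(ch: str) -> bool:
    """Assumed proxy for "common Hanzi": the character is in GB2312 level 1.

    GB2312 places its 3,755 most frequently used Hanzi in rows 16-55,
    which correspond to lead bytes 0xB0-0xD7.
    """
    try:
        encoded = ch.encode("gb2312")
    except UnicodeEncodeError:
        return False  # not representable in GB2312 at all
    return len(encoded) == 2 and 0xB0 <= encoded[0] <= 0xD7


def looks_like_gb18030(data: bytes, min_ratio: float = 0.8) -> bool:
    """Hypothetical detector: valid GB18030 plus a minimum share of common Hanzi.

    min_ratio is an illustrative threshold, not a value taken from mapshaper.
    """
    try:
        text = data.decode("gb18030")
    except UnicodeDecodeError:
        return False  # byte sequence is not valid GB18030

    non_ascii = [ch for ch in text if ord(ch) > 0x7F]
    if not non_ascii:
        return False  # pure ASCII gives no evidence for GB18030

    common = sum(1 for ch in non_ascii if is_common_hanzi(ch))
    return common / len(non_ascii) >= min_ratio


sample = "北京市朝阳区".encode("gb18030")
print(looks_like_gb18030(sample))
```

The second condition is what separates this from a plain "does it decode?" test: many byte sequences from single-byte encodings also happen to be valid GB18030, so decoding success alone is weak evidence.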
I propose adding automatic GB18030 encoding detection to mapshaper, enhancing its handling of Chinese text in Shapefiles. GB18030, the most comprehensive Chinese character encoding, is essential for accurately processing modern Chinese datasets. This update aims to improve mapshaper's utility for users dealing with Chinese geographic data by ensuring compatibility with a wider range of datasets, including those adhering to this mandated standard. Integrating GB18030 support not only advances mapshaper's capabilities in managing international datasets but also makes it more inclusive and user-friendly for a global user base.