isogr / register-system-transition

Covers the GR system transition from the 2013 Java-based version to the 2023 static-site-based version

Files encoded as ASCII with Unicode entities #15

Closed strogonoff closed 5 months ago

strogonoff commented 5 months ago

In this file, there’s this line:

  - "\xB5m/m/a"

Since files are encoded in UTF-8, it can be written normally:

  - "µm/m/a"

I'm not sure yet whether this may cause bugs in the future. The string is displayed correctly on the front end for now; I suppose the JS YAML deserializer handles these escape sequences.

strogonoff commented 5 months ago

May be related: https://stackoverflow.com/questions/10648614/dump-in-pyyaml-as-utf-8
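
For illustration, here is a minimal sketch of the behaviour discussed in that link, assuming the dump script uses PyYAML (the thread does not show the script itself, so the variable names here are hypothetical). By default PyYAML escapes non-ASCII characters inside a double-quoted scalar; passing `allow_unicode=True` writes them literally. Both forms should load back to the same string.

```python
import yaml  # PyYAML (third-party); assumed to be what the dump script uses

# Hypothetical sample value, taken from the line quoted in this issue.
values = ["µm/m/a"]

# Default behaviour: the micro sign (U+00B5) is emitted as an escape
# sequence inside a double-quoted scalar.
escaped = yaml.safe_dump(values)

# With allow_unicode=True the character is written as-is.
literal = yaml.safe_dump(values, allow_unicode=True)

# The escaped form contains the backslash sequence, the literal form does not.
assert "\\xB5" in escaped
assert "µ" in literal and "\\x" not in literal

# Both representations deserialize to the same Python string.
assert yaml.safe_load(escaped) == yaml.safe_load(literal) == values
```

When writing the result to disk, opening the file with `encoding="utf-8"` keeps the output independent of the platform's default encoding.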

stefanomunarini commented 5 months ago

I ran the script with Unicode characters allowed and tested the dataset using Paneron, and everything is working correctly. I pushed the data to the gr-registry and updated the dump script.