glossarist / iev-data


More Unicode normalization #158

Closed skalee closed 3 years ago

skalee commented 3 years ago

I'm adding more Unicode normalization. Specifically, characters U+2000 through U+2006 are now replaced with regular spaces (U+0020). At least U+2002 (EN SPACE) occurs in the spreadsheets, in the 351-41-xx concepts.

I am going to merge it, but pinging @ronaldtse just in case. These characters are described here: https://www.compart.com/en/unicode/category/Zs.
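
For illustration only, the replacement described above could be sketched roughly as follows. This is not the code from this PR: the function name `normalize_spaces` and the sample string are hypothetical, and Python is used here purely as a neutral language for the sketch.

```python
import re

# U+2000 (EN QUAD) through U+2006 (SIX-PER-EM SPACE); all are in category Zs.
# The pattern name is an assumption made for this sketch.
_WIDE_SPACES = re.compile("[\u2000-\u2006]")


def normalize_spaces(text: str) -> str:
    """Replace each character in U+2000..U+2006 with a regular space (U+0020)."""
    return _WIDE_SPACES.sub(" ", text)


# Hypothetical cell value containing U+2002 (EN SPACE).
sample = "term\u2002definition"
print(normalize_spaces(sample))  # prints "term definition"
```

Other space separators outside this range, such as U+2007 (FIGURE SPACE) or U+00A0 (NO-BREAK SPACE), are left untouched by this sketch, which matches the range stated above.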