google / open-location-code

Open Location Code is a library to generate short codes, called "plus codes", that can be used as digital addresses where street addresses don't exist.
https://plus.codes
Apache License 2.0

Non-Roman Scripts #138

LemmaEOF closed 5 months ago

LemmaEOF commented 6 years ago

Currently, Plus Codes only work with Roman characters A-Z and Arabic numerals 0-9. While this isn't an issue in countries whose languages use Roman script, it can be incredibly alienating to anyone whose primary or only languages use non-Roman scripts, and a huge barrier to understanding and ease of use for users who can't read Roman script. What should Plus Codes do for non-Roman scripts? Should they just use character substitution, or a different system altogether?

drinckes commented 6 years ago

Hi @Boundarybreaker. Yeah, we thought a lot about this. We talked to a bunch of people from countries with other scripts, and they pointed out that they already use A-Z0-9 to enter URLs, so they didn't perceive it as an issue.

A-Z0-9 is either the first or second choice of input characters for most people, and we wanted, initially at least, to have a single representation (otherwise you end up navigating to 9GHJ+P8, seeing a sign with 9ГХЙ+П8, and wondering if you got to the right place or not).
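To make the "two representations" problem concrete, here is a minimal sketch of plain character substitution. The mapping is only what can be read off the 9GHJ+P8 / 9ГХЙ+П8 example above (G→Г, H→Х, J→Й, P→П); it is an illustrative assumption, not anything the library defines, and the names `localize` and `delocalize` are hypothetical.

```python
# Hypothetical Latin -> Cyrillic substitution, inferred only from the
# 9GHJ+P8 / 9ГХЙ+П8 example in this thread. Not part of the OLC library.
LATIN_TO_CYRILLIC = {"G": "Г", "H": "Х", "J": "Й", "P": "П"}
CYRILLIC_TO_LATIN = {v: k for k, v in LATIN_TO_CYRILLIC.items()}

# str.maketrans accepts a dict of single characters to replacement strings.
_TO_CYRILLIC = str.maketrans(LATIN_TO_CYRILLIC)
_TO_LATIN = str.maketrans(CYRILLIC_TO_LATIN)


def localize(code: str) -> str:
    """Return the code with mapped Latin letters swapped for Cyrillic ones."""
    return code.translate(_TO_CYRILLIC)


def delocalize(code: str) -> str:
    """Map a localized code back to the single Latin representation."""
    return code.translate(_TO_LATIN)


if __name__ == "__main__":
    sign = localize("9GHJ+P8")
    print(sign)
    # The two strings name the same place but do not compare equal,
    # which is the confusion described above.
    print(sign == "9GHJ+P8")
    print(delocalize(sign) == "9GHJ+P8")
```

Digits and the `+` separator pass through unchanged, so the round trip only works as long as every tool along the way knows the same mapping.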

At the moment we don't have any evidence that it's a significant barrier, and we've chosen an initial set of countries that include other scripts to launch in (RU, IN, SO) so that we can monitor the situation and get feedback.

Open to suggestions though - I'll leave this issue open in case anyone else would like to comment.

bocops commented 6 years ago

@drinckes Just as an observation, seeing the '9' and '8' characters in the correct place of a potential different-script plus code makes it immediately more trustworthy to me.

I agree that it might be best to use just one character set - but if using localized variants is deemed necessary at some point in the future, mapping individual characters to similar looking ones (like 'M' to 'П', 'N' to 'Й', 'X' to 'Х') might be a good idea.

drinckes commented 6 years ago

@bocops ha, so we've already been through a cycle of thinking about Cyrillic (several people on the team are familiar with it).

One problem is that looking similar != sounding similar. Cyrillic Р looks like Roman P but is pronounced "er", and that raises the question of do you use it in place of P? Or R? Or neither?

If we can only use symbols that both look distinct from the Roman characters and sound different from them, we may end up without a lot of choices in some scripts.
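A rough sketch of the visual half of that constraint: start from a script's letters and drop everything that could be mistaken for a Roman letter. The lookalike list below is my own assumption for Russian uppercase letters, not a list from the project, and the "sounds different" half is not modelled at all, so the real pool of candidates would be smaller.

```python
# Russian uppercase alphabet (33 letters).
RUSSIAN_UPPERCASE = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"

# ASSUMPTION: Cyrillic letters that are easily mistaken for a Roman letter.
# These are Cyrillic code points, even though they look like A, B, E, ...
LATIN_LOOKALIKES = set("АВЕКМНОРСТХ")


def visually_distinct(script: str, lookalikes: set) -> list:
    """Keep only the letters that cannot be confused with a Roman letter."""
    return [ch for ch in script if ch not in lookalikes]


if __name__ == "__main__":
    candidates = visually_distinct(RUSSIAN_UPPERCASE, LATIN_LOOKALIKES)
    print(len(candidates), "".join(candidates))
```

Under this assumed list, Cyrillic Р is excluded outright, which sidesteps the "P or R?" question but also shows how quickly the usable set shrinks before pronunciation is even considered.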

bocops commented 6 years ago

Right - another problem is that choosing either similar-looking or similar-sounding characters will likely lead to an unsorted character set. I know that GX00+ is adjacent to HX00+, because 'G' and 'H' are consecutive letters of the Latin alphabet. The same will probably not be the case with their replacement characters, however chosen.
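The ordering concern can be checked mechanically. This sketch takes the 20-character Open Location Code alphabet, applies the substitution implied by the 9ГХЙ+П8 example earlier in the thread (an illustrative assumption, not a proposed mapping), and tests whether the symbols are still in ascending code point order. The helper names are hypothetical.

```python
# The Open Location Code digit alphabet, in ascending value order.
OLC_ALPHABET = "23456789CFGHJMPQRVWX"

# ASSUMPTION: substitution read off the 9GHJ+P8 / 9ГХЙ+П8 example.
HYPOTHETICAL_MAP = {"G": "Г", "H": "Х", "J": "Й", "P": "П"}


def substitute(alphabet, mapping):
    """Replace mapped symbols, leaving all other symbols unchanged."""
    return [mapping.get(ch, ch) for ch in alphabet]


def is_sorted(symbols):
    """True if every symbol sorts strictly before the next one."""
    return all(a < b for a, b in zip(symbols, symbols[1:]))


if __name__ == "__main__":
    localized = substitute(OLC_ALPHABET, HYPOTHETICAL_MAP)
    print(is_sorted(OLC_ALPHABET))  # original alphabet
    print(is_sorted(localized))     # after substitution
```

With this mapping, H becomes Х, which sorts after Й (the replacement for J) in the Cyrillic alphabet, so a reader can no longer infer which codes are neighbours from alphabetical order alone.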

fulldecent commented 3 years ago

I recommend closing this issue.

Breaking changes of this magnitude are, in my view, out of scope.

bocops commented 2 years ago

There hasn't been any meaningful discussion of this issue in over four years. Meanwhile, the FAQ states:

Q: Why is Open Location Code based on Latin characters?

A: We are aware that many of the countries where Open Location Codes will be most useful use non-Latin character sets, such as Arabic, Chinese, Cyrillic, Thai, Vietnamese, etc. We selected Latin characters as the most common second-choice character set in these locations. We considered defining alternative Open Location Code alphabets in each character set, but this would result in codes that would be unusable to visitors to that region.

This means the issue effectively seems resolved as "feature not planned" by now. My suggestion would be to also add the above to the FAQ in the Wiki, then close this issue.

drinckes commented 5 months ago

Done, updated FAQ.

dpiauxdda commented 5 months ago

Four years, but you have no patience. It takes time to test these things, man.