google / open-location-code

Open Location Code is a library to generate short codes, called "plus codes", that can be used as digital addresses where street addresses don't exist.
https://plus.codes
Apache License 2.0

Plus Codes regex is incorrect #438

Closed fulldecent closed 3 months ago

fulldecent commented 3 years ago

In https://github.com/google/open-location-code/wiki/Supporting-plus-codes-in-your-app it is stated:

Global codes can be recognised and extracted from a query using a regular expression:

However, a particular global code taken from the test cases, "849VGJQF+VX7QR3J" (it appears there as "15,849VGJQF+VX7QR3J", where 15 is the code length), does not match that.

fulldecent commented 3 years ago

Recommended tag: documentation

bocops commented 3 years ago

An underlying issue here seems to be that "global code" is not actually a well-defined term in OLC. We have "full codes" and "short codes", but not "global codes" outside of the wiki.

That terminology was more prominently used on plus.codes for a while, basically meaning "full code with either 10 or 11 digits", because those are the most useful ones in everyday use. It is no longer used as much; I could only find one reference to it in the FAQ.

Depending on whether this seems like a useful term to continue using, what should be done is either:

or:

fulldecent commented 3 years ago

Thank you for the explanation.

Additionally, that regex fails to match codes of length less than 8.

I am taking your terminology notes into consideration and incorporating them into a plus code specification and an Open Location specification. These two specifications would do well to replace all normative references, which are currently spread out across this project in docs, wikis, FAQs, source code comments and undocumented implementation details.

bocops commented 3 years ago

Additionally, that regex fails to match codes of length less than 8.

Whether or not we replace "global code" with "full code" or define that term as specifically meaning a 10/11-digit full code, that regex does not fail to match short codes because that is not the intent in the first place. Not in a section that deals with some variant of unabbreviated codes.

You might actually not be talking about shortening (to less than 8 characters) but about padding (in which case the length is still 8) - but that is opening a whole other can of worms in a context that seems to be all about using full plus codes as addresses or other forms of "point-like" locations. I don't think that detecting padded plus codes is within the scope of that exact regex.
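
To make the distinction between shortening and padding concrete, here is a minimal Python sketch. It is an illustration only, not part of the library: the function name `classify` is hypothetical, it looks only at the position of the plus sign and the presence of `0` padding, and it does not validate the characters. The short code `GJQF+VX` is an assumed example, made by dropping the first four characters of a full code.

```python
def classify(code: str) -> str:
    """Roughly classify a plus code as 'short', 'padded' or 'full'.

    Hypothetical helper for illustration; it does not check the alphabet
    or any other validity rule.
    """
    plus = code.find("+")
    if plus == -1 or plus > 8:
        return "unknown"
    if plus < 8:
        # Shortened: fewer than 8 characters before the plus sign.
        return "short"
    if "0" in code[:plus]:
        # Padded: still 8 characters before the plus, some of them '0'.
        return "padded"
    return "full"


print(classify("GJQF+VX"))           # shortened code
print(classify("22220000+"))         # padded code
print(classify("849VGJQF+VX7QR3J"))  # full code
```

A regex that requires eight non-padding characters before the plus sign rejects both the short and the padded form, which is consistent with a section that is only about unabbreviated, point-like codes.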

fulldecent commented 3 years ago

That is a fair assessment.

My takeaways here are that

  1. All normative statements in this project need to be moved into an actual specification.
  2. A "global code" or a "default" length should be understood as 10 digits (or possibly 11).

fulldecent commented 3 years ago

This is fixed at https://github.com/google/open-location-code/pull/463

All references to "Global" codes and other deprecated naming are gone.

All references to parts of a Plus Code like 2222 are updated to show a valid Plus Code like 22220000+ without breaking the narrative.

Additionally, it references a PCRE that fully implements the isValid, isShort and isFull functions.

drinckes commented 3 months ago

This issue has rapidly gone off topic and I'm going to drag it back to the regex.

The regex on the doc page requires two or three characters after the plus sign, for a maximum length of 11 characters. The specification gives precisions for codes of up to 15 characters, so the regex should allow that.
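
A minimal Python sketch of that change follows. Both patterns are assumptions that only mirror the shape described in this thread (eight characters, a plus sign, then a suffix); they are not the wiki regex verbatim. The names `NARROW`, `WIDE` and `matches` are hypothetical, and for simplicity the sketch allows any character of the code alphabet in every position.

```python
import re

# The 20-character Open Location Code alphabet.
ALPHABET = "23456789CFGHJMPQRVWX"

# Assumed shape of the documented pattern: 8 characters, '+', then 2 or 3.
NARROW = re.compile(rf"[{ALPHABET}]{{8}}\+[{ALPHABET}]{{2,3}}")

# Widened sketch: up to 7 characters after '+', i.e. up to 15 in total.
WIDE = re.compile(rf"[{ALPHABET}]{{8}}\+[{ALPHABET}]{{2,7}}")


def matches(pattern: re.Pattern, code: str) -> bool:
    """Return True if the whole string matches the given pattern."""
    return pattern.fullmatch(code.upper()) is not None


code = "849VGJQF+VX7QR3J"  # the 15-character code quoted in this issue
print(matches(NARROW, code))  # False: 7 characters follow the plus sign
print(matches(WIDE, code))    # True
```

The only difference between the two patterns is the upper bound on the suffix length, so codes with two or three characters after the plus sign match both.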