google / open-location-code

Open Location Code is a library to generate short codes, called "plus codes", that can be used as digital addresses where street addresses don't exist.
https://plus.codes
Apache License 2.0
4.06k stars 471 forks source link

Update specification #463

Closed fulldecent closed 2 months ago

fulldecent commented 3 years ago

This PR creates implementable specifications for Plus Codes and Open Location Codes with minimal changes to what is currently published.

And it removes all contradictory information in this project.

In the process, this solves a majority of open issues in one pull request.

Because of contradictions in the project as it, it is NOT possible to make this into bite-sized PRs. I appreciate your time reviewing this thoroughly.

Normative changes

Other chages

Fixed issues

Makes progress on these issues

:star: Requested action

Follow-on work

After this PR is merged, I can:

And I can help but might need help with:

fulldecent commented 3 years ago

This PR includes a move of all implementations into the Implementations/ folder and this inflates to a scary high Files Changed number. Please review this PR using VS code and you will see there are fewer actual changes. This file moving is required for this PR because it distinguishes the "implementations" from the "demos". And implementations now have a full specification.

I have worked hard to make this a "bite sized" and "minimal" PR.

fulldecent commented 3 years ago

Requesting PR review, please, from @bilst, @bocops, @drinckes


Specifically, requesting help to check:

(Update on 2021-06-07, this ^^ specific item is fixed at 2ce7dca3a8281774aa46c4bb0034bd8a0b94db00 and confirmed to match behavior of existing C implementation.)

bocops commented 3 years ago

I haven't had time to check all of this in detail. The fact that this is a massive change that also moves around unrelated parts makes it hard to do so - and considering that even much smaller PR currently aren't accepted quickly makes me wonder if this repo and its maintainers and contributors are even willing and able to handle a PR of this scope at the moment.

For what it's worth, reading through the rewritten Plus Codes Specification.md, the whole thing feels much less readable and understandable to me. While more strict specifications are appreciated in some places, what this now lacks is a good overview of what a plus code even is, what it looks like, what it can be used for, without immediately jumping into "SHALL NOT", "MUST", and "MAY" legalese.

On another note, the rewritten specs claim to be v1.0.5, a patch-level bump from the previous 1.0.4, but I immediately see some areas where there are larger (if not even breaking) changes to the specification. This means that this new specification, if accepted, should be 1.1.0 or even 2.0.0, which in turn means that there would need to be an even more thorough look at it (and reason for the change in the first place).

To give just one example for a potentially breaking change, areas around the poles are now strictly defined to include or not include the poles themselves. In the case of the south pole, where before all areas having the pole as one of their corner points would include it, now only one of them (for any given code length) does. On top of that, the area that does is the one east of the antimeridian, not the one east of the prime meridian, although the latter ("0°") is typically used when giving coordinates of the poles. While this might look like a rather unimportant detail, it can make the resulting code worse than strictly necessary, as has been laid out in #450.

fulldecent commented 3 years ago

👍 Thank you for looking at my PR, I do really appreciate it.

Poles

I agree that all details here are important and I commit to fix everything to earn acceptance of this PR. Yes, prime meridian is a more customary choice.

✅ Fixed at 8345cf4234d26ffedcfa3201e541875ed8181bf8

Introduction and readability

The current Plus Codes specification also does not include an introduction.

And this PR does include the same introduction file as before.

Is this the file you were looking for?

Here is a potential brief addition at the top of the Plus Codes specification (above the version history):

In much of the world, street addresses are poorly defined or non-existent. Plus Codes allow to encode location information into easily exchangeable codes. For example, 8FVC2200+ includes the area around Romoos, Switzerland. Shorter and longer codes can represent various area sizes, are helpful for navigation and are very convenient for humans to remember and use.

❓ Shall I make that addition?

Massive changes

The PR currently moves all implementations into an Implementations/ folder. This is in contrast to Demos which go to the Demos/ folder. This is significant because the Open Location API Specification applies to "implementations".

❓ Shall I factor out this change to discuss separately?

Version number

You are referring to Semantic Versioning. Thank you, I agree 1.1.0, at a minimum, is appropriate.

🦶 Updated at d7a7a91540e3da9911717a21e980bc57bfe9580b.

❓ Shall I change this 2.0.0?

Corners and edges

I have corrected the specification to include the south-west corner of each area. I tested the C implementation works this way and also added test cases.

✅ Corrected in 2ce7dca3a8281774aa46c4bb0034bd8a0b94db00

Other notes

I am not deterred that other PRs are not accepted quickly.

This project "strongly encourages technical contributions" and I hope this significant technical PR, pushed by somebody committed to see it through, and flexible to get it done, can see the light of day. I am also around to take on more responsibility later when I know good PRs are actively reviewed and acted on. (Specifically, I'd like to update one of the implementations to be a "reference implementation" as a great starting example.)

bocops commented 3 years ago

Re: Introduction/Readability ("Shall I make that addition?"): Perhaps I've just been confused by the amount of change. Let's revisit this at a later point, once all the other pieces have fallen into place, wherever that might be?

Re: Massive changes ("Shall I factor out this change to discuss separately?"): I actually appreciate the enhanced structure of having all implementations but not the random rest in one folder, but I think this might best be handled separately, if only to remove some of the noise from this PR.

Re: Version number: You're right, this repo doesn't explicitly state that it does use semantic versioning, but I think that is the reasonable choice. Whether a minor version bump would be sufficient in my opinion depends (probably among other things) on whether decoding a plus code encoded with 1.0.4 using these new specifications still leads to the same coordinates in all instances. Perhaps let's identify those changes first that would mandate a major version bump, and see which ones are considered absolutely necessary?

fulldecent commented 3 years ago

Massive changes

All file moves were extracted from this PR to discuss another day. (Separate discussion referenced in Follow-On Work on top post.)

✅ Fixed in aef4682c5e50b4ddf0f78d0a49facd8474d7d8b4

Version number

I believe the only normative note is that north and south poles are defined in PR to have a specific Plus Code at each level.

The current spec canon (i.e. wiki, "spec", and other documents in this repo that make statements) refer to the poles as:

Every other input and output is intended to be the same in this PR. And I have tested the C implementation for conformance for hand-checked inputs and it passed. (Commit at 2ce7dca3a8281774aa46c4bb0034bd8a0b94db00.)

Depending on your definition of "change" this PR represents either a "change" for one location and a new specification for the other; or it represents a "don't care" for one location and a "resolving conflict" for the other.

Next is a more philosophical note, maybe. The proposed spec uses math to define Plus Codes and warns about math bit precision. And the current spec canon uses a pseudocode as a spec. The current spec canon will produce different results on 64-, 32- and 16-bit (think embedded) systems. If this causes the PR to produce different results (on 32-, 16-bit systems) than the current spec canon, I consider it a feature not a bug.

bocops commented 3 years ago

I believe the only normative note is that north and south poles are defined in PR to have a specific Plus Code at each level. [...] Depending on your definition of "change" this PR represents either a "change" for one location and a new specification for the other; or it represents a "don't care" for one location and a "resolving conflict" for the other.

As far as the poles are concerned, I admit that this is a somewhat obscure detail - but as this specification rewrite is all about being precise and strict, we shouldn't be too handwavy about the details. As has already been stated in #450, currently someone might want to encode "the south pole as well as nearby locations along the anti-meridian", and end up with a plus code of 22222222+. Decoding this plus code under 1.0.4, the result will be an area that includes the south pole. Unter this new specification, it explicitly does not.

Now, I'm still not sure whether having a perfect tiling in the mathematical sense is important enough to go through all this - but if we do, we should be clear about the fact that at least one plus code decodes to different things under 1.0.4 specs and new specs.

Of course, there might also be other details outside just encoding/decoding that lead to certain version bumps, like the actually required API methods, or what regex is being suggested as matching valid codes.

fulldecent commented 3 years ago

Thank you for all the notes. Below is a summary of (recently) resolved and (every) remaining items.

Poles and breaking changes

Breaking change warning added.

:white_check_mark: Fixed at ffe0a3a4f4da2a15aa4b2fef72e7855d23a547be

Required API methods

Also I moved isValid, which regards full+short codes from required to optional. This is consistent with all other things relating to short codes being optional. Separately, no implementation implements isValid currently as specified (see below).

Breaking change warning added.

:white_check_mark: Fixed at ce6bb6db03524857f7b3762b5c6d3236b02428b3

Replaced specifications:

Regex

Agreed, there is a slight difference in the isFull, isShort, isValid methods. Now they actually must validate the Plus Codes. This is easy because a PCRE implements this exactly.

No implementations implement the current required API. Specifically, they all check for the format separator (+) which is not part of the spec.

The old regexes were wrong and/or conflicting and removed.

The new Regexes exactly match valid Plus Codes, the spec, and the plain English meaning of the method name "is valid".

Breaking change warning added.

:white_check_mark: Fixed at daa1c353daef1818f10e797e22766e62cb21fd9c

Copyright

Thank you! Deleted.

:white_check_mark: Fixed at d7c6d2507ce49c2a0ef0d7c1d5249ef4b668db37

Code length lookup

:white_check_mark: Fixed at a2b4168

"Precision"

A more specific wording describes the geometry of the Plus Code area rather than the unclear word "precision".

:white_check_mark: Fixed at 33044f51f05234c96da1600ed24db9b5f46100a7

Earth

Normative reference to Earth is removed.

✅ 79ca2c2c94a42f7a44edeb0e52fc5e9524aac846

Open items:

fulldecent commented 3 years ago

@bocops Could you please share any thoughts on the open items immediately above?

bocops commented 3 years ago

Sorry for the late reply. :)

Regarding required vs. optional methods, I think we agree that all current functionality regarding the handling of full codes needs to stay required. According to current API.txt, that is: isFull, encode, decode

I'm not sure that making all functionality related to short codes completely optional is the correct thing to do. At the very least, contributors should be strongly encouraged to offer this functionality in their implementations. Whether we decide for or against this change, this would affect these methods: isShort, shorten, recoverNearest

The remaining isValid is a special case. Even if we decide to make short code handling optional, a short code is still a valid code, so if this method name stays, it should behave the same across all implementations, independent of whether a certain implementation does or doesn't handle short codes. At that point, discussing the removal of shorten-related functionality becomes somewhat moot.

In the end, because of this, and because the main problem with short codes seems to be not their existence but the fact that the main publisher of short codes chooses reference locations rather opaquely, I believe that all seven methods might as well stay required.

Regarding the question of required code length support, I assume this means full code lengths outside of shortening? A while ago, we had a lengthy discussion about what code lengths are sensible in the first place, and the decision was made to restrict code length to <=15, because anything beyond that doesn't really have any real world use cases.

An argument could be made that codes referencing an area of <= 10cm² (code length 14, 15) aren't that sensible in practice, either - but unless those fall into the range of "technically impossible on specific systems", I don't think we should make some lengths required but others optional. If there are precision problems, we should rather define how an implementation should deal with them - for example, should decoding a "too long" code simply fail, or should it return the next-bigger area?

Regarding safety margins when shortening, I think the best description so far can be found on the wiki page Guidance for shortening codes, although I'd prefer to not simply hand out magic numbers like 0.4, but explain a bit better why we're not going for the theoretical full range of 0.5.

fulldecent commented 2 years ago

I am considering to update the PR and make it:

If I implement all these things, does that clear the threshold for merging this PR?

fulldecent commented 2 years ago

@bocops @drinckes Can you please confirm that if I make the changes above https://github.com/google/open-location-code/pull/463#issuecomment-945004628 and create a reference implementation that this PR may be merged?

bocops commented 2 years ago

@fulldecent This is a decision for @drinckes or potentially @bilst to make. I'm still not convinced that all of the breaking changes are really necessary, but in the end this is their project. Either way, I agree it would be nice to see some closure here, this has gone on long enough.

fulldecent commented 2 years ago

@drinckes @bilst, please advise.

I'd like to update these three things if you think it will be mergable.

After merged I will work on a reference implementation.

bilst commented 2 years ago

Sorry for the slow reply! I think you point out some interesting theoretically valid technical issues, but given that they haven't been an actual problem for any users, I'm not sure it's worth the added complexity/formality to call them out specifically.

Easiest PRs to accept are small scope improvements to clarity (e.g. editing a paragraph or two in a single file). Any increased complexity of documentation or added requirements need to be weighed carefully against their potential to discourage adoption by GIS folks who might not be as comfortable working with more formal specifications.

fulldecent commented 2 years ago

@bilst

issues... haven't been an actual problem for any users

Instead of considering the problems of users (which is super limited, because few people use Plus Codes), please consider the problems of potential users.

There are many people that don't use Plus Codes. Perhaps this is because it has problems and it is not possible to correctly implement Plus Codes.

People that develop products (even GIS folks) like correct specifications and example code. On the other hand, people that start developing products, but never finish... those are the people that like vague and inconsistent specifications.

This PR addresses these issues, holds Plus Codes in the highest light, and lays the path forward for multiple compatible and correct implementations.


I will not break this PR up into smaller pieces because too many things are broken as-is.

You are welcome to close this PR without merging if you still do not have the time to provide a proper review.

drinckes commented 2 months ago

I'm not even going to try to understand a single PR of this size.

For future reference, I would rather have multiple (i.e. small) PRs for a single issue, than one PR that attempts to address multiple issues.