fossgis / osmdata

Scripts for creating OSM data derived data sets
GNU General Public License v3.0

Plata issue may be resolved -- can we have a manual override for the rest of the world? #12

Closed: JesseWeinstein closed this 4 years ago

JesseWeinstein commented 4 years ago

It appears the Rio de la Plata issue may be moving towards resolution (and, at least for now, it will not contribute to the total size of diffs) -- but over the last six months enough other (uncontroversial) changes have accumulated that the total is still over the cutoff. Could you do a manual override so these other changes can be applied?

Even if the Plata issue rises up again, it will at least not get conflated with all the other changes.

JesseWeinstein commented 4 years ago

As you can see in the linked ticket, a (maybe the most) prominent downstream user of this data has requested that the problem (the lack of updates for six months) be fixed upstream. As I understand it, that means running the https://github.com/fossgis/osmdata/blob/master/master/release-coastline.sh script.

Could you let me know if that will be done, or if not, what other process you'd prefer to see in order to move forward?

JesseWeinstein commented 4 years ago

As of the July 23 data, the difference is only 36,491 sq km (out of more than 509 million sq km), a ratio of 0.000071650. That is still well over the permitted ratio (0.0000015), which allows no more than around 765 sq km to have changed.
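The arithmetic behind those figures can be checked with a minimal sketch. It assumes a rounded total of 509 million sq km (the comment only says "more than 509 million"), so the results differ slightly from the quoted ratio and limit. The function names are hypothetical, and this is not necessarily how the release script itself computes the comparison.

```python
# Figures quoted in the comment above. The total is an assumed round value,
# since the comment only says "more than 509 million sq km".
TOTAL_AREA_SQ_KM = 509_000_000
CHANGED_AREA_SQ_KM = 36_491
MAX_RATIO = 0.0000015  # ratio quoted in the comment as the cutoff


def change_ratio(changed_sq_km: float, total_sq_km: float) -> float:
    """Fraction of the total area that differs from the last release."""
    return changed_sq_km / total_sq_km


def max_allowed_change(total_sq_km: float, max_ratio: float) -> float:
    """Largest changed area (sq km) that still stays within the cutoff."""
    return total_sq_km * max_ratio


ratio = change_ratio(CHANGED_AREA_SQ_KM, TOTAL_AREA_SQ_KM)
limit = max_allowed_change(TOTAL_AREA_SQ_KM, MAX_RATIO)

print(f"ratio of changed area: {ratio:.9f}")
print(f"allowed change:        {limit:.1f} sq km")
print(f"over the cutoff by a factor of {ratio / MAX_RATIO:.1f}")
```

With these inputs the changed area exceeds the cutoff by a factor of between 47 and 48, which is why small fixes alone would not bring it under the threshold.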

I'm still working on identifying the minimal set of temporary reversions needed to address those 36k sq km -- but getting a manual reset would save a lot of work identifying them, making the temporary reverts, then slowly restoring them over presumably about the next month.

woodpeck commented 4 years ago

Fiddling with the world-wide OSM coastline hoping to "ease in" changes to circumvent the script's threshold is not the right approach and will get you blocked on OSM. Don't do it.

JesseWeinstein commented 4 years ago

If reducing the magnitude of the changes since Jan 9, 2020 is not the right approach according to you -- what is? And will you be implementing it, @woodpeck ?

JesseWeinstein commented 4 years ago

I suppose another alternative, after identifying a sufficient set of changes since Jan 9 that the remaining changes are less than 765 sq km, would be to document those changes, and their uncontroversial and justified status, on the wiki -- then appeal once again for a manual blessing of them. Is that more what you were thinking of, @woodpeck ?

HolgerJeromin commented 4 years ago

ref #7

joto commented 4 years ago

@JesseWeinstein You don't need to document anything or count square km. Simply mapping the coastline where it actually is, is the right thing to do here, as always in OSM. Once there is a sufficiently well mapped coastline, I'll release it.

JesseWeinstein commented 4 years ago

@joto Thank you for responding, and I'm sorry that none of the efforts I've been trying to make to understand the situation or resolve this problem have been helpful.

As I understand the situation, right now, the July 23rd data did map the coastline "where it actually is", both around Rio de la Plata, and elsewhere in the world. Since that was apparently not "a sufficiently well mapped coastline" -- could you be more specific about what was wrong with it?

joto commented 4 years ago

I have now manually "released" the coastline data due to the bad mess we are in right now. But I still have reservations about its correctness, for instance in northern India and on the US east coast. I don't want to be responsible for keeping the coastline mapped correctly and, apparently, nobody else wants to be either.

imagico commented 4 years ago

> for instance in northern India

You probably mean the Indus delta and the Rann of Kutch.

For reference: https://www.openstreetmap.org/changeset/84286303

JesseWeinstein commented 4 years ago

Thank you so much for doing the manual release -- I am sorry for bugging you so much about it, and I totally sympathize with the frustration of having to act as the final arbiter of coastline correctness.

I suggest the following process:

  1. All changes over ~750 sq km within a day should be presumed incorrect. (This is already the case, as various people have said elsewhere)
  2. If such a change persists for more than one day (i.e. two failed updates in a row), a bunch of automated alerts should be sent everywhere from the mailing lists to IRC to wherever else we can think of, encouraging people to revert the changes.
  3. If a mapper believes such a large change to be valid (as in the Rio de la Plata case), the proper process (enforced by prompt reverts and blocking if necessary) is to create a wiki page (probably using the formal proposal process, or something similar), demonstrate consensus for the change (over at least a month, if not longer), get your explicit acknowledgement that you recognize that consensus, and only then apply the change (which you will promptly manually release).

This allows you to avoid being a single point of judgement (except on whether sufficient consensus has been demonstrated), and avoids such long backlogs of changes as we've seen recently. If this seems worth pursuing, I can write it up elsewhere.
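Steps 1 and 2 of the process above could be sketched roughly as follows. This is only an illustration under stated assumptions: `DailyUpdate`, `classify`, and the status names are hypothetical, and `changed_sq_km` is assumed to be the difference between that day's candidate coastline and the last released one. Sending the actual alerts and the consensus process in step 3 are left out.

```python
from dataclasses import dataclass

DAILY_LIMIT_SQ_KM = 750.0     # "~750 sq km" from step 1
FAILED_RUNS_BEFORE_ALERT = 2  # "two failed updates in a row" from step 2


@dataclass
class DailyUpdate:
    date: str
    changed_sq_km: float  # assumed: difference against the last released data


def classify(updates: list[DailyUpdate]) -> list[tuple[str, str]]:
    """Label each daily update as 'released', 'held', or 'alert'."""
    results = []
    consecutive_failures = 0
    for update in updates:
        if update.changed_sq_km <= DAILY_LIMIT_SQ_KM:
            # Within the limit: release and reset the failure counter.
            consecutive_failures = 0
            results.append((update.date, "released"))
        else:
            # Over the limit: presumed incorrect, so the update is held back.
            consecutive_failures += 1
            if consecutive_failures >= FAILED_RUNS_BEFORE_ALERT:
                results.append((update.date, "alert"))
            else:
                results.append((update.date, "held"))
    return results


# Hypothetical sample data, not taken from any real update run.
sample = [
    DailyUpdate("day 1", 100.0),
    DailyUpdate("day 2", 900.0),
    DailyUpdate("day 3", 950.0),
    DailyUpdate("day 4", 10.0),
]
for date, status in classify(sample):
    print(date, status)
```

In this sketch a single oversized change is only held back, and alerts start on the second consecutive failure, so a quickly reverted mistake would not notify anyone.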