whosonfirst-data / whosonfirst-data

Who's On First is a gazetteer of places.
http://www.whosonfirst.org/

Add new wof:geom_alt property #1793

Closed stepps00 closed 4 years ago

stepps00 commented 4 years ago

WOF records have various properties referencing their respective alt files, including:

There is currently no property that lists all alt filenames in a record's directory. Such a property would be helpful when loading records in tools like the Who's On First Editor.

For the locality of San Francisco, the proposed property would look like:

"wof:geom_alt":[
  "85922583-alt-mapzen-land.geojson",
  "85922583-alt-mapzen.geojson",
  "85922583-alt-quattroshapes_pg.geojson"
]

These values are based on all existing alt files in the directory (does not include itself).
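As a rough illustration of how such a list might be derived, the sketch below filters a directory listing for files matching the `{id}-alt-*.geojson` pattern seen in the example above. The function names are hypothetical and are not part of any existing WOF tool.

```python
import os
import re


def filter_alt_filenames(filenames, wof_id):
    """Return the alt-file names for wof_id, sorted, excluding the primary file.

    Assumes alt files are named "<id>-alt-<label>.geojson", as in the
    San Francisco example; this is a sketch, not an official WOF helper.
    """
    pattern = re.compile(rf"^{wof_id}-alt-.+\.geojson$")
    return sorted(name for name in filenames if pattern.match(name))


def list_alt_filenames(directory, wof_id):
    """Apply the same filter to the contents of a record's directory."""
    return filter_alt_filenames(os.listdir(directory), wof_id)
```

The primary `85922583.geojson` file never matches the pattern, so the record does not list itself.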

cc @nvkelso @thisisaaronland for thoughts

nvkelso commented 4 years ago

@thisisaaronland @iandees looks like we settled on including .geojson in revgeo for some reason... but by the time these are imported into a database it seems odd to me to include the file extensions.

missinglink commented 4 years ago

We could also omit the 85922583-alt- prefix as this is a convention which is easily generated from a pattern in code-land?

FWIW I'm fine with either, it's trivial for a consumer to split the filename from the extension and it's robust in that the code would work equally well if it were included or omitted.

The extension is also an implicit convention: all features in WOF are GeoJSON, so unless you're planning on changing that it's probably not required?
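A minimal sketch of the consumer-side handling described here: a normalizer that returns the same label whether or not the `.geojson` extension and the `{id}-alt-` prefix are present. The function name `alt_label` is an assumption made for illustration.

```python
def alt_label(value, wof_id):
    """Reduce a wof:geom_alt entry to its bare label.

    Works whether the value is a full filename
    ("85922583-alt-mapzen-land.geojson") or already a label ("mapzen-land").
    Hypothetical helper; not taken from an existing WOF library.
    """
    label = value
    suffix = ".geojson"
    if label.endswith(suffix):
        label = label[: -len(suffix)]
    prefix = f"{wof_id}-alt-"
    if label.startswith(prefix):
        label = label[len(prefix):]
    return label
```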

thisisaaronland commented 4 years ago

Suffixes (.geojson) should not be included. Nor should the ID-alt- prefixes. The convention we settled on is:

{SOURCE}-{FUNCTION}-{EXTRAS}

Where:

Related: https://github.com/whosonfirst/go-whosonfirst-uri/blob/master/uri_test.go
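One way a consumer might split a label of this shape into its parts is sketched below. The splitting rule (at most two hyphen splits, with function and extras optional) is an assumption made for illustration; the linked `go-whosonfirst-uri` package is the authoritative reference for how labels are actually parsed.

```python
from typing import NamedTuple, Optional


class AltLabel(NamedTuple):
    """Parts of a {SOURCE}-{FUNCTION}-{EXTRAS} label (hypothetical type)."""
    source: str
    function: Optional[str] = None
    extras: Optional[str] = None


def parse_alt_label(label):
    """Split a label on hyphens into source, function and extras.

    Assumes the source itself contains no hyphen (e.g. "quattroshapes_pg"),
    and that anything after the second hyphen belongs to extras.
    """
    parts = label.split("-", 2)
    if not parts[0]:
        raise ValueError("alt label must start with a source")
    return AltLabel(*parts)
```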

nvkelso commented 4 years ago

Just to confirm, it'll read:

"wof:geom_alt":[
  "mapzen-land",
  "mapzen",
  "quattroshapes_pg"
]

And we'll update the other ones (revgeo et al) to follow the same format?

iandees commented 4 years ago

I don't think that having a list of documents to look for alternate geometries in will make it easier to build WOF Editor. It might actually make it harder, because you'd still have to figure out where the documents are and also update this list in the primary place document.

A consistent and documented method for discovering the alternate geometry GeoJSON documents based on the data in the primary document would be better. Additionally, a description of the expected business logic for these alternate geometry documents would be helpful, too. For example, I didn't know that if someone edits a default geometry with the WOF Editor it would need to be stored as an alternate geometry.

missinglink commented 4 years ago

Also worth mentioning that the current SQLite schema uses WOFID as the primary key and so doesn't support alt geoms, but it certainly could by adding an additional table.
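To make the "additional table" idea concrete, here is a hypothetical schema sketch using Python's built-in `sqlite3` module. The table and column names are assumptions and do not describe the schema of any existing WOF SQLite tooling; the point is only that a composite key of ID plus alt label allows several geometries per WOF ID.

```python
import sqlite3


def create_schema(conn):
    """Create a primary features table and a separate alt-geometry table."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS features (
            id   INTEGER PRIMARY KEY,   -- WOF ID, one row per record
            body TEXT NOT NULL          -- GeoJSON document as text
        );
        CREATE TABLE IF NOT EXISTS alt_geometries (
            id        INTEGER NOT NULL REFERENCES features(id),
            alt_label TEXT    NOT NULL, -- e.g. "mapzen-land"
            body      TEXT    NOT NULL,
            PRIMARY KEY (id, alt_label) -- many alt geometries per WOF ID
        );
        """
    )


def add_alt_geometry(conn, wof_id, alt_label, body):
    """Store one alternate geometry document for a WOF ID."""
    conn.execute(
        "INSERT INTO alt_geometries (id, alt_label, body) VALUES (?, ?, ?)",
        (wof_id, alt_label, body),
    )
```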

thisisaaronland commented 4 years ago

@missinglink https://github.com/whosonfirst/go-whosonfirst-sqlite-features/blob/master/tables/geojson.go#L100

thisisaaronland commented 4 years ago

@iandees "A consistent and documented method for discovering the alternate geometry GeoJSON documents based on the data in the primary document would be better."

Perhaps I am misunderstanding, but that is the function of the wof:geom_alt property.

I am not sure what you mean by "expected business logic" for alternate geometries. There is no expected business logic, per se. That is left up to consumers. The point is to be able to:

missinglink commented 4 years ago

For consumers it's important that there is referential integrity within the features.

So if a feature references something else it would be nice if tooling was able to check that reference is valid rather than passing the check/discovery logic on to consumers to implement.
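A check of this kind could look roughly like the sketch below, which reports any `wof:geom_alt` label that has no matching alt file. It assumes labels are stored without prefix or extension and that files follow the `{id}-alt-{label}.geojson` naming seen earlier in the thread; the function name is hypothetical.

```python
def missing_alt_files(feature, existing_filenames):
    """Return the wof:geom_alt labels that have no corresponding alt file.

    feature is a parsed primary GeoJSON document; existing_filenames is
    the set of file names found in the record's directory.
    """
    props = feature.get("properties", {})
    wof_id = props["wof:id"]
    existing = set(existing_filenames)
    return [
        label
        for label in props.get("wof:geom_alt", [])
        if f"{wof_id}-alt-{label}.geojson" not in existing
    ]
```

An empty result means every reference resolves; a validation tool could fail a pull request when the list is non-empty.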

thisisaaronland commented 4 years ago

@nvkelso Yes, that looks correct assuming the corresponding filenames are:

iandees commented 4 years ago

The wof:geom_alt property is a list of alternate geometry document names; it doesn't specify which alternate geometries are in those documents (except, as indicated in your comment above, by convention in the filename). As a consumer/editor of the data, should I go download those referred documents, parse them, and read the properties to understand what type of geometry they are?

By "business logic", I'm referring to the example I gave: if the WOF Editor modifies the default geometry for a place, Nathaniel said that we'd want to store that geometry as an alternate of some sort. More generally: when reviewing a pull request for a modification to a WOF place's geometry, what should/would a reviewer be checking for?

thisisaaronland commented 4 years ago

The convention in WOF is that the "source" for an alternate geometry maps to something defined in whosonfirst-sources. For example:

Function and extras have not been formalized in a similar manner, meaning there is loose convention but no validation tools. For example:

https://github.com/whosonfirst/whosonfirst-cookbook/blob/master/how_to/creating_alt_geometries.md

From a tooling perspective, the short answer to your question would be: Yes.

The details, nuance and semantics of a "first class" label (for example, source) are stored externally from the WOF document itself and can/should be handled by the tooling.

If that means creating new tables/code for functions akin to sources and placetypes then I guess it's time to do so. Until today there hasn't been a need.

I don't have the whole context for your conversation with @nvkelso but the approach has been:

I will defer to @stepps00 and @nvkelso for the definition of "substantially changed". Generally we have tried to not alter third-party geometries so if, say, the Natural Earth geometry was the default but is updated because [reasons] then the original NE geom is preserved as an alternate geometry.

iandees commented 4 years ago

"Generally we have tried to not alter third-party geometries so if, say, the Natural Earth geometry was the default but is updated because [reasons] then the original NE geom is preserved as an alternate geometry."

Thanks, this is helpful :+1:

nvkelso commented 4 years ago

The most common case is that we have a default geometry and no alternate geometries, and we need to fix the default geometry. So we take the existing default and move it to an alternate, and note the source for that geometry. Then we set a new default geometry and note that its source is wof (unless we're importing data from a new provider, in which case we'd note the provider's name instead).

"If that means creating new tables/code for functions akin to sources and placetypes then I guess it's time to do so. Until today there hasn't been a need."

I think the need is still in the future, let's just solve for today's problem of dealing with the basic alternate geometries that already exist ;)
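The common case described above (move the existing default geometry to an alternate, then set a new default sourced to wof) might be sketched as follows. The property names `src:geom` and `src:alt_label`, the choice to reuse the old source as the alt label, and the function name are all assumptions made for illustration, not a description of how the WOF Editor does it.

```python
import copy


def replace_default_geometry(feature, new_geometry, new_source="wof"):
    """Return (updated_primary, alt_feature) without mutating the input.

    The old default geometry is preserved unaltered in alt_feature, labelled
    with its original source; the primary record gets the new geometry.
    """
    props = feature["properties"]
    old_source = props["src:geom"]  # assumed property naming the geometry source

    alt_feature = {
        "type": "Feature",
        "properties": {
            "wof:id": props["wof:id"],
            "src:geom": old_source,
            "src:alt_label": old_source,  # assumed: label equals the source
        },
        "geometry": copy.deepcopy(feature["geometry"]),
    }

    updated = copy.deepcopy(feature)
    updated["geometry"] = new_geometry
    updated["properties"]["src:geom"] = new_source
    labels = updated["properties"].setdefault("wof:geom_alt", [])
    if old_source not in labels:
        labels.append(old_source)
    return updated, alt_feature
```

Passing a provider name as `new_source` covers the other case mentioned above, where the new default comes from an imported dataset rather than a manual fix.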

stepps00 commented 4 years ago

Completed through the above linked PRs ^