opengeospatial / geopackage

An asciidoc version of the GeoPackage specification for easier collaboration
https://www.geopackage.org
Other
264 stars 71 forks source link

Views for feature tables need to be clarified #446

Closed jyutzler closed 4 years ago

jyutzler commented 6 years ago

We say we allow views on feature tables, but it isn't entirely clear how that should be done or how it can be tested. The concept of primary keys does not exist for views. We can guess that the first column of a view is the key, but there is nothing in the standard indicating that this must be done.

Proposal: Update R29 to indicate that tables SHALL have a primary key and that views shall have an integer column as its first column that effectively acts as a primary key (all of its values are unique). This will be testable and it should not affect interoperability because the test would only fail if someone used a view and didn't have a PK-like column as its first column.

The same issue applies to tiles (see R54 and attributes (see R119). R119 needs to be updated to have the "consistent with" language used in R29/R54.

jyutzler commented 6 years ago

This relates to #207.

MarcusMapMaker commented 6 years ago

Comment and suggestion from the Australian Bureau of Statistics.

Hi Jeff,

First, thanks for progressing this issue. The creation of views that GIS applications can easily use, will remove the need to duplicate geometries within the GeoPackages we publish. This will hopefully help everyone who has to store lots of attributes against a single set of geometries e.g. most statistical agencies.

The approach of stating that "tables SHALL have a primary key and that views shall have an integer column as its first column that effectively acts as a primary key" is a practical solution that would minimize disruption and we would support this change.

Not trying to confuse the issue, but one alternative idea we had was to add a reference to the "pseudo primary key in the view" with a new field in the gpkg_contents table. This would provide an explicit location to store and check for and find this information. But I admit, this approach would require a change to the schema which may cause disruption.

This is just and idea and we are happy with the less explicit approach.

cheers,

Marcus Assistant Director (Data Management), Geospatial Solutions, Australian Bureau of Statistics

jyutzler commented 6 years ago

@MarcusMapMaker, We'll keep this in mind and discuss it on July 16th. I think if I had it to do over again, I would add a PK column in gpkg_contents. We ended up doing this with the related tables extension. It may very well be too late to add this now. We'll see what the others say.

davidstrategicaci commented 6 years ago

Super late to the party but i dont concur with constraining a view to have a PK. The intent of a SQL view is to provide a filtered or calculated look at the data within the database. If i don't want to see a primary key, i shouldn't have to. The originating data still meets the PK condition in its originating table. This especially becomes cumbersome when performing aggregate functions or Joins because I essentially have to generate a new key for every row of data to meet compliance.

Additionally, As it pertains to R119, I dont believe setting a PRIMARY KEY, AUTO INCREMENT is allowed in a view, because it is only a reflection of existing data. You cant directly INSERT INTO a view.

As far as testing a view, the applicable constraints are

bradh commented 5 years ago

I worry that clients will assume that they can update entries (anything in a feature table listed in gpkg_contents) in the geopackage. If we're going to allow those to be views, producers need to be really clever with triggers to update the underlying table(s).

jyutzler commented 5 years ago

For future reference: https://gis.stackexchange.com/questions/315547/is-a-pk-required-for-geopackage-vector-layers

jyutzler commented 5 years ago

Closed by #449

To summarize what we did here, the point is to have a PK-like column on each table. If you have a view, you can't (by SQLite definition) have a PK so you have to have a column with unique integers.

As for updateable views, we're not going to advertise or advocate for them. You're kind of on your own.

jyutzler commented 4 years ago

Reopening this issue because of feedback we received during the open comment period. The recommendation is to split the requirements for tables and views so that each are adjudicated independently. This is necessary because of their differing rules. I will prepare a pull request.

jyutzler commented 4 years ago

During today's SWG meeting, the SWG passed a motion to merge #534 contingent on concurrence from other participants in the discussion. The pull request separates the requirements for tables and views and eliminates the loophole that was inadvertently introduced earlier.

In terms of procedure, if you have any objections to the proposed approach, please voice them by COB July 20.