opengeospatial / geopackage

An asciidoc version of the GeoPackage specification for easier collaboration
https://www.geopackage.org
Other
264 stars 71 forks source link

1:N relationships by embedding User-Defined mapping tables in attribute tables #461

Closed fjlopez closed 5 years ago

fjlopez commented 6 years ago

This proposal is related with #459. 1:N relationships can be implemented easily if the related table is, at the same time, a mapping table. This proposal reduces the number of joins for one to many relationships and improves the performance.

Given the following relationship:

base_table_name base_column related_table_name related_column relation_name mapping_table_name
sample_feature_table id sample_attributes id simple_attributes sample_attributes

It can be implemented as follows:

CREATE TABLE 'sample_feature_table' (
  id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
  geometry GEOMETRY,
  text_attribute TEXT,
  real_attribute REAL,
  boolean_attribute BOOLEAN)

CREATE TABLE 'sample_attributes' (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  base_id INTEGER NOT NULL,
  related_id INTEGER NOT NULL,
  label TEXT NOT NULL,
  description TEXT NOT NULL,
  arisings INTEGER NOT NULL)

The field sample_attributes.related_id must contain the same value of sample_attributes.id. This can be done with the support of a trigger of by an external program.

However, I think that if we update the draft to add two additional columns to the gpkgext_relations table that explicitly contains the names of the base_id and related_id in the mapping table. The default values of these columns are "base_id" and "related_id" (this has the advantage of allowing database designers to be more expressive in the name of the columns in mapping tables when the design is complex, e.g. "airport_id", "city_id"). And then only check that the mapping_table_name contains columns with such names, we can implement straightforward 1:N relationships.

The relationship should be:

base_table_name base_column related_table_name related_column relation_name mapping_table_name mapping_table_name_base_column mapping_table_name_related_column
sample_feature_table id sample_attributes id simple_attributes sample_attributes base_id id
CREATE TABLE 'sample_feature_table' (
  id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
  geometry GEOMETRY,
  text_attribute TEXT,
  real_attribute REAL,
  boolean_attribute BOOLEAN)

CREATE TABLE 'sample_attributes' (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  base_id INTEGER NOT NULL,
  label TEXT NOT NULL,
  description TEXT NOT NULL,
  arisings INTEGER NOT NULL)
jyutzler commented 6 years ago

This was considered during the Interoperability Experiment, but the group opted not to go that way. See this excerpt from Section 7.3 of the Interoperability Experiment Engineering Report*.

Participants agreed to make the mapping table mandatory as part of this extension. This would create a single code path for the software to follow, joining the base table, the mapping table, and the related data table. While a one-to-many relationship could be handled without this table structure (a foreign key relationship would suffice), this would require a foreign key in the related data table. This would restrict the related data to a single related data relationship.

* This document was approved by publication by OGC as 17-093 a month or so ago, but it has not yet been published on-line.

jyutzler commented 5 years ago

The link for 17-093 is http://docs.opengeospatial.org/per/17-093r1.html.

mcfoi commented 3 months ago

The link for 17-093 is http://docs.opengeospatial.org/per/17-093r1.html.

Dear @jyutzler , after some 6 years, it appears that no clear answer was given to this question. Despite the quoted document clearly reports the outcome of the Experiment, still no clear guidance is available about how to implement One-to-Many relationships.

In GDAL documentation at https://gdal.org/drivers/vector/gpkg.html we can read about Relationships on Geopackage that:

One-to-many relationships will also be reported for tables which utilize FOREIGN KEY constraints.

This does not occur in QGIS 3.36 where constraints created just in SQL are NOT detected as Relationships. A table created with: CREATE TABLE "apartments" ( "id_apartment" INTEGER NOT NULL, "id_building" INTEGER NOT NULL, FOREIGN KEY (id_building) REFERENCES buildings (id_building) ); would not be recognised as linked by a Relationship to the (omitted) "buildings" table.

More, the GeoPackage Related Tables Extension leaves no documentation about how to handle for the One-To-Many use case the fact that 'maping_table_name' is REQUIRED in the 'gpkgext_relations' Table. What should be provided in this field?

Thanks

jyutzler commented 3 months ago

When using the Related Tables Extension, it is valid to create a relationship even when you only have a 1:N relationship. This concept is discussed in Section 7.3 of the Engineering Report. However, there is nothing saying that you must use the extension. In your scenario, the RTE may be overkill and you may find it sufficient to establish a foreign key relationship. There is no right way or wrong way here.

I can't speak for GDAL or QGIS. I am not involved with them in any capacity.

mcfoi commented 3 months ago

Dear @jyutzler, just for final clarification, you are stating that 6 years after Engineering Report publication, the OFFICIAL way to support in One-to-Many (and One-To-One) relationships in GeoPackage,

if adopting Related Tables Extension, is still creating 'pivot' tables containg (at least) 'base_id' and 'related_id' INTEGER columns and recording such 'pivot' table into the 'mapping_table_name' column of the 'gpkgext_relations' Table?

while

if NOT adopting Related Tables Extension, the ONLY other way is to handle GeoPackage as a plain SQLite DB and add FOREIGN KEY clauses in the CREATE TABLE expression of the 'child' Table of a parent-child pair of related tables. (this will go unnoticed in tools like QGIS - as of v. 3.36 - where constraint will just emerge as failures in the form of FK constraint violations while saving edits)

I hope this exchage makes easier for users to understand how to approach relationships in GeoPackage.

jyutzler commented 3 months ago

Dear @jyutzler, just for final clarification, you are stating that 6 years after Engineering Report publication, the OFFICIAL way to support in One-to-Many (and One-To-One) relationships in GeoPackage,

The Standard replaces the Engineering Report.

if adopting Related Tables Extension, is still creating 'pivot' tables containg (at least) 'base_id' and 'related_id' INTEGER columns and recording such 'pivot' table into the 'mapping_table_name' column of the 'gpkgext_relations' Table?

Yes, as described here.

while

if NOT adopting Related Tables Extension, the ONLY other way is to handle GeoPackage as a plain SQLite DB and add FOREIGN KEY clauses in the CREATE TABLE expression of the 'child' Table of a parent-child pair of related tables. (this will go unnoticed in tools like QGIS - as of v. 3.36 - where constraint will just emerge as failures in the form of FK constraint violations while saving edits)

I can add to this that you will want to put your non geospatial data in an Attributes table and join it or create a view from there.