opengeospatial / geopackage

An asciidoc version of the GeoPackage specification for easier collaboration
https://www.geopackage.org

Zipped GeoPackage files and media type #555

Open heidivanparys opened 4 years ago

heidivanparys commented 4 years ago

Is it common to zip GeoPackage files? If yes, should a media type, application/geopackage+sqlite3+zip, be registered for that at IANA?

In that way, an API conforming to OGC API - Features and the (draft) INSPIRE good practice building on OGC API - Features could link to such a zipped GeoPackage file using that media type.

"links": [
  { ... },
  { "href": "https://download.my-org.eu/buildings.zip",
    "rel": "enclosure",
    "type": "application/geopackage+sqlite3+zip",
    "title": "Download the dataset as a GeoPackage (CRS: EPSG:25832)",
    "length": 472546 },
  { ... }
  ],
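On the client side, such a link could be selected by its media type. The following is a minimal sketch, assuming the link objects have the shape shown above; `find_links_by_type` is a hypothetical helper, and `application/geopackage+sqlite3+zip` is the media type proposed in this issue, not a registered one.

```python
import json

# Proposed (not registered) media type discussed in this issue.
ZIPPED_GPKG = "application/geopackage+sqlite3+zip"

# Sample response fragment modelled on the links array above.
RESPONSE = json.loads("""
{
  "links": [
    {"href": "https://download.my-org.eu/buildings.zip",
     "rel": "enclosure",
     "type": "application/geopackage+sqlite3+zip",
     "title": "Download the dataset as a GeoPackage (CRS: EPSG:25832)",
     "length": 472546}
  ]
}
""")


def find_links_by_type(document, media_type, rel="enclosure"):
    """Return the links whose rel and type both match (hypothetical helper)."""
    return [
        link
        for link in document.get("links", [])
        if link.get("rel") == rel and link.get("type") == media_type
    ]


matches = find_links_by_type(RESPONSE, ZIPPED_GPKG)
for link in matches:
    print(link["href"], link["length"])
```

Without a distinct media type, a client would have to guess from the `.zip` extension or the title what the archive contains.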

See also

jerstlouis commented 3 years ago

@heidivanparys Since HTTP already supports Content-Encoding as a mechanism to exchange the data compressed, would there still be enough value in a dedicated application/geopackage+sqlite3+zip media type? The same +zip question kind of applies to all formats that an OGC API might deliver.

Maybe there is still value for very large files, so that they can get saved directly with the .zip extension, and to save the server from having to compress them on the fly in some cases?

jyutzler commented 3 years ago

@heidivanparys any thoughts on @jerstlouis 's comment?

fjlopez commented 3 years ago

I have mixed thoughts on this. It makes sense because lots of +zip media types are registered at IANA but, at the same time, application/vnd.sqlite3+zip is not registered. Is it common to distribute sqlite3 files compressed?

heidivanparys commented 3 years ago

Since HTTP already supports Content-Encoding as a mechanism to exchange the data compressed, would there still be enough value in a dedicated application/geopackage+sqlite3+zip media type? The same +zip question kind of applies to all formats that an OGC API might deliver.

Maybe there is still value for very large files, so that they can get saved directly with the .zip extension, and to save the server from having to compress them on the fly in some cases?

Is it common to distribute sqlite3 files compressed?

@jerstlouis @fjlopez I don't know what is common practice, but I can describe the practice at the agency where I work. One of our distribution channels is the Danish Map Supply. One of the ways you can get data from the Danish Map Supply is by downloading a dataset or a predefined subset of a dataset from the Map Supply's FTP server.

A host of predefined sections of data sets are readily available for download. These are both sections of historical data sets and sections from updated data sets that are updated regularly to reflect the newest available data. E.g. the matricular maps are updated every two months.

The FTP server stores (subsets of) datasets in different formats. I had a look again, and almost all files are zipped. So the shapefiles, GML files, MapInfo files, etc. are compressed and then put on the FTP server, from where users can retrieve those zip files.

Links to those zip files, and information about their media types, are e.g. present in the Atom feeds we have as well, see e.g. https://download.kortforsyningen.dk/sites/default/files/feeds/NamedPlace.xml:

<entry xml:lang="da">
    <title>DK INSPIRE NamedPlace</title>
    <!-- ... -->
    <link
        rel="alternate"
        href="ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz"
        type="application/x-gmz"
        length="109479325"
        title="DK INSPIRE NamedPlace"
        hreflang="da"/>
    <!-- ... -->
    <id>ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz</id>
    <!-- ... -->
  </entry>

(Media type application/x-gmz is described on https://inspire.ec.europa.eu/media-types/application/x-gmz).
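To illustrate how a client relies on the `type` attribute in such a feed, here is a minimal sketch that reads the download links from an Atom entry with the Python standard library. The entry is a trimmed copy of the one above; the Atom namespace declaration is an assumption (the excerpt omits the enclosing `<feed>` element), and `download_links` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Trimmed copy of the entry above, with the Atom namespace declared
# explicitly (assumed; the excerpt does not show the <feed> element).
ENTRY_XML = f"""
<entry xmlns="{ATOM_NS}" xml:lang="da">
  <title>DK INSPIRE NamedPlace</title>
  <link rel="alternate"
        href="ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz"
        type="application/x-gmz"
        length="109479325"
        title="DK INSPIRE NamedPlace"
        hreflang="da"/>
</entry>
"""


def download_links(entry_xml):
    """Return (href, media type, length in bytes) for each link in an entry."""
    entry = ET.fromstring(entry_xml.strip())
    return [
        (link.get("href"), link.get("type"), int(link.get("length", "0")))
        for link in entry.iter(f"{{{ATOM_NS}}}link")
    ]


links = download_links(ENTRY_XML)
for href, media_type, length in links:
    print(media_type, length, href)
```

An FTP link carries no HTTP headers, so the `type` attribute is the only place where the feed can tell the client that the target is a compressed file.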

I have mixed thoughts on this. It makes sense because lots of +zip media types are registered at IANA but, at the same time, application/vnd.sqlite3+zip is not registered. Is it common to distribute sqlite3 files compressed?

I am not convinced that we can conclude that it is not common to distribute sqlite3 files compressed just because application/vnd.sqlite3+zip is not registered. Another explanation could be that nobody cared to register application/vnd.sqlite3+zip because there is no need to comply with a certain specification or best practice.

fjlopez commented 3 years ago

IMHO, the discussion on the distribution of GeoPackage as compressed files and the need to register an IANA media type for such a case should not be mixed:

RFC 6839 may explain why application/vnd.sqlite3+zip has not been registered.

heidivanparys commented 3 years ago

I think that there is no need to register a specific media type because RFC 6839 3.6. The +zip structured syntax suffix defines when and how to use +zip and hence application/geopackage+sqlite3+zip is OK as application/geopackage+sqlite3 is already registered.

Earlier, I made the same assumption. However, in another, similar discussion, on zipped GeoJSON files (the relevant part starting here), @cportele wrote the following in this comment:

[...] In my understanding 6839 states rules for media types with a suffix like "+zip". It does not say a suffix "+zip" may be added to any existing media type. Something like application/geo+json+zip would not be a valid media type. It would still need to be registered with IANA. [...]

fjlopez commented 3 years ago

I agree with you, my assumption was wrong. See this excerpt from RFC 6838, Media Type Specifications and Registration Procedures.

Media types that make use of a named structured syntax SHOULD use the appropriate registered "+suffix" for that structured syntax when they are registered.
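The naming side of this can be sketched in a few lines. The example below assumes that the structured syntax suffix is whatever follows the last "+" in the subtype; `split_media_type` is a hypothetical helper. It only parses a name and says nothing about whether that name is registered with IANA.

```python
def split_media_type(media_type):
    """Split 'type/subtype+suffix' into (type, base subtype, suffix or None).

    Assumption: the suffix is the part after the last '+' in the subtype.
    """
    top_level, _, subtype = media_type.strip().lower().partition("/")
    if not top_level or not subtype:
        raise ValueError(f"not a media type: {media_type!r}")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:  # no '+' in the subtype, so there is no suffix
        return top_level, subtype, None
    return top_level, base, suffix


for name in (
    "application/geopackage+sqlite3",
    "application/geopackage+sqlite3+zip",
    "application/x-gmz",
):
    print(name, "->", split_media_type(name))
```

Under that assumption, `application/geopackage+sqlite3+zip` reads as a ZIP-based type whose base subtype is `geopackage+sqlite3`. Being well-formed is not the same as being valid for use, though: as noted above, the full name would still need its own registration.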

Reviewing the IANA registry of structured syntax suffixes, +gzip is also registered. But it makes sense to register only application/geopackage+sqlite3+zip due to the popularity and availability of the ZIP format.

jyutzler commented 3 years ago

So the consensus is to ask IANA to register application/geopackage+sqlite3+zip? I just want to be sure before I move forward.

jerstlouis commented 3 years ago

Also please keep in mind that even if the media type is application/geopackage+sqlite3, the data can still be transferred compressed by negotiating with Accept-Encoding, and unless the visualization client directly supports zipped GeoPackage, this avoids an extra step / duplication of the data compared to having to extract it.
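The difference between the two approaches can be sketched with the Python standard library. This is an illustration only: the payload is stand-in bytes rather than a real GeoPackage, the member name `buildings.gpkg` is hypothetical, and no HTTP request is made.

```python
import gzip
import io
import zipfile

# Stand-in bytes; a real GeoPackage is an SQLite database file.
gpkg_bytes = b"SQLite format 3\x00" + b"\x00" * 1024

# Option 1: HTTP content coding. The media type stays
# application/geopackage+sqlite3; only the transfer is gzip-compressed.
encoded_body = gzip.compress(gpkg_bytes)
headers = {
    "Content-Type": "application/geopackage+sqlite3",
    "Content-Encoding": "gzip",
}
# HTTP clients typically undo the content coding before handing over the body.
received = gzip.decompress(encoded_body)

# Option 2: the resource itself is a ZIP archive. The client receives a
# container and has to extract a member before it can open the GeoPackage.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("buildings.gpkg", gpkg_bytes)  # hypothetical member name
with zipfile.ZipFile(io.BytesIO(archive.getvalue())) as zf:
    extracted = zf.read(zf.namelist()[0])

print(len(gpkg_bytes), len(encoded_body), len(archive.getvalue()))
```

In the first case the client ends up with the GeoPackage bytes directly. In the second it ends up with an archive, which is the extra extraction step mentioned above, but the server can store and serve the file pre-compressed.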

fjlopez commented 3 years ago

@jerstlouis there are scenarios where having +zip is needed. For example, we can have links to GeoPackages in an Atom file that point to:

jyutzler commented 3 years ago

Assigning to @ogcscotts to contact IANA.