Vote Comment - 1.10 Media Type and File Extension

chris-little commented 2 years ago

Part of the first comment submitted in the Technical Committee vote to commence work: It seems that “The only profile URI defined in CovJSON specification is https://covjson.org/def/core#standalone which asserts that all domain and range objects are directly embedded in a CoverageJSON document and not referenced by URLs”. For voluminous grids, it seems that either CovJSON is inappropriate (a big raster in JSON is probably too big), or a non-standalone profile needs to be added to this draft v0.2. Clarify whether external references to grids in formats other than JSON are within the scope of the specification. If they are not, that would be a strong limitation and should be stated as early as the Scope.

jonblower commented 2 years ago

The purpose of the #standalone profile is to assert that a particular CovJSON document is self-contained (no external references to separate documents). It's an optional profile to be used only in that circumstance. The "non-standalone" profile is simply the default state of CovJSON, and no explicit profile URI is needed in this case.

By the way, there seems to be a recurring claim that it's not possible to exchange large rasters in JSON, but there are several strategies to make this feasible:

Controlling the number of significant figures used
Using on-the-wire compression
Splitting large rasters among several documents (e.g. in a TiledNdArray)
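As a rough illustration of the first two strategies (tiling is not covered here), the sketch below rounds values to a fixed number of significant figures and then gzip-compresses the JSON text. The helper `round_sig` and the sample values are hypothetical and not taken from the specification or from any real dataset; the sizes printed will depend on the data.

```python
import gzip
import json
import math


def round_sig(value, sig=3):
    """Round a float to `sig` significant figures (hypothetical helper)."""
    if value == 0 or not math.isfinite(value):
        return value
    return round(value, sig - 1 - math.floor(math.log10(abs(value))))


# Made-up raster values, for illustration only.
values = [280.123456789 + 0.001 * i for i in range(10_000)]

# JSON text at full precision, at 5 significant figures, and gzip-compressed.
full = json.dumps(values).encode("utf-8")
rounded = json.dumps([round_sig(v, 5) for v in values]).encode("utf-8")
compressed = gzip.compress(rounded)

print(len(full), len(rounded), len(compressed))
```

In practice the compression step would normally be handled by the web server and client through HTTP content encoding rather than in application code.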

Experiments show that fairly large rasters can still be exchanged efficiently in JSON in this way, but we would need to define what is meant by "large". I think we need to push back on vague comments like "a big raster in JSON is probably too big".

It is certainly true that JSON is not seekable, and this is often the real key limitation in using it for raster data, rather than its size per se (because it's hard to slice out subsets of the JSON raster without reading the whole thing in).

It is also certainly true that CovJSON is not intended for use as an archival format for very large datasets - it's an exchange format for the Web.

chris-little commented 2 years ago

Both the Abstract and Scope now indicate that the primary use cases are for transferring relatively small amounts of data to browsers or mobile devices, and that CoverageJSON is not particularly suited to bulk data transfer.

chris-little commented 2 years ago

@jonblower Can you please clarify something that arose in today's meeting: can you confirm for @jerstlouis and @chris-little that the default mode of CoverageJSON is to embed the RangeSet / data values in the JSON object rather than point/link to the data in another place/object/file? Can CoverageJSON support linked data values, as opposed to linked metadata, in other places?

jonblower commented 2 years ago

Strictly there isn't a "default mode" of CoverageJSON, but options are provided:

The simplest and (by far) the most commonly used case is to embed the data values (of all variables) in the Coverage object, so it is completely standalone.
The next-simplest case is to put data values for each variable (parameter) in separate NdArray documents, linked from the Coverage object. This is most useful in a multi-variable dataset, where you might want temperature, humidity, wind speed etc to be recorded in separate files, so the user only has to load the variables that they are interested in.
The most complicated case is to use tiling (TiledNdArray), where the data values are partitioned spatially and temporally, so a single variable's data values would be split among several documents. The simplest example of this would be to encode each timestep in a separate file, but the tiles could also be divided spatially, like a tiled map server.
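The three cases above could be told apart by a client roughly as follows. This is a minimal sketch, not a reference implementation: the member names (`ranges`, `NdArray`, `TiledNdArray`, `tileSets`, `urlTemplate`) are assumed from a reading of the CovJSON specification, the `example.org` URLs and parameter names are made up, and the `domain` and `parameters` members that a real Coverage needs are omitted for brevity.

```python
def classify_range(range_value):
    """Return 'embedded', 'linked' or 'tiled' for one member of "ranges"."""
    if isinstance(range_value, str):
        # Assumed: a string value is a URL pointing at a separate NdArray document.
        return "linked"
    if isinstance(range_value, dict):
        if range_value.get("type") == "NdArray":
            return "embedded"
        if range_value.get("type") == "TiledNdArray":
            return "tiled"
    raise ValueError("unrecognised range value")


# Hypothetical Coverage fragment mixing all three cases.
coverage = {
    "type": "Coverage",
    "ranges": {
        # Case 1: data values embedded directly in the Coverage object.
        "temperature": {
            "type": "NdArray",
            "dataType": "float",
            "axisNames": ["y", "x"],
            "shape": [2, 2],
            "values": [280.1, 280.4, 279.9, 281.0],
        },
        # Case 2: data values in a separate document, linked by URL.
        "humidity": "https://example.org/humidity.covjson",
        # Case 3: data values tiled, here one document per timestep.
        "wind_speed": {
            "type": "TiledNdArray",
            "dataType": "float",
            "axisNames": ["t", "y", "x"],
            "shape": [3, 2, 2],
            "tileSets": [
                {
                    "tileShape": [1, None, None],
                    "urlTemplate": "https://example.org/wind/{t}.covjson",
                }
            ],
        },
    },
}

modes = {name: classify_range(rng) for name, rng in coverage["ranges"].items()}
print(modes)
```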

In the simplest case, a data producer could optionally state that they are using the standalone profile of CovJSON, which gives a cue to clients to not bother looking for links. I could imagine that someone might write a CovJSON client that can only handle standalone documents (because it's much simpler to write it that way).
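A standalone-only client might check for that cue along these lines. This is a sketch under stated assumptions: it supposes that the profile is advertised as a `profile` parameter on the HTTP `Content-Type` header and that the media type is `application/prs.coverage+json`; neither detail is confirmed in this thread. Only the profile URI itself is quoted from the discussion above.

```python
from email.message import Message

# Profile URI quoted earlier in this thread.
STANDALONE = "https://covjson.org/def/core#standalone"


def declares_standalone(content_type):
    """Return True if a Content-Type value lists the standalone profile.

    Assumes the profile is carried in a `profile` media type parameter
    holding one or more space-separated URIs.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    profile = msg.get_param("profile")
    if not isinstance(profile, str):
        # Parameter absent (or encoded in a form this sketch does not handle).
        return False
    return STANDALONE in profile.split()


# Hypothetical header values, for illustration only.
with_profile = (
    'application/prs.coverage+json; '
    'profile="https://covjson.org/def/core#standalone"'
)
without_profile = "application/prs.coverage+json"

print(declares_standalone(with_profile), declares_standalone(without_profile))
```

A client that finds no such declaration cannot assume the document is self-contained and would either need to follow links or reject the document.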

@chris-little @jerstlouis do you think it's worth putting the above information somewhere more prominent in the spec or scope? It's a frequently asked question. Maybe we need a FAQ?

chris-little commented 2 years ago

@jonblower I think it is a good idea to put it in a chapter/clause of the spec. I suggest it goes near the end, after all the detailed definitions. Perhaps a section labelled "Some examples of common usage patterns" or similar?

chris-little commented 2 years ago

I propose to close this issue at the meeting on 2022-04-27, as it has been addressed by the above explanation and the improved text in the specification.

opengeospatial / CoverageJSON

Vote Comment - 1.10 Media Type and File Extension #36