w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Use Case: Negotiation by Coordinate Reference System #311

Closed rob-metalinkage closed 5 years ago

rob-metalinkage commented 6 years ago

If you are submitting a new USE CASE, please use the template below. Otherwise, delete the use case template before submitting your contribution.


Specifying Coordinate Reference System via Profiles and Negotiation

Status: Proposed

Identifier:

Creator: Rob Atkinson

Deliverable(s): ( AP Guidelines, Content Negotiation)

Stakeholders

This Use Case has two possible points of view:

  1. A community of practice (possibly represented by a standards organisation) that wishes to define a negotiable aspect, and its behavior in relation to other aspects, particularly profiles which may constrain many aspects.
  2. Client/server negotiation where the expected behavior must be explicit when multiple aspects are negotiated simultaneously

Problem statement

Data may be distributed in a way that supports options on multiple aspects - HTTP provides for language and encoding (Content-type). A mechanism for defining a more general interoperability specification, known as a "profile", is being considered. Other communities of practice may need to define other aspects, and a case in point is an active discussion about negotiation of Coordinate Reference System (CRS) (i.e. map projection) for spatial data. This is an example of an orthogonal concern that may be constrained by profiles.

GeoJSON, for example, is a specification that defines a specific CRS. Other usages typically enforce that a given CRS is available as an option, or may specify the default CRS if the client and server do not negotiate another.

As an example of interchange of spatial data on the Web, the MapML specification under development provides a spatial representation of metadata about data to be included in a typically cartographic map. A need has been identified to allow clients to access the same content using different "Coordinate Reference Systems (CRS)".

Generally speaking, the software provisioning a dataset will determine a range of transformational capabilities independent of the dataset (and any interoperability profile the data itself conforms to) - encoding, CRS, precision, and possibly even language. Thus dimensions of the data may be constrained by profiles, but additional options left to service based distributions.

This Use Case identifies key actors - a "standards body" developing a definition of a data aspect, such as CRS, and the mechanisms to identify it, and the client and server performing negotiation in the case where profiles may constrain specific negotiable aspects of the data.

It is assumed therefore that a profile negotiation mechanism specifies behaviour when profiles and other negotiated aspects are used simultaneously, and this behaviour is understood and consistent with the specification of behaviour for negotiation of each defined data aspect.

For example, if a client overrides aspects explicitly declared in a profile, it is responsible to ensure that any data is not declared to conform to that profile.

Existing approaches

Links

https://github.com/w3c/sdw/issues/1058

Requirements

  1. Profile negotiation mechanisms must support or safely co-exist with negotiation over other dimensions of data organisation.
  2. Where conformance to a profile is declared, and that profile specifies constraints over aspects of a data distribution, data distributions, including services, must provide support for those constraints as a minimum set of negotiable options.
  3. Where conformance to a profile is declared, and that profile specifies constraints over aspects of a data distribution, then any negotiable aspects of the distribution that are not specified must conform to the profile constraints.
  4. Where negotiation options are provided by a service and specified by a client these take precedence over constraints in the default profile for these aspects, however all other constraints in the profile must be met.

Related use cases

Comments

Raised by @larsgsvensson in Profile Negotiation sub-group - and we felt that this requirement was not captured and a Use Case was required.


dvh commented 6 years ago

Is it already decided that CRS negotiation should be done using profiles? Personally, I think the idea of profiles is way too flexible and thus complex for the average use case where I just want to indicate my preferred CRS. IMHO, CRS is comparable with 'language'.

rob-metalinkage commented 6 years ago

No, the requirement is that negotiation by CRS is orthogonal and can coexist with negotiation by profile.

On Fri, 3 Aug 2018, 17:46 Dimitri van Hees notifications@github.com wrote:

Is it already decided that CRS negotiation should be done using profiles? Personally, I think the idea of profiles is way too flexible and thus complex for the average use case where I just want to indicate my preferred CRS. IMHO, CRS is comparable with 'language'.


agreiner commented 6 years ago

Isn't everything handled by content negotiation orthogonal by definition? There is no capability for reasoning about combinations of q values.

dr-shorthair commented 6 years ago

@dvh I agree that "CRS is comparable with 'language'". And while CRS might be managed as just another (orthogonal) facet for conneg, it is more esoteric than the ones already on the table.

Adding CRS to the conneg params is more in scope for the SDWIG than DXWG?

@rob-metalinkage I'd also suggest not muddying the waters too much with spatial type (another SDWIG issue) or general parameters like temporal position.

lvdbrink commented 6 years ago

Adding CRS to the conneg params is more in scope for the SDWIG than DXWG?

Well, we at the SDWIG have been waiting to see if this will be solved by content negotiation by profile.

azaroth42 commented 6 years ago

the requirement is that negotiation by Crs is orthogonal and can coexist with negotiation by profile

By this, it seems that you are saying that coordinate reference systems are NOT amenable to being specified by a profile, if there is a requirement that they are orthogonal to profile?

And conversely, if the CRS can be described in a profile, you have two ways to negotiate for CRS -- whatever the orthogonal method is, and by profile.

rob-metalinkage commented 6 years ago

@azaroth42 - I think profiles should be able to specify CRS - it's typically a constraint on data values so it's a simple case (compared to, for example, profiles defining service level agreements or other dynamic behaviour).

@lvdbrink is, I think, reiterating the point I was trying to make, in that we shouldn't expect or require every domain to make up its own approach to negotiation over variable aspects of data distribution.

SDWIG should concern itself with the governance of identifiers, but not the nature of the mechanism.

Then we end up in the "hard- vs soft-typed" ( or specialised vs qualified forms of predicates) decision.

Given HTTP already has named headers for different dimensions of data variability (encoding, language), perhaps we are looking to establish best practice:

maybe prefix any dimension as Accept-X: value in a request, and respond with X: value,

and have a register of X, with the owners of X responsible for defining how X is governed - any URI or a register of allowed values (like IANA mime types).
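
A minimal sketch of that Accept-X / X pattern, to make the suggestion concrete. Everything named here is an assumption for illustration: the `REGISTER` table, the `Crs` and `Precision` dimensions, their allowed values and the `negotiate` function are hypothetical and not taken from any specification.

```python
# Hypothetical register of negotiable dimensions and their allowed values.
# In practice the owner of each dimension X would govern this list.
REGISTER = {
    "Crs": {"EPSG:4326", "EPSG:28992"},
    "Precision": {"3", "6"},
}


def negotiate(request_headers, defaults):
    """For each registered dimension X, honour Accept-X if its value is
    allowed; otherwise fall back to the server default for X."""
    response_headers = {}
    for dimension, allowed in REGISTER.items():
        wanted = request_headers.get(f"Accept-{dimension}")
        if wanted in allowed:
            response_headers[dimension] = wanted
        else:
            response_headers[dimension] = defaults[dimension]
    return response_headers
```

Calling `negotiate({"Accept-Crs": "EPSG:28992"}, {"Crs": "EPSG:4326", "Precision": "6"})` would answer with the requested CRS and the default precision. Each dimension is looked up on its own, so the mechanism stays the same however many X are registered.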

azaroth42 commented 6 years ago

Then you can negotiate for CRS by profile, and the use case doesn't necessitate the requirement.

lvdbrink commented 6 years ago

@lvdbrink is, I think, reiterating the point I was trying to make, in that we shouldn't expect or require every domain to make up its own approach to negotiation over variable aspects of data distribution.

Yes, exactly.

rob-metalinkage commented 6 years ago

@azaroth42 We could nail everything down to very specific profiles - but in general it would be better to profile the content - what it contains - and negotiate the most convenient representation - encoding, language, CRS, version, precision etc - otherwise there is a combinatorial explosion of things that are implementation specific choices, and perhaps transformable, whereas the profile should really be about the community of practice's intent to interoperate - and hence not necessarily transformable.

This fits best with existing practice where we already have negotiation over some aspects - what we are doing is adding either another aspect, or an extension point. (This Use Case is intended to register the requirement that other aspects are considered - we might still end up choosing a very specific non-extensible mechanism, but best not to do it out of ignorance of such use cases :-)

kcoyle commented 6 years ago

This seems to be an example of a standard that the data adheres to. Not sure why this is singled out; I can think of other standards that folks might want to select on, from language (of the instance data), coding rules (cataloging rules in library speak), various statistical measures, and a whole host of related geographic coordinate measures. Maybe a generalizable way to indicate key standards?

dr-shorthair commented 6 years ago

@kcoyle - CRS is a key (though subtle) aspect of almost all spatial data ... which is most data.

The string of coordinates is often 99% of the message in spatial data. The choice of CRS does not change the meaning or structure of this, only the actual numbers. So it is very similar to encoding/serialization or lang. So if we agree with the proposition that 'lang' is an appropriate HTTP conneg facet, then it is hard to argue that CRS is not.

But wouldn't this require another RFC?

dvh commented 6 years ago

If I understand it correctly, the whole Profile idea is to wrap all kinds of conneg things into one 'profile', correct? The problem I see comes from a developer point of view, where this can become some sort of wildcard to support anything. I think this makes it very difficult for clients to identify which options are actually supported and for servers to make clear which options are going to be implemented and to validate all these options.

As an API designer, the reason I want to support multiple options (with their own names) is to clarify documentation, automate generation of client and server libraries, etc. For example, two different ways to retrieve active bugs from a collection of issues, sorted by title using an API could be modelled like this:

  1. /issues?type=bug&status=active&orderBy=title
  2. /issues?query="WHERE type='bug' AND status='active' ORDER BY 'title'"

The latter option would work for very complex and flexible systems where you can define your own properties etc. However, it becomes almost impossible to use generic tooling to describe all possible options other than "the API supports a query parameter and you can put some aspect of some query language in it".

The first option, however, is very easy to document, understand, validate and implement for both clients and servers. In my opinion the same applies to content negotiation: separate request and response headers look like the first approach, where wrapping all non-existing conneg headers in one single Profile header looks more like the second approach.
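
To illustrate why the first approach is easier to validate, here is a rough sketch of a generic check over named parameters. The `SCHEMA` table and its allowed values are assumptions invented for the /issues example, not part of any real API.

```python
# Hypothetical schema for the /issues example: parameter name -> allowed values.
SCHEMA = {
    "type": {"bug", "feature"},
    "status": {"active", "closed"},
    "orderBy": {"title", "created"},
}


def validate(params):
    """Return a list of errors for unknown names or disallowed values."""
    errors = []
    for name, value in params.items():
        if name not in SCHEMA:
            errors.append(f"unknown parameter: {name}")
        elif value not in SCHEMA[name]:
            errors.append(f"invalid value for {name}: {value}")
    return errors
```

With named parameters the same few lines can document and validate every option. A single `query` parameter would need a parser for whatever query language it carries before anything could be checked.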

That said, besides CRS there might be more spatial options influencing the response/request payloads, like the precision of the coordinates. This might be a reason to introduce a more generic Accept-Geo or Profile-Geo header instead of using Accept-Crs, but moving it all to the Profile header could mean there will be countless options possible using Profile.

By the way, the reason I've opened the discussion again is because we'd like to adhere to global standards instead of creating our own headers. However, at the moment I'm still not convinced by solving CRS negotiation using Profile. Just like Accept-Language and Content-Language, using Accept-Crs and Content-Crs seems to be the most practical solution for us at the moment (I believe the X- prefix isn't required anymore, but I don't know the RFC by heart ;-)).

prushforth commented 6 years ago

@dr-shorthair

The choice of CRS does not change the meaning or structure of this, only the actual numbers. So it is very similar to encoding/serialization or lang.

Strongly agree here. What is missing from CRS (IMHO) is scale, which is equally important in interpreting/using the information in the message. Which brings up the binding together of scale and CRS in the TCRS concept of MapML.

MapML takes the approach of defining its own domain of values for "projection"/ TCRS. The register of values and their definition is in the specification, so it's self-contained / descriptive.

With language, there exists an accepted standard for the domain of values, which is not the case for CRS.

I've never seen someone inventing their own language code, but I'm pretty sure this is a regular occurrence for CRS.

kcoyle commented 6 years ago

@dr-shorthair Actually, I don't think we are in agreement that language is something we would select on. I was using that as an example of something that would be key for some people, as CRS is for spatial data. There is potentially a wide range of standards that folks may want to retrieve on, which makes this a difficult problem. Also, I don't agree that "spatial data ... which is most data". I think you are speaking from within your own environment, for which that is the case. In mine, there is no spatial data but there are many datasets.

larsgsvensson commented 6 years ago

@agreiner scripsit:

There is no capability for reasoning about combinations of q values.

Can you please expand a bit on this? I don't quite follow what you mean by "reasoning about combinations of q values".

larsgsvensson commented 6 years ago

@prushforth scripsit

MapML takes the approach of defining its own domain of values for "projection"/ TCRS. The register of values and their definition is in the specification, so it's self-contained / descriptive.

Despite having spent two years on the SDW WG, I'm obviously not yet familiar with all geospatial lingo... What exactly is a TCRS? I guess it's some kind of Coordinate Reference System, but I can't make out the T.

The register of values and their definition is in the specification, so it's self-contained / descriptive.

Who manages that registry and what is the process of adding new values?

With language, there exists an accepted standard for the domain of values, which is not the case for CRS.

I've never seen someone inventing their own language code, but I'm pretty sure this is a regular occurrence for CRS.

As with all conneg related things, there are registries of allowed/accepted values (managed by IANA) and there are processes in place to add new values (e. g. http headers or language tags) to them. Is there a canonical registry for CRSs?

larsgsvensson commented 6 years ago

And finally a general comment: Yes, it might be that geospatial data has too many dimensions so that it's not possible to put all of it into a single profile, since a profile can only cater for a single value in each dimension (e. g. crs=CRS84 and scale=1:100,000 or crs=CRS84 and scale=1:50,000) -- CRS and scale being orthogonal -- whereas it might be more convenient to have different Accept-headers for those two dimensions. The main thinking (at least my main thinking) behind profiles is that we need something to convey additional constraints and semantics beyond what media types already offer and that profiles are orthogonal to media types. Another idea is that it's possible to create profiles that are unions of other profiles and that might be an approach to the geospatial use cases. If we say that urn:example:1 is a profile that says "Coordinates are in CRS84", urn:example:2 is another profile that says "scale 1:50,000" and urn:example:3 says "scale 1:100,000", then we could create urn:example:4 as the combination of urn:example:1 and urn:example:2 (i. e. CRS84 and 1:50,000). The use of such a mechanism (composite profiles) is heavily discussed (#212, #216 and #217 come to mind). Comments on the usefulness of such an approach are most welcome.
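
As a rough sketch of the composite-profile idea, assuming each profile can be reduced to a mapping from dimension to a single value: the `PROFILES` table and the `combine` function are hypothetical, and only the urn:example identifiers and their meanings come from the comment above.

```python
# Hypothetical representation: each profile fixes one value per dimension.
PROFILES = {
    "urn:example:1": {"crs": "CRS84"},
    "urn:example:2": {"scale": "1:50,000"},
    "urn:example:3": {"scale": "1:100,000"},
}


def combine(*uris):
    """Union the constraints of several profiles; refuse to combine
    profiles that fix different values for the same dimension."""
    combined = {}
    for uri in uris:
        for dimension, value in PROFILES[uri].items():
            if combined.get(dimension, value) != value:
                raise ValueError(f"conflicting values for {dimension}")
            combined[dimension] = value
    return combined
```

Here `combine("urn:example:1", "urn:example:2")` would describe urn:example:4, while combining urn:example:2 with urn:example:3 is rejected because both fix the scale. The sketch also shows the cost: every useful pairing of CRS and scale needs its own composite identifier.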

prushforth commented 6 years ago

@larsgsvensson

What exactly is a TCRS?

It's a "Tiled Coordinate Reference System", a term created for MapML.

Who manages that registry and what is the process of adding new values?

When the specification is more mature, we'll register it as a media type with IANA. Updates will hopefully happen as more values are defined.

Is there a canonical registry for CRSs?

The closest thing to this that I know of is the EPSG Registry, but it doesn't contain all values, notably CRS84 is not in there. You can get a new value created / old value maintained by contacting the registry maintainer (OGP).

azaroth42 commented 6 years ago

I think that the discussion has veered into weeds that are not in our garden.

The use case is, as far as I understand, that there are other methods of negotiating which standards are to be used in fulfilling the request, and the requirement is that profile negotiation must coexist safely with these other methods. Whether it is a coordinate reference system, or language, or format is flavor to the use case but completely irrelevant to the requirement.

The question, in my mind, boils down to: If the profile asserts X and another negotiation dimension asserts NOT X, then which of the two wins? This discussion has come up before in #261.

rob-metalinkage commented 6 years ago

Other people's nearby weeds do tend to invade however :-)

IMHO This discussion has been very useful - I have updated the Use Case to include discussion of some of the points raised, and explicitly teased out requirements that I think we do not otherwise have well enough expressed. You may disagree with these .. they constitute a workable interpretation of behaviour of current OGC services.

(I also think of a Use Case around crowd-sourced data such as OpenStreetMap - anyone can add a language translation to a label at any time - I don't think it should be necessary to update a DCAT record for the dataset for every single change (potentially millions a day) - but it is perfectly reasonable to state a profile constraint that for a given project a label in the official languages of the country is provided.)

"Generally speaking, the software provisioning a dataset will determine a range of transformational capabilities independent of the dataset (and any interoperability profile the data itself conforms to) - encoding, CRS, precision, and possibly even language. Thus dimensions of the data may be constrained by profiles, but additional options left to service based distributions.

If a client overrides aspects explicitly declared in a profile, it is responsible to ensure that any data is not declared to conform to that profile."

Requirements now include:

dr-shorthair commented 6 years ago

@rob-metalinkage I strongly suggest removing the reference to aspects other than CRS from this UC - i.e. delete

Others in the spatial domain would include geometry type (point, bounding box, outline polygon, complex polygon with islands and holes) and spatial precision. Temporal dimensions are also relevant - last known position, version no etc.

These are separate concerns and only muddy the waters on what can be a clean issue.

dr-shorthair commented 6 years ago

@larsgsvensson wrote

geospatial data has too many dimensions so that it's not possible to put all of it into a single profile, since a profile can only cater for a single value in each dimension (e. g. crs=CRS84 and scale=1:100,000 or crs=CRS84 and scale=1:50,000) -- CRS and scale being orthogonal -- whereas it might be more convenient to have different Accept-headers for those two dimensions.

Yes - I think this is the case. CRS in particular is in principle independent of all other representation concerns, so is highly suitable for HTTP conneg.

However, there are representation standards ('profiles') that constrain the CRS - for example, GeoJSON specifies that the only acceptable CRS is WGS84 lon-lat. So it is necessary to specify what to expect when the value for Accept-CRS is different to the value specified within a standard indicated in Accept-Profile or dct:conformsTo.

rob-metalinkage commented 6 years ago

@dr-shorthair final statement above is a good succinct explanation - I have edited the Use Case to make this aspect more explicit.

This is our concern because profiles may specify constraints over multiple negotiable aspects of data - we should define a behavior that any aspect may adhere to, and hence co-exist with our own profile negotiation mechanism. We don't need to worry too much about the specific aspect, it's more a requirement to allow for this in general - CRS just happens to be one where a standards organisation is currently seeking guidance!

larsgsvensson commented 6 years ago

@dr-shorthair scripsit:

So it is necessary to specify what to expect when the value for Accept-CRS is different to the value specified within a standard indicated in Accept-Profile or dct:conformsTo.

Yes, there has been some general discussion on preference order of accept-headers. The Memento spec for instance says that time-based negotiation has to be done before any other negotiation. The Apache server specifies that content-type goes first, then language then charset then encoding. There has been some discussion on the preference of {{Accept-Profile}} over the {{profile}} parameter in the {{Accept}} header over in #261 (but no conclusion so far).

jabhay commented 6 years ago

+1 to handling negotiation by profile and negotiation by CRS as separate concerns.

Where did we get to in terms of whether negotiation by CRS should be progressed by SDWIG or DXWG? @rob-metalinkage: I take it your suggestion is DXWG given you raised this issue here? How do others feel?

@dr-shorthair makes a solid suggestion to consider an RFC to add negotiation by CRS, perhaps starting from @RubenVerborgh and @larsgsvensson's profile negotiation RFC. It probably doesn't matter which group an RFC came from, but it may be beneficial for both to be involved in its drafting. I'd be more than happy to work on this with someone from DXWG if it meant progressing the work.

Thoughts?

rob-metalinkage commented 6 years ago

I think the CRS aspect should be addressed in the OGC domain - but the profile negotiation work in DXWG needs to recognise such things may exist and also be specified in profiles - so needs to have a suitable mechanism defined.

agreiner commented 6 years ago

Sorry this is so late after the question was asked (I've been on vacation), but @larsgsvensson was wondering what I meant when I said that there is no capability for reasoning about combinations of q values. What I mean to point out is that every dimension that is negotiated in conneg is treated separately, so it is impossible to make requests along the lines of "I prefer data that conforms to DCAT-AP.DE if the language is German but I prefer data that conforms to DCAT-AP if the language is Italian" or "I want this CRS at the 1:100 scale and a different CRS at the 1:50000 scale." The more things that get moved into conneg, the more requests are limited to considering all those dimensions independently. Browsing becomes even more complex when you consider that certain bits of data may be entirely missing because of a profile choice. A user might prefer data in a certain profile generally but prefer a different profile if taking their usual first choice means that certain fields that are key for a particular request are missing.
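
A small sketch of what "treated separately" means in practice. The parser is deliberately simplified (no wildcards, no full HTTP grammar), and the function names and profile tokens are made up for illustration.

```python
def parse_q(header):
    """Parse 'a;q=0.8, b' into [(value, q)] sorted by descending q.
    Simplified: a missing q defaults to 1.0, wildcards are not handled."""
    items = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                q = float(field[2:])
        items.append((fields[0], q))
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose(header, available):
    """Pick the highest-ranked value the server can provide, else None."""
    for value, q in parse_q(header):
        if q > 0 and value in available:
            return value
    return None


# Each dimension is resolved on its own; neither call sees the other's result.
language = choose("de;q=0.9, it;q=0.8", {"de", "it"})
profile = choose("dcat-ap;q=1.0, dcat-ap-de;q=0.5", {"dcat-ap", "dcat-ap-de"})
```

Nothing in either header can say "dcat-ap-de only if the language turns out to be German": the q values rank options within one dimension and cannot refer across dimensions.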

rob-metalinkage commented 6 years ago

This was discussed in the conneg sub group today - and will be put as a recommendation to the next plenary - with any improvements in wording developed by consensus here in the interim.

https://www.w3.org/2018/08/15-dxwgcneg-minutes.html

@jabhay has indicated that there is an interest in taking the actual CRS negotiation forward as an OGC activity, so this Use Case has immediate relevance and the stakeholders can help review each other's approaches for compatibility.

dr-shorthair commented 6 years ago

@agreiner you make very valid points, and your examples are very plausible. Some dimensions are always orthogonal, some dimensions are used in combination for some applications. There is no single correct answer and thus no single mechanism. And the dimension may be specified using

  1. HTTP conneg
  2. key-value pairs in a URI query string
  3. within a standard or profile that specifies many dimensions and other aspects
  4. ... other ways ...

There are solid precedents for all of these. I think the purpose of this thread is to recognise that CRS is often, though not always, an independent query dimension and thus might be implemented using HTTP conneg.

lvdbrink commented 6 years ago

@jabhay it would also be worth taking the Dutch Kadaster's work into account. They have a working implementation of CRS conneg. @dvh can tell you all about this! It would be great if we can write up an RFC for CRS conneg and I would be happy to help bring this forward.

jabhay commented 6 years ago

@lvdbrink: will do.

azaroth42 commented 6 years ago

Completing action from CNEG TF to wordsmith requirements:

  1. Profile negotiation mechanisms must safely co-exist with negotiation over other dimensions.
  2. If conformance to a profile is declared, then negotiation for that profile must result in a representation that conforms to the profile's constraints, unless further negotiation dimensions require otherwise.
  3. Merged into 2 and 4.
  4. If conformance to a profile is declared and a request is made with a set of content negotiation dimensions including that profile, then where there are conflicts between more specific dimensions and less specific dimensions (such as the profile), the more specific dimensions take precedence. (All other constraints must still be met, to meet the requirements of 2)
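
One possible reading of requirements 2 and 4, sketched under the assumption that an explicitly negotiated dimension (for example a CRS header) counts as "more specific" than the profile. The function and the dictionary representation are hypothetical; which dimensions are more specific is not settled in this thread.

```python
def resolve(profile_constraints, explicit_dimensions):
    """Start from the profile's constraints, then let explicitly negotiated
    dimensions win on conflict. Returns the effective constraints and the
    list of dimensions where the profile was overridden."""
    effective = dict(profile_constraints)
    overridden = []
    for dimension, value in explicit_dimensions.items():
        if dimension in effective and effective[dimension] != value:
            overridden.append(dimension)
        effective[dimension] = value
    return effective, overridden
```

A non-empty `overridden` list marks the situation raised in the Use Case text: the response no longer matches the profile as declared, so it should not be labelled as conforming to it.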

larsgsvensson commented 6 years ago

Very nice, @azaroth42. Would you say that the definition of which dimensions are "more specific" and "less specific" is for the implementer to decide or something we should specify in the RFC or in the CNEG guidance document?

nicholascar commented 6 years ago

Shouldn't the RFC doc do this to ensure that any users of the HTTP mechanics that don't encounter the W3C work know what to do?

Of course I think we can go much broader in the W3C doc than the RFC (hence my broadening updates to https://w3c.github.io/dxwg/conneg-by-ap/) but the RFC should definitely head off any potential HTTP conflicts we can foresee.

larsgsvensson commented 6 years ago

@nicholascar scripsit

Shouldn't the RFC doc do this to ensure that any users of the HTTP mechanics that don't encounter the W3C work know what to do?

OK, so this would be similar to the Memento case where the spec says that time-based negotiation goes before any other conneg.

nicholascar commented 5 years ago

Marked as due for closing since proponent accepts UC's remaining point is now represented in Issue https://github.com/w3c/dxwg/issues/893.