asascience-open / ncSOS

SOS Service on top of THREDDS
MIT License

Addresses cases where 'parentNetwork' value is not an RA acronym in station's DS response #158

Open kknee opened 9 years ago

kknee commented 9 years ago

comments from @cheryldmorse

"I’ve updated the code so that the ‘parentNetwork’ is not empty for the network’s DS response (failure #1) as long as the ‘institution’ attribute exists in the NetCDF file. But ncSOS will still fail this requirement because the ‘parentNetwork’ may be set to a value that is not an RA acronym. In the NetCDF files that Alex used for his test cases the ‘institution’ attribute was set to ‘WeatherFlow Inc.’, which means that the ‘parentNetwork’ was also set to ‘WeatherFlow Inc.’ and ncSOS failed the test case. How should we proceed when the ‘institution’ attribute is missing or set to a value that is not an RA acronym?"

and from @abirger: "My thought is that we should minimize the changes for now so that we can finally complete Milestone 1.0, even with some non-critical flaws. Therefore, I believe that it would be quite satisfactory for now if ncSOS used the value of the netCDF ‘institution’ attribute for the ‘parentNetwork’, and ensured that the ‘parentNetwork’ field is not empty when the attribute is missing. We should definitely resume discussion about this and other issues later within the Milestone 2.0 framework."
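
For illustration only, the interim behaviour described above (take 'parentNetwork' from the 'institution' attribute, and never leave it empty) could be sketched roughly as follows. This is Python rather than the actual ncSOS Java code, and both the function name `parent_network` and the `"Unknown"` placeholder are assumptions; the thread does not say which value ncSOS emits when the attribute is missing.

```python
def parent_network(global_attrs, placeholder="Unknown"):
    """Return a non-empty parentNetwork value from netCDF global attributes.

    `placeholder` is a hypothetical fallback; the thread does not state
    which value ncSOS uses when 'institution' is missing.
    """
    institution = global_attrs.get("institution")
    # Treat a missing, non-string, or whitespace-only attribute as absent.
    if isinstance(institution, str) and institution.strip():
        return institution.strip()
    return placeholder


print(parent_network({"institution": "WeatherFlow Inc."}))
print(parent_network({}))
```

Note that this only guarantees a non-empty value. It does nothing to make the value an RA acronym, which is the open question in this issue.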

kknee commented 7 years ago

@abirger, @cheryldmorse and I are defining the next milestone. Do we need to revisit this issue? The crosswalk indicates that institution should be used to define parentNetwork. What happens if the institution is not an RA acronym?

abirger commented 7 years ago

@kknee, @cheryldmorse, formally, ncSOS will fail the describeSensor:IOOS-SOS.DescribeSensor-ResponseContainsValidOperationsMetadataProperty.6 test, which follows the IOOS Convention requirement for parentNetwork, if the parentNetwork does not contain an RA acronym and the IOOS codeSpace. On the other hand, the 'institution' attribute designates "the institution of the person or group that collected the data", which is not necessarily an RA. From the IOOS perspective, RAs fit better into the 'publisher' category. However, since the 'institution' attribute had been chosen as the source for the parentNetwork from the very beginning, I just assumed that de facto it always (or at least overwhelmingly) refers to an RA. Is that correct?

cheryldmorse commented 6 years ago

@mwengren, @abirger - Any thoughts on how we should proceed with this issue?

mwengren commented 6 years ago

Here's my logic from trying to step through the documentation to answer this question...

It seems to me that from the SOS SensorML guidelines, we require SensorML responses to return the following <sml:classifier> elements:

Of these, the relevant classifiers to this issue are 'parentNetwork', 'publisher', and 'sponsor' as each of these are of the type 'organization' in the MMISW codespace http://mmisw.org/ont/ioos/organization.

From the IOOS vocabulary definitions:

Aside: why do we not require an 'operator' classifier in the SensorML as well (see SensorML guidelines)? Instead, 'operator' is a required <sml:contact> element, mapped from the 'creator_name' netCDF attribute according to the netCDF/SOS crosswalk. It seems it also belongs as another classifier. Maybe it was overlooked? That is a separate issue and a bit beside the point here; I just wanted to mention it.

Unless I misunderstand, each of the 'parentNetwork', 'publisher' and 'sponsor' is required to be one of the values of 'organization' as defined here: http://mmisw.org/ont/ioos/organization in order to pass the conformance tests. So 'sponsors' can only be among the valid organizations in this list? Seems a bit restrictive.
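
To make that reading of the requirement concrete, here is a minimal sketch of such a check. It assumes the conformance test does an exact match of each organization-typed classifier against the vocabulary, which I have not verified. The function name is hypothetical, and the `allowed` set below is illustrative only, not the actual contents of http://mmisw.org/ont/ioos/organization.

```python
def invalid_org_classifiers(classifiers, allowed_orgs):
    """Return the organization-typed classifiers whose value is not allowed.

    `classifiers` maps classifier name -> value taken from a SensorML
    response; `allowed_orgs` is the caller-supplied set of vocabulary terms.
    A classifier that is missing altogether is reported with value None.
    """
    org_typed = ("parentNetwork", "publisher", "sponsor")
    return {
        name: classifiers.get(name)
        for name in org_typed
        if classifiers.get(name) not in allowed_orgs
    }


# Illustrative values only; NOT the real contents of the vocabulary.
allowed = {"NANOOS", "SECOORA"}
response = {
    "parentNetwork": "WeatherFlow Inc.",
    "publisher": "SECOORA",
    "sponsor": "NANOOS",
}
print(invalid_org_classifiers(response, allowed))
```

With these inputs only 'parentNetwork' is flagged, which mirrors the failing test case from the original comment.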

I see a few problems:

  1. the http://mmisw.org/ont/ioos/organization list is too restrictive because it limits both the parentNetwork and the sponsor (sponsor being the bigger issue, since expected values for that attribute might be broader: there must be sponsors of IOOS assets out there other than these organizations). If we expanded that list to include additional organizations, this would solve the issue here with the 'institution' value not being valid once mapped to the 'parentNetwork' classifier, correct?

  2. the definitions of 'institution' and 'creator_name' are a bit ambiguous in ACDD:

    • institution: 'The name of the institution principally responsible for originating this data.'
    • creator_name: 'The name of the person (or other creator type specified by the creator_type attribute) principally responsible for creating this data.'

What is the difference??

Should something other than 'institution' be mapped to 'parentNetwork'? How about 'publisher_name', which currently maps to the <sml:classifier> with @name='publisher'?

I'm sure this was debated before, but in my thinking 'parentNetwork' is probably more often going to be the data publisher than the 'institution responsible for originating' the data. It should really be an IOOS-specific netCDF Profile attribute (e.g. 'parentNetwork'), but I don't think changing the required attribution is a path we want to go down at this point.

cc: @dpsnowden @kbailey-noaa @emiliom

</end rant>

abirger commented 6 years ago

@mwengren, it’s too late today to start digging, but as far as I remember, the “required” elements are (1) the elements that are defined as “globally mandatory” by the relevant OGC schemata, plus (2) the “globally optional” elements that IOOS stakeholders considered worthy of being “community mandatory” for IOOS. There is nothing sacred in the second part, just the stakeholders’ ideas at the moment when the IOOS SOS v1.0 was discussed. We can review and modify the second part anytime as SMEs think fit.

mwengren commented 6 years ago

As a followup, I discovered today that ncISO has the following mappings in ISO:

| netCDF attribute name | ISO element |
| --- | --- |
| @creator_name | CI_ResponsibleParty (role='originator \| pointOfContact') > gmd:individualName |
| @creator_url | CI_ResponsibleParty (role='originator \| pointOfContact') > gmd:URL |
| @creator_email | CI_ResponsibleParty (role='originator \| pointOfContact') > gmd:electronicMailAddress |
| @institution | CI_ResponsibleParty (role='originator \| pointOfContact') > gmd:organizationName |

This means that in ncISO @creator_name is assumed to be a person, who by definition is assumed to belong to @institution.

This differs from our netCDF to SOS mappings table, where @institution maps to the parentNetwork classifier, and the various @creator_* attributes map to the <sml:ContactList role='operator'> metadata element.

So we will always be inconsistent in our SensorML metadata and the ISO XML produced by ncISO for netCDF files that use ncSOS, unless either of these change.

If however operator and parentNetwork are synonymous and interchangeable, this wouldn't be an issue. We have different definitions for each though:

Another, smaller problem is that all of the ISO metadata from THREDDS/ncISO in our Catalog has a <gmd:individualName> element containing RA names rather than people's names. Here's an example.

We may want to consider making use of the @creator_type attribute in ACDD (and also trying to get an update made for ncISO to account for it, which I don't believe it does currently).

creator_type: Specifies type of creator with one of the following: 'person', 'group', 'institution', or 'position'. If this attribute is not specified, the creator is assumed to be a person.
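
As a rough sketch of what "making use of @creator_type" could mean for a consumer of these files, the following applies the ACDD default quoted above. The function name and the decision to raise on unrecognized values are assumptions for illustration; neither ncSOS nor ncISO is known to behave this way.

```python
CREATOR_TYPES = ("person", "group", "institution", "position")


def effective_creator_type(global_attrs):
    """Return the creator type, applying the ACDD default of 'person'.

    Raising on an unrecognized value is a choice made for this sketch;
    a real tool might prefer to log a warning and fall back instead.
    """
    value = global_attrs.get("creator_type")
    if value is None:
        # ACDD: if the attribute is not specified, assume a person.
        return "person"
    if value not in CREATOR_TYPES:
        raise ValueError(f"unrecognized creator_type: {value!r}")
    return value


# An RA acting as creator can then be told apart from an individual.
print(effective_creator_type({"creator_name": "SECOORA", "creator_type": "institution"}))
print(effective_creator_type({"creator_name": "Jane Doe"}))
```

A metadata generator that knows the creator is an 'institution' could then avoid writing an RA name into <gmd:individualName>, which is the symptom described above.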