Unidata / threddsIso

A THREDDS Data Server extension which generates NCML, a metadata rubric, and ISO 19115.

Remove reference to fake URL for units vocabulary. #14

Open aaron-sweeney opened 9 years ago

aaron-sweeney commented 9 years ago

On line 1568 of UnidataDD2MI.xsl, a fake URL for a units vocabulary is inserted into the ISO metadata record:

1565 <xsl:if test="$variableUnits">
1566   <gmd:units>
1567     <xsl:attribute name="xlink:href">
1568       <xsl:value-of select="'http://example.org/someUnitsDictionary.xml#'"/>
1569       <xsl:value-of select="encode-for-uri($variableUnits)"/>
1570     </xsl:attribute>
1571   </gmd:units>
1572 </xsl:if>

When applied to an example netCDF (ncML) file for a variable having units of 'm', this results in a broken xlink:

<gmd:units xlink:href="http://example.org/someUnitsDictionary.xml#m"/>
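
To make the behaviour above concrete, here is a minimal Python sketch of what the two `xsl:value-of` steps appear to do: concatenate the placeholder base URL with the percent-encoded units string. The helper name `units_href` is hypothetical, and `urllib.parse.quote(..., safe="")` is used only as an approximation of XPath's `encode-for-uri()`; it is not the stylesheet's actual code.

```python
from urllib.parse import quote

# Placeholder base URL copied from line 1568 of UnidataDD2MI.xsl.
PLACEHOLDER_BASE = "http://example.org/someUnitsDictionary.xml#"


def units_href(variable_units: str, base: str = PLACEHOLDER_BASE) -> str:
    """Hypothetical helper: base URL followed by the escaped units string."""
    # safe="" percent-encodes reserved characters such as "/" and " ",
    # which approximates what encode-for-uri() does in the stylesheet.
    return base + quote(variable_units, safe="")


if __name__ == "__main__":
    for units in ("m", "m s-1", "m/s"):
        print(units_href(units))
```

Whatever base URL is chosen, the fragment after `#` only resolves if the target document actually defines an anchor for that units string.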

geoneubie commented 9 years ago

Anna,

Can we switch to http://unitsofmeasure.org/ucum-essence.xml ?

Dave

On Fri, Jul 17, 2015 at 2:22 PM, Aaron Sweeney notifications@github.com wrote:

On line 1568 of UnidataDD2MI.xsl, a fake URL for a units vocabulary is inserted into the ISO metadata record:

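1565 <xsl:if test="$variableUnits">
1566   <gmd:units>
1567     <xsl:attribute name="xlink:href">
1568       <xsl:value-of select="'http://example.org/someUnitsDictionary.xml#'"/>
1569       <xsl:value-of select="encode-for-uri($variableUnits)"/>
1570     </xsl:attribute>
1571   </gmd:units>
1572 </xsl:if>

When applied to an example netCDF (ncML) file for a variable having units of 'm', this results in a broken xlink:

<gmd:units xlink:href="http://example.org/someUnitsDictionary.xml#m"/>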

amilan17 commented 9 years ago

UDUNITS units are required by the CF conventions. Perhaps we should provide a link to the UDUNITS homepage instead? Unfortunately, I don't see an XML representation there, but I don't think that's an issue. http://www.unidata.ucar.edu/software/udunits/

This is my recommended change:

<xsl:if test="$variableUnits">
  <gmd:units>
    <xsl:attribute name="xlink:href">
      <xsl:value-of select="'http://www.unidata.ucar.edu/software/udunits/#'"/>
      <xsl:value-of select="encode-for-uri($variableUnits)"/>
    </xsl:attribute>
    <xsl:attribute name="xlink:actuate">
      <xsl:value-of select="'onDemand'"/>
    </xsl:attribute>
  </gmd:units>
</xsl:if>

aaron-sweeney commented 8 years ago

@amilan17 I don't think 'onDemand' is a valid choice for xlink:actuate. The choices are 'onLoad', 'onRequest', 'other', and 'none'. Which one did you intend?
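
As a rough illustration of the point above, the following Python sketch scans an XML fragment and reports any `xlink:actuate` value outside the four listed choices. The function name `invalid_actuate_values` and the sample fragment are hypothetical; the sample simply mirrors what the proposed stylesheet change would emit for units of 'm'.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
# The four choices named in the comment above.
VALID_ACTUATE = frozenset({"onLoad", "onRequest", "other", "none"})

# Hypothetical sample output for a variable with units of 'm'.
SAMPLE = (
    '<gmd:units xmlns:gmd="http://www.isotc211.org/2005/gmd" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xlink:href="http://www.unidata.ucar.edu/software/udunits/#m" '
    'xlink:actuate="onDemand"/>'
)


def invalid_actuate_values(xml_text: str) -> list:
    """Return every xlink:actuate value in the document that is not allowed."""
    attr = f"{{{XLINK_NS}}}actuate"
    root = ET.fromstring(xml_text)
    found = (el.get(attr) for el in root.iter())
    return [v for v in found if v is not None and v not in VALID_ACTUATE]


if __name__ == "__main__":
    print(invalid_actuate_values(SAMPLE))
```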

@cwardgar or @ethanrd or @lesserwhirls The documentation of the UDUNITS database at http://www.unidata.ucar.edu/software/udunits/udunits-2.2.20/doc/udunits/udunits2lib.html#Database discusses a source XML document, but no link is provided. Would it be appropriate and possible for Unidata to provide that XML document as an online resource for UDUNITS definitions?

dopplershift commented 8 years ago

@semmerson any thoughts?

semmerson commented 8 years ago

The link http://www.unidata.ucar.edu/software/udunits/udunits-current/doc/udunits/udunits2.html#Database should show you links to the sub-components of the units database.

aaron-sweeney commented 8 years ago

Thanks for that link @semmerson. In the interest of simplifying the xlink for the units dictionary to a single URL, does a combined XML representation exist?

amilan17 commented 8 years ago

Thanks @aaron-sweeney! I did mean onRequest.

<xsl:if test="$variableUnits">
  <gmd:units>
    <xsl:attribute name="xlink:href">
      <xsl:value-of select="'http://www.unidata.ucar.edu/software/udunits/#'"/>
      <xsl:value-of select="encode-for-uri($variableUnits)"/>
    </xsl:attribute>
    <xsl:attribute name="xlink:actuate">
      <xsl:value-of select="'onRequest'"/>
    </xsl:attribute>
  </gmd:units>
</xsl:if>

semmerson commented 8 years ago

On Mon, Feb 22, 2016 at 1:41 PM, Aaron Sweeney notifications@github.com wrote:

Thanks for that link @semmerson. In the interest of simplifying the xlink for the units dictionary to a single URL, does a combined XML representation exist?

Sorry, no. ISO segregates units into categories (base, acceptable, derived, etc.) and the UDUNITS-2 package mirrors this in its XML files. Any merging of the database files into a single URL would have to be done on-the-fly in order to guarantee consistency. I don't know how to do that (but I'll ask).

Regards, Steve Emmerson

semmerson commented 8 years ago

There might be a way. We'll meet Friday to discuss it and I'll let you know.

aaron-sweeney commented 8 years ago

Why don't we follow the "GML Guidance for ISO Metadata" documented on the NOAA Environmental Data Management wiki and follow the approach taken by CSIRO/SEEGRID in their XML realization? Their dictionary follows ISO 31-0 and includes SI base units, SI derived units, a set of units used with SI, and a set of other conventional units. It does not include a separate list of prefix definitions, but it does include conversion factors for specific units (for example, when converting from millivolt to volt). In its favor, this approach adopts an ISO XML schema.

It seems like the separate UDUNITS dictionaries could be mapped to a single dictionary using the CSIRO/SEEGRID XML realization as a guide, or we can simply adopt the CSIRO/SEEGRID XML realization.

zdefne-usgs commented 5 years ago

@semmerson, @aaron-sweeney, @amilan17 Hello, was there a final decision on this issue? The XSL still points to a fake URL. Also, how about the following address as a valid link? https://raw.githubusercontent.com/Unidata/UDUNITS-2/master/lib/udunits2-common.xml#

rsignell-usgs commented 5 years ago

It seems like the separate UDUNITS dictionaries could be mapped to a single dictionary using the CSIRO/SEEGRID XML realization as a guide, or we can simply adopt the CSIRO/SEEGRID XML realization.

@semmerson , wouldn't it be straightforward to have a script that would be run (perhaps by CI) when changes are made to UDUNITS that would generate a single XML representation?

semmerson commented 5 years ago

@rsignell-usgs

Probably.

Unfortunately, I don't know how, and my NSF-funded project is behind schedule.

rsignell-usgs commented 5 years ago

😢

lesserwhirls commented 5 years ago

@rsignell-usgs - for the single XML representation, would merging the individual <unit> elements as children of one parent <unit-system> element be the kind of thing you are looking for? I know that's super simplistic, but easily doable.
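
A minimal Python sketch of that merge, assuming each UDUNITS-2 database file has a `<unit-system>` root with `<unit>` children. The two sample documents are simplified stand-ins rather than real UDUNITS-2 content, and `merge_unit_files` is a hypothetical name; a real script would read the package's XML files and would also need to decide what to do with non-`<unit>` elements such as prefix definitions.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-ins for two of the separate database files.
SAMPLE_BASE = (
    "<unit-system><unit><base/><name><singular>meter</singular></name>"
    "<symbol>m</symbol></unit></unit-system>"
)
SAMPLE_DERIVED = (
    "<unit-system><unit><def>1/s</def><name><singular>hertz</singular></name>"
    "<symbol>Hz</symbol></unit></unit-system>"
)


def merge_unit_files(xml_documents):
    """Copy every top-level <unit> into one new <unit-system> parent."""
    combined = ET.Element("unit-system")
    for text in xml_documents:
        root = ET.fromstring(text)
        # findall("unit") returns only direct children, in document order.
        for unit in root.findall("unit"):
            combined.append(unit)
    return combined


if __name__ == "__main__":
    merged = merge_unit_files([SAMPLE_BASE, SAMPLE_DERIVED])
    print(ET.tostring(merged, encoding="unicode"))
```

Run from CI at each release, a script along these lines would keep the combined file consistent with the separate database files.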

zdefne-usgs commented 5 years ago

Why don't we follow the "GML Guidance for ISO Metadata" documented on the NOAA Environmental Data Management wiki and follow the approach taken by CSIRO/SEEGRID in their XML realization?

After reading through it, the CSIRO/SEEGRID XML realization seems to be the most complete one-stop shop for linking to units definitions. It needed some updates to its links, and I also changed the language to American English (link below). Until a more viable solution is available through UDUNITS, I believe we will use this implementation in our model output releases.

https://github.com/zdefne-usgs/ocean-iso-metadata/blob/master/UCUM/CSIRO_SEEGRID_units_am.xml

rsignell-usgs commented 5 years ago

@zdefne-usgs , this doesn't cover units such as "hours since 2000-01-01", right?
What do you think about the proposal of @lesserwhirls:

for the single XML representation, would merging the individual <unit> elements as children of one parent <unit-system> element be the kind of thing you are looking for? I know that's super simplistic, but easily doable.

zdefne-usgs commented 5 years ago

Correct, it doesn't cover that. But I also think that "hours since 2000-01-01" is not correct for units. In this case the units should be "seconds" only, and the reference time should be handled through another parameter or attribute in the netCDF file. So no matter what implementation we use this will be problematic.
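
To illustrate the separation argued for above, here is a small Python sketch that splits a CF-style time units string into the unit proper and the reference time, so that only the first part would need to be looked up in a units dictionary. The function name `split_time_units` is hypothetical, and this is only string handling, not a full parser for CF time units.

```python
def split_time_units(units: str):
    """Split 'hours since 2000-01-01' into ('hours', '2000-01-01').

    Strings without a ' since ' clause are returned unchanged with None
    as the reference time.
    """
    unit, sep, reference = units.partition(" since ")
    if not sep:
        return units.strip(), None
    return unit.strip(), reference.strip()


if __name__ == "__main__":
    print(split_time_units("hours since 2000-01-01"))
    print(split_time_units("m"))
```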

The CSIRO/SEEGRID solution is similar to what an implementation of @lesserwhirls's suggestion would look like. If a UDUNITS version were available, it would be preferable...

BobSimons commented 5 years ago

Why don't we follow the "GML Guidance for ISO Metadata" documented on the NOAA Environmental Data Management wiki and follow the approach taken by CSIRO/SEEGRID in their XML realization?

After reading through it, the CSIRO/SEEGRID XML realization seems to be the most complete one-stop shop for linking to units definitions. It needed some updates to its links, and I also changed the language to American English (link below). Until a more viable solution is available through UDUNITS, I believe we will use this implementation in our model output releases.

https://github.com/zdefne-usgs/ocean-iso-metadata/blob/master/UCUM/CSIRO_SEEGRID_units_am.xml

I don't know what role you intend for the UCUM/CSIRO_SEEGRID_units_am.xml document, but I hope you are aware that UCUM is a different units system than UDUNITS. See the two tables starting at https://coastwatch.pfeg.noaa.gov/erddap/convert/units.html#syntaxComparison .

rsignell-usgs commented 5 years ago

For the single XML representation, would merging the individual <unit> elements as children of one parent <unit-system> element be the kind of thing you are looking for?

@lesserwhirls, so I think after this discussion, it's clear the answer to this question is YES!!!

lesserwhirls commented 5 years ago

So maybe this?

zdefne-usgs commented 5 years ago

@BobSimons many thanks for the comparison between the two. Neither of them is a units system, though (unlike the SI units system, for example). In this case our interest in them is to electronically communicate quantities together with their units. Apologies for stating the obvious; I just wanted to have the terminology right.

The appeal of the CSIRO realization was that, as @aaron-sweeney pointed out, it follows ISO standards. It also has common derived units, such as those for velocity and acceleration, already defined in it.

@lesserwhirls Thanks for merging the UDUNITS databases. Will that be a permanent address that we can also use in the future?

In our case both of them do a great job in providing definitions for most of the units, but at the same time neither of them provides complete coverage. We will still have to define some units. For example: in the case of UCUM

in the case of UDUNITS

and of course the "days since.." or "seconds since" etc (for which IMHO again, the unit and the reference time should have been handled separately) need to be defined in either case.

This will take some more thinking to decide which way to go...

semmerson commented 5 years ago

@zdefne-usgs Are there units for "meter per second" and "watts per square meter"?

Apologies if this is a non-sequitur question -- I've been following this discussion only peripherally.

zdefne-usgs commented 5 years ago

@semmerson I am sorry, your question is not clear to me. "meter per second" and "watts per square meter" are units themselves.

semmerson commented 5 years ago

@zdefne-usgs Why did you list "meter per second" as a unit that, apparently, isn't defined by UDUNITS2?

zdefne-usgs commented 5 years ago

Those lists are the lists of missing units in each case. So "meter per second" is missing from UDUNITS2's realization of the XML file, and we will have to define it ourselves.

lesserwhirls commented 5 years ago

@zdefne-usgs I intend to make the location stable once things are settled (I just changed the path slightly). The idea is that the location https://docs.unidata.ucar.edu/thredds/udunits2/current/udunits2_combined.xml will always point to the most recent version, generated at each release of the udunits2 package. One will be able to replace current in the URL with a specific udunits version (e.g., 2.2.27.6) and see what the combined XML file looked like at a given release.

I'm not totally set on the format of the xml file quite yet. Comparing:

~https://docs.unidata.ucar.edu/thredds/udunits2/current/udunits2_combined_v1.xml~

with

~https://docs.unidata.ucar.edu/thredds/udunits2/current/udunits2_combined_v2.xml~

the second version tries to keep track of which definition came from which udunits-2 xml file through the use of namespaces, and adds a top level udunits-2 element. Way more verbose, but it's xml. This could always be used as a basis for an HTML page with all of the information, which would be a bit more human readable :-)

zdefne-usgs commented 5 years ago

@lesserwhirls Thanks! I'd pick the second one, of course! So for now, I will point to the https://docs.unidata.ucar.edu/thredds/udunits2/current/udunits2_combined.xml link, and you can change it to the format you pick. How does that sound?

Also can we copy and edit the same file to add the missing units? Or should it be done in a separate file, do you think?

lesserwhirls commented 5 years ago

@zdefne-usgs I was getting the automated publishing workflow down and needed to delete the files to make sure everything was working. I'm going to set up a CI job to handle automated updates, during which I will remove those files temporarily to verify things are working, but other than that, those links should be stable. So, for the current version:

https://docs.unidata.ucar.edu/thredds/udunits2/current/udunits2_combined.xml

and a specific version:

https://docs.unidata.ucar.edu/thredds/udunits2/2.2.27.6/udunits2_combined.xml

Unless anyone sees the need, I'll hold off backfilling the previous versions and will start keeping track at v2.2.27.6 of udunits.
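
The URL scheme described above can be sketched in a few lines of Python. The helper name `combined_xml_url` is hypothetical; the base path and file name are taken from the two links in this comment, and nothing here checks that a given version has actually been published.

```python
# Base path taken from the links above.
UDUNITS2_DOCS_BASE = "https://docs.unidata.ucar.edu/thredds/udunits2"


def combined_xml_url(version: str = "current") -> str:
    """Build the combined-XML URL for 'current' or a specific udunits version."""
    return f"{UDUNITS2_DOCS_BASE}/{version}/udunits2_combined.xml"


if __name__ == "__main__":
    print(combined_xml_url())            # most recent release
    print(combined_xml_url("2.2.27.6"))  # pinned to one release
```

Pinning a specific version in generated metadata keeps a record's units link reproducible, while `current` tracks whatever the latest udunits2 release defines.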