metadata101 / iso19139.ca.HNAP

ISO Harmonized North American Profile (HNAP)
GNU General Public License v2.0

Replacing nap.geogratis.gc.ca service with schemas.metadata.geo.ca #118

Closed jodygarnett closed 1 year ago

jodygarnett commented 3 years ago

The http://nap.geogratis.gc.ca/ server is decommissioned, making this issue critical (i.e. actively broken): the metadata documents being published cannot be used by external tools. We are looking for a 5 to 10 year decision here, although realistically documents like these do not expire and are useful as a historical record.

Some good ideas from the meeting (following the OGC opengis and w3c example layout):

@bo-lu has access to the geo.ca domain and will request schemas.geo.ca:

The other idea is to set up GitHub Pages at metadata101.github.io/schemas:

Original: Schema location for hnap version 1.2 and 2.3.1, replacing http://nap.geogratis.gc.ca/

The metadata configuration setting "remove schema location for validation" is required to validate records that have been generated by GeoNetwork.


Notes:

Generated schemaLocation information:

Example:

<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:gml="http://www.opengis.net/gml/3.2"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:gfc="http://www.isotc211.org/2005/gfc"
                 xmlns:gmi="http://www.isotc211.org/2005/gmi"
                 xmlns:gsr="http://www.isotc211.org/2005/gsr"
                 xmlns:gmx="http://www.isotc211.org/2005/gmx"
                 xmlns:xlink="http://www.w3.org/1999/xlink"
                 xmlns:gss="http://www.isotc211.org/2005/gss"
                 xmlns:uuid="java:java.util.UUID"
                 xmlns:srv="http://www.isotc211.org/2005/srv"
                 xmlns:gts="http://www.isotc211.org/2005/gts"
                 xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://nap.geogratis.gc.ca/metadata/tools/schemas/metadata/can-cgsb-171.100-2009-a/gmd/gmd.xsd http://www.isotc211.org/2005/srv http://nap.geogratis.gc.ca/metadata/tools/schemas/metadata/can-cgsb-171.100-2009-a/srv/srv.xsd http://www.geconnections.org/nap/napMetadataTools/napXsd/napm http://nap.geogratis.gc.ca/metadata/tools/schemas/metadata/can-cgsb-171.100-2009-a/napm/napm.xsd">

With "remove schema location for validation" set to true, the xsi:schemaLocation entries above are removed during validation (and GeoNetwork is able to validate the file).
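To illustrate the effect of that setting outside GeoNetwork, here is a minimal Python sketch that drops xsi:schemaLocation from a record before handing it to a validator. This is not GeoNetwork's implementation; the function name `strip_schema_location` and the shortened sample record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
SCHEMA_LOCATION = f"{{{XSI_NS}}}schemaLocation"

# Keep the familiar gmd prefix when serializing (otherwise ElementTree emits ns0).
ET.register_namespace("gmd", "http://www.isotc211.org/2005/gmd")

# Shortened sample record (assumption: only the root element matters here).
SAMPLE = (
    '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://www.isotc211.org/2005/gmd '
    'http://nap.geogratis.gc.ca/metadata/tools/schemas/metadata/'
    'can-cgsb-171.100-2009-a/gmd/gmd.xsd"/>'
)


def strip_schema_location(xml_text: str) -> str:
    """Return the record with every xsi:schemaLocation attribute removed."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # pop() with a default does nothing when the attribute is absent.
        element.attrib.pop(SCHEMA_LOCATION, None)
    return ET.tostring(root, encoding="unicode")


cleaned = strip_schema_location(SAMPLE)
```

A validator given `cleaned` has to rely on locally configured schemas, since the record no longer points at the decommissioned server.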

josegar74 commented 3 years ago

I think the problem is that the online XSDs on the http://nap.geogratis.gc.ca/ server have not been updated and contain old versions.

Alternatively, the schema location could point to the local XSDs. For example, for iso19139 on this test server: https://vanilla.geocat.net/geonetwork/xml/schemas/iso19139/schema/gmd/gmd.xsd. This probably requires some code changes to be able to replace the server URL, as this information is usually defined in the schema-ident.xml file of the schema.

jodygarnett commented 3 years ago

Thanks, I have updated the description to be more clear.

Just to confirm this is an hnap problem, and not something to report to core-geonetwork?

jodygarnett commented 3 years ago

Notes:

Ideas:

jodygarnett commented 3 years ago

Notes:

ianwallen commented 3 years ago

I created a branch that performs the upgrade: https://github.com/ianwallen/iso19139.ca.HNAP/tree/NAP_upgrade_2013. I am not sure whether upgrading will cause any issues.

jodygarnett commented 3 years ago

Other examples of "internal" changes that may or may not provide value:

jodygarnett commented 3 years ago

Assigning to @bo-lu and @geothorne to coordinate a course of action:

ianwallen commented 3 years ago

If this issue is fixed, then I believe it will also fix issue #89.

jodygarnett commented 3 years ago

Discussion on where to publish:

Propose folder structure based on year or version number from HNAP Proposed Updates 2.3.1 document:

napm.xsd has version:

<xs:schema
  targetNamespace="http://www.geconnections.org/nap/napMetadataTools/napXsd/napm"
  xmlns:gmd="http://www.isotc211.org/2005/gmd"
  elementFormDefault="qualified"
  version="2013-02-22"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:napm="http://www.geconnections.org/nap/napMetadataTools/napXsd/napm">

gmd.xsd has version:

<xs:schema
   targetNamespace="http://www.isotc211.org/2005/gmd"
   elementFormDefault="qualified"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xmlns:gmd="http://www.isotc211.org/2005/gmd"
   version="2012-07-13">

jodygarnett commented 3 years ago

@bo-lu this requires some thought and a decision ahead of the server being shut off.

jodygarnett commented 3 years ago

@bo-lu from the last meeting it seems a service can be set up at this location (once the present server is decommissioned).

We still have this issue capturing that the schema included with the hnap schema plugin does not match what is published.

jodygarnett commented 3 years ago

I would like to propose following the OGC example of publishing schemas by version number:

We can then support the two schemas presently known, and allow for future versions:

http://nap.geogratis.gc.ca/schemas/iso19139.ca.HNAP/1.2
http://nap.geogratis.gc.ca/schemas/iso19139.ca.HNAP/2.3.1

jodygarnett commented 3 years ago

A suggestion was made to use an official GC github to host the above.

bo-lu commented 3 years ago

I will ask if we can use: https://github.com/Canadian-Geospatial-Platform

jodygarnett commented 3 years ago

We should try an experiment and then make a recommendation to HNAP team (who has governance over the standard and this location).

jodygarnett commented 3 years ago

Use of github directly is not the best, the file:

Is accessed as:

Could also look at:

bo-lu commented 3 years ago

My take is that we can host the schema files on 'geo.ca' and create a github repository for issue tracking and fixes (i.e., via pull requests).

jodygarnett commented 3 years ago

With the previous server decommissioned, this issue is now critical (i.e. actively broken): the metadata documents being published cannot be used by external tools. We are looking for a 5 to 10 year decision here, although realistically documents like these do not expire and are useful as a historical record.

Some good ideas from the meeting (following the OGC opengis and w3c example layout):

@bo-lu has access to the geo.ca domain and will request schemas.geo.ca:

As a fallback plan we could set up GitHub Pages at metadata101.github.io/schemas:

jodygarnett commented 3 years ago

I am checking if osgeo can provide hosting (so we can avoid GitHub in the URL) https://trac.osgeo.org/osgeo/ticket/2594

jodygarnett commented 3 years ago

The new location is shaping up https://github.com/metadata101/schemas

Notes above indicate version 2.3.1 (geogratis) and 1.2 (hnap schema plugin), but I am having trouble finding these version numbers in the schema output.

Some work remains for GitHub Pages to publish xsd files as assets, and to revert some edits that have been made over time (hard-coding geogratis links).

jodygarnett commented 1 year ago

Breakout review of https://hnap.schemas.metadata.geo.ca/metadata and comparison with https://schemas.opengis.net/

hnap.schemas.metadata.geo.ca

schemas.opengis.net

Approach:

jodygarnett commented 1 year ago

Canadian standard number format: Number - subject area, standard number, year of publication.

This looks like for external schemas:

If we ignore the 201.100 as assumed:

jodygarnett commented 1 year ago

This results in (if we can drop hnap from subdomain):

Actions:

bo-lu commented 1 year ago

How should we configure it so that geogratis is permanently redirected?

jodygarnett commented 1 year ago

It depends what software is presently serving geogratis and the new service.

  1. If the service is still operational, and using something like apache, then individual files can be redirected to their new schemas.metadata.geo.ca location.
  2. If the service is decommissioned completely, a domain redirect can be established to the new location.
  3. If the new service is running something like apache, then it can be set up to respond to the redirects from the previous domain (step 2 above), and redirect to the new file-by-file location.

The redirect should respond with an HTTP/1.1 301 Moved Permanently status indicating the new location of the file.
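As a rough sketch of the behaviour described above (not a proposal for the production setup, which would more likely be an apache rule), a Python standard-library handler that answers every request with a 301 could look like this. The names `redirect_location`, `RedirectHandler` and `serve` are hypothetical, and the sketch assumes the old path layout is preserved on the new host.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# New host discussed in this issue.
NEW_BASE = "https://schemas.metadata.geo.ca"


def redirect_location(path: str) -> str:
    """Map a request path on the old host to its new URL.

    Assumption: files keep the same path on the new host.
    """
    return NEW_BASE + path


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer GET and HEAD with 301 Moved Permanently plus a Location header."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", redirect_location(self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port: int = 8080) -> None:
    """Run the redirect service locally for experimentation (blocks forever)."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` and requesting `/metadata/register/napMetadataRegister.xml` from localhost should return a 301 whose Location header points at the same path under schemas.metadata.geo.ca.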

We can dig into this once we have new service to experiment with.

Although setting up a web service to redirect files is straightforward, configuring an XML parser to follow such redirects is not very common, and can also be considered a security vulnerability. With this in mind, setting up redirects may not reliably allow existing datasets downloaded by the public to be validated without modification of the file (or the code reading the file).

jodygarnett commented 1 year ago

Note that in HNAP all the schema namespaces use http. They do not have to be https, as a namespace is only a URI.

The schema location will change from geogratis to an https location.

See: https://stackoverflow.com/questions/30707609/xml-namespace-uri-with-https
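To make the namespace versus location distinction concrete, here is a small Python sketch that rewrites only the location half of each xsi:schemaLocation pair and leaves the http namespace URIs untouched. The function name `rewrite_schema_location` is hypothetical, and the sketch assumes file paths are unchanged on the new https host.

```python
OLD_BASE = "http://nap.geogratis.gc.ca/"
NEW_BASE = "https://schemas.metadata.geo.ca/"  # assumption: same paths on the new host


def rewrite_schema_location(value: str) -> str:
    """Rewrite locations in a whitespace-separated namespace/location list."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    rewritten = []
    for namespace, location in zip(tokens[0::2], tokens[1::2]):
        # Only the location is a URL to fetch; the namespace is an identifier.
        if location.startswith(OLD_BASE):
            location = NEW_BASE + location[len(OLD_BASE):]
        rewritten += [namespace, location]
    return " ".join(rewritten)


example = rewrite_schema_location(
    "http://www.isotc211.org/2005/gmd "
    "http://nap.geogratis.gc.ca/metadata/tools/schemas/metadata/"
    "can-cgsb-171.100-2009-a/gmd/gmd.xsd"
)
```

In `example` the namespace is still http://www.isotc211.org/2005/gmd, while the location now starts with https://schemas.metadata.geo.ca/.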

jodygarnett commented 1 year ago

Identified a number of metadata/register/ files referenced in code lists and examples:

<gmd:LanguageCode codeList="http://nap.geogratis.gc.ca/metadata/register/napMetadataRegister.xml#IC_116"
  codeListValue="{$mainLanguage}">

**Guideline:** characterSet shall use codelist [napMD_CharacterSetCode](http://nap.geogratis.gc.ca/metadata/register/codelists-eng.html#IC_95). Value will be “utf8; utf8”.

jodygarnett commented 1 year ago

Update:

jodygarnett commented 1 year ago

The code list is a URI; however, it is recommended that it resolve as a URL (to match how ISO19139 works):

<gmd:LanguageCode codeList="https://schemas.metadata.geo.ca/metadata/register/napMetadataRegister.xml#IC_116"
  codeListValue="{$mainLanguage}">

This is a change to the content of our records during migration; previously we were only changing the schemaLocation at the top of the files (not the content).
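A minimal Python sketch of that content change, for discussion only: it walks a record and repoints every codeList attribute from the geogratis register to the new location. The helper name `migrate_code_lists` and the tiny sample record are assumptions; the actual migration in the plugin may be implemented differently.

```python
import xml.etree.ElementTree as ET

OLD_REGISTER = "http://nap.geogratis.gc.ca/metadata/register/"
NEW_REGISTER = "https://schemas.metadata.geo.ca/metadata/register/"


def migrate_code_lists(root: ET.Element) -> int:
    """Repoint codeList attributes in place; return how many were changed."""
    changed = 0
    for element in root.iter():
        code_list = element.get("codeList")
        if code_list and code_list.startswith(OLD_REGISTER):
            # Keep the file name and fragment (e.g. napMetadataRegister.xml#IC_116).
            element.set("codeList", NEW_REGISTER + code_list[len(OLD_REGISTER):])
            changed += 1
    return changed


# Hypothetical fragment of a record, reduced to a single code list element.
record = ET.fromstring(
    '<gmd:language xmlns:gmd="http://www.isotc211.org/2005/gmd">'
    '<gmd:LanguageCode codeList="http://nap.geogratis.gc.ca/metadata/register/'
    'napMetadataRegister.xml#IC_116" codeListValue="eng"/>'
    "</gmd:language>"
)
count = migrate_code_lists(record)
```

Running the function a second time on the same record changes nothing, so the migration can be repeated safely.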

jodygarnett commented 1 year ago

Branch is here: https://github.com/metadata101/iso19139.ca.HNAP/tree/transition_schemas_metadata_geo_ca

I am testing on 4.2.x before proposing backport.