Closed archaeogeek closed 4 years ago
@josegar74 this is also causing problems for other elements- where if there is just an element with a nilreason then it doesn't display in the interface until you go into xml view. It also breaks things like the inflate-metadata check to add a vertical CRS element if none exists. So the real question is how to improve the display of empty elements with a nilreason of inapplicable
@archaeogeek the problem is more related to the lack of child gco:CharacterString
, this works:
<gmd:parentIdentifier
xmlns:gco="http://www.isotc211.org/2005/gco"
gco:nilReason="inapplicable">
<gco:CharacterString></gco:CharacterString>
</gmd:parentIdentifier>
I'm thinking to use inflate-metadata.xsl
to add the gco:CharacterString
if the element has gco:nilReason="inapplicable"
and in update-fixed-info.xsl
remove it if it's empty.
How does it sound?
@josegar74 that sounds like a good approach, as I found quite a lot of elements encoded like this
@josegar74 I think I've managed to figure out what needs to go in inflate-metadata.xsl:
<xsl:template match="//*[@gco:nilReason]">
<xsl:copy>
<xsl:apply-templates select="@* | *"/>
<xsl:element name="gco:CharacterString"/>
</xsl:copy>
</xsl:template>
But I'm not sure what to put in update-fixed-info.xsl. At present, with the above code in inflate-metadata.xsl, I get an error from https://github.com/geonetwork/core-geonetwork/blob/3.4.x/schemas/iso19139/src/main/plugin/iso19139/update-fixed-info.xsl#L270, which makes sense, but I'm not really sure how to fix the problem.
@josegar74 see my attempted fix- it doesn't work unfortunately
I have updated a bit the templates:
<xsl:template match="//*[@gco:nilReason]/gco:CharacterString" priority="10">
<xsl:choose>
<xsl:when test="not(text())">
<xsl:message>Empty: <xsl:value-of select="name()" /></xsl:message>
</xsl:when>
<xsl:otherwise>
<xsl:message>Not Empty: <xsl:value-of select="name()" /></xsl:message>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- Add gco:Boolean to gmd:pass with nilReason to work nicely in the editor,
update-fixed-info.xsl should removed if empty to avoid xsd errors -->
<xsl:template match="gmd:pass[@gco:nilReason and not(gco:Boolean)]" priority="20">
<xsl:copy>
<xsl:copy-of select="@*" />
<gco:Boolean></gco:Boolean>
</xsl:copy>
</xsl:template>
<!-- Add gco:CharacterString child nodes to elements with gco:nilReason attributes so they display in the editor, then use update-fixed-info.xsl to get rid of them if not required -->
<xsl:template match="//*[@gco:nilReason and not(gco:CharacterString)]" priority="10">
<xsl:copy>
<xsl:apply-templates select="@*[name() != 'gco:nilReason']|*"/>
<xsl:element name="gco:CharacterString"/>
</xsl:copy>
</xsl:template>
With a metadata containing:
<gmd:parentIdentifier xmlns:gco="http://www.isotc211.org/2005/gco" gco:nilReason="inapplicable"/>
The gco:CharacterString
is added and the element appears in the editor, but if saving with empty value the element is saved as:
<gmd:parentIdentifier gco:nilReason="missing">
<gco:CharacterString/>
</gmd:parentIdentifier>
This is due to this template in ISO19139 that sets gco:nilReason="missing"
for elements with empty gco:CharacterString
:
I don't know the difference to use missing
or inapplicable
. Is GEMINI 2.3 using inapplicable
and means to get rid of the child gco:CharacterString
?
Something that can be a bit problematic is the current template in inflate:
<!-- Add gco:CharacterString child nodes to elements with gco:nilReason attributes so they display in the editor, then use update-fixed-info.xsl to get rid of them if not required -->
<xsl:template match="//*[@gco:nilReason and not(gco:CharacterString)]" priority="10">
That code is adding gco:CharacterString
to any element with gco:nilReason
, but that attribute can exists in elements containing for example gco:Real
: http://www.datypic.com/sc/niem21/e-gmd_minimumValue-1.html
Check about missing
/ inapplicable
values, if for GEMINI should be used inapplicable
for all gco:CharacterString
elements and no other type of elements?
If previous is true, we can update inflate templates to match="//*[@gco:nilReason='inapplicable' and not(gco:CharacterString)]"
Otherwise we need to define when should be used missing
/ inapplicable
values and for the inflate process define a list of elements to add the gco:CharacterString
probably.
@josegar74 It's not clear which version Gemini wants us to use- see https://www.agi.org.uk/agi-groups/standards-committee/uk-gemini/40-gemini/1062-gemini-datasets-and-data-series#21 for example where they say you should use a nilreason
of unknown
or inapplicable
. This is actually the only mention of "nilreason" or "nil reason" in the whole of the guidance for datasets- but the example records I have been given use a nilreason
of inapplicable
for other elements too.
So... for now, could we update the inflate templates to match on inapplicable
or unknown
, and only for gco:CharacterString
?
I will seek guidance from the Gemini working group on usage of nilReason
for other elements and we can resolve it later if a problem occurs
@archaeogeek I have committed the changes to inflate this element.
But I notice in 3.8.x 2 issues:
iso19139
schema), I'm investigating the issue:See in https://www.isotc211.org/2005/gml/basicTypes.xsd
In the section 2.2.2 Root element in https://www.agi.org.uk/40-gemini/1048-uk-gemini-encoding-guidance:
xsi:schemaLocation="http://www.isotc211.org/2005/gmx
http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/gmx/gmx.xsd
http://www.isotc211.org/2005/srv
http://inspire.ec.europa.eu/draft-schemas/inspire-md-schemas/srv/1.0/srv.xsd">
That includes the xsd from http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas and use gml 3.2 url for the namespace.
We need to check about this and check if require to update the local xsd in the schema.
About the Add
button, it's related to https://github.com/geonetwork/core-geonetwork/issues/4192
I just fixed in the GEMINI 2.3 3.8.x
branch.
Please check from previous comment about the schema references.
Thanks @josegar74 the basic issue seems to be fixed. What would be the changes needed to fix the problem with schema references?
@archaeogeek in 3.8 version are supported 2005 and 2007 versions of iso19139, I can create a branch with the changes, mainly should be related to the schemaLocation
to use 2007 version that supports gml 3.2.
For early versions, maybe an option is to update also the schemaLocation
and disable in the settings the option to use the local xsd
in the validation. That way will be used the remote xsd, that is slower to validate, but also makes sure to use the remote / updated versions.
@archaeogeek I have pushed the changes directly to 3.8.x
as were quite limited.
Try to create new metadata, if using existing database will require to remove current templates and reload them. The new metadata should not have elements with gml namespace inline.
See https://github.com/AstunTechnology/iso19139.gemini23/commit/c47df29302e70d4e03ae9e04d64889ea65ae7c3a
@josegar74 I've reloaded the templates and the sample data, I think it's working as expected- I'm now looking at some validation errors but I'll add those as separate issues and close this one thanks
Reopening as the fix for this issue does not seem to be working correctly, and is the root of the problems described in https://github.com/AstunTechnology/iso19139.gemini23/issues/42. The problem we have is that mentioned in https://github.com/AstunTechnology/iso19139.gemini23/issues/35#issuecomment-539909785 that the ISO19139 template is adding nilreason=missing
to the elements where an empty characterstring has been added. So:
<gmd:hierarchyLevelName gco:nilReason="inapplicable"/>
becomes:
<gmd:hierarchyLevelName xmlns:gml="http://www.opengis.net/gml/3.2" gco:nilReason="missing">
<gco:CharacterString/>
</gmd:hierarchyLevelName>
but then our code at https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/update-fixed-info.xsl#L98 is not being triggered and validation errors occur because of the empty strings.
I think we need to override the behaviour of the template in ISO19139 which does this, and preserve the correct nilreason but I can't see how to do it?
@josegar74 I've experimented with adding the code from ISO19139 update-fixed-info into the one for Gemini23 (though haven't committed it) but the code that strips the empty characterstring (https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/update-fixed-info.xsl#L98) doesn't seem to be working. It does work in isolation in Oxygen XML though, so I don't know if there's something else going on as well?
I've manage to get it working when the element is empty, with these templates:
inflate-metadata.xsl
<xsl:template match="//*[(@gco:nilReason='inapplicable' or @gco:nilReason='unknown') and not(gco:CharacterString) and name() != 'gmd:verticalElement']" priority="10">
<xsl:copy>
<xsl:apply-templates select="@*|*"/>
<xsl:element name="gco:CharacterString"/>
</xsl:copy>
</xsl:template>
update-fixed-info.xsl
<xsl:template match="//*[(@gco:nilReason='inapplicable' or @gco:nilReason='unknown') and gco:CharacterString]" priority="10">
<xsl:choose>
<xsl:when test="not(gco:CharacterString/text())">
<!-- Copy element and attributes, not children -->
<xsl:copy>
<xsl:apply-templates select="@*"/>
</xsl:copy>
</xsl:message>
<xsl:copy>
<xsl:apply-templates select="@*"/>
</xsl:copy>
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
So if the metadata has <gmd:hierarchyLevelName gco:nilReason="inapplicable"/>
the editor gets the following:
<gmd:hierarchyLevelName gco:nilReason="inapplicable">
<gco:CharacterString></gco:CharacterString>
</gmd:hierarchyLevelName>
So the field can be edited and when saved, if empty, it's stored properly as <gmd:hierarchyLevelName gco:nilReason="inapplicable"/>
.
The issue is when filling the value, for now the attribute is kept:
<gmd:hierarchyLevelName gco:nilReason="inapplicable">
<gco:CharacterString>value</gco:CharacterString>
</gmd:hierarchyLevelName>
That allows to handle it properly if it's emptied again, but probably not correct. If we remove it:
<gmd:hierarchyLevelName>
<gco:CharacterString>value</gco:CharacterString>
</gmd:hierarchyLevelName>
If it's emptied again, GeoNetwork converts it:
<gmd:hierarchyLevelName gco:nilReason="missing">
<gco:CharacterString></gco:CharacterString>
</gmd:hierarchyLevelName>
I think we need to define a list of elements that should be handle in special way (to add "inapplicable" or "unknown" instead of "missing" value) when are emptied.
@josegar74 it's going to be very hard to list a set of elements and specific nilreasons but here goes:
gmd:version
(in the context of gmd:MD_Format
) should allow either nilReason=inapplicable
or nilReason=unknown
and it's important to keep the correct one
gmd:explanation
(in the context of Conformity gmd:DQ_ConformanceResult
) nilReason=unknown
gmd:pass
nilReason=unknown
See https://www.agi.org.uk/agi-groups/standards-committee/uk-gemini/40-gemini/1062-gemini-datasets-and-data-series#41
BUT there might be other nilreasons in other types of DQ_ConformanceResults for data quality- see https://www.agi.org.uk/agi-groups/standards-committee/uk-gemini/40-gemini/1062-gemini-datasets-and-data-series#52 and I honestly don't know how we'd cover all of those possibilities.
In the sample records at https://github.com/AGIGemini/Schematron/tree/master/samples there are a number of other examples, some of which are actually wrong (temporal extent) and some not in the standard (couplingtype and containsoperations for services) so I'm not sure what to do about those at all.
Would it be possible for some sort of validation check that looks for nilreasons present when the value is filled in? For example in your steps above:
<gmd:hierarchyLevelName gco:nilReason="inapplicable">
<gco:CharacterString>value</gco:CharacterString>
</gmd:hierarchyLevelName>
If the record was validated at this point could we have a rule that would fail in that case?
I think it's fine if the value is removed at that point so that GeoNetwork automatically fills in the gco:nilReason="missing" component:
<gmd:hierarchyLevelName gco:nilReason="missing">
<gco:CharacterString></gco:CharacterString>
</gmd:hierarchyLevelName>
This will fail validation and then people should be able to either fill the value in, delete the element, or add the correct nilReason instead.
There are two points that we need to get to:
1) Someone imports a record that they believe is already Gemini 2.3 compliant into GeoNetwork and does no editing in it- they just use it for the CSW endpoint. In this case the record should not be changed at all by the import process, even if validation is run as part of that process.
2) Someone imports a record into GeoNetwork and then validates it in the editing interface. In this case, they need to be able to add values to existing elements that have a nilReason applied, but if they choose not to, then the end result of editing/validating should not change the nilReason. If however they delete values, or add new ones, then the default behaviour of adding a nilReason=missing
should apply and they can change that if they want.
Does that all make sense?
@josegar74 I think this is the priority issue to fix now, as I can't fully test the conversion scripts from Gemini 2.2 to 2.3 due to the validation errors it causes with empty strings. Could you let me know when you'll be able to look at it?
@archaeogeek I'll focus on this issue tomorrow.
@archaeogeek a question about the following case:
gmd:version
(in the context of gmd:MD_Format
) should allow either nilReason=inapplicable
or nilReason=unknown
and it's important to keep the correct one.
If we have the following in the metadata:
<gmd:version gco:nilReason="inapplicable" />
It's inflated when edited:
<gmd:version gco:nilReason="inapplicable">
<gco:CharacterString></gco:CharacterString>
</gmd:version>
If the user doesn't fill it, it should be stored again as <gmd:version gco:nilReason="inapplicable" />
, that's clear, but suppose it gets filled, then gco:nilReason
will be removed:
<gmd:version>
<gco:CharacterString>1.0</gco:CharacterString>
</gmd:version>
If the metadata is edited again and emptied, we should assign nilReason=inapplicable
or nilReason=unknown
(we don't have anymore the original attribute value)
About gmd:pass
nilReason=unknown
, this is already handled for any element containing gco:Boolean
in https://github.com/geonetwork/core-geonetwork/pull/3569
Related to this:
Would it be possible for some sort of validation check that looks for nilreasons present when the value is filled in? For example in your steps above:
<gmd:hierarchyLevelName gco:nilReason="inapplicable">
<gco:CharacterString>value</gco:CharacterString>
</gmd:hierarchyLevelName>
If the record was validated at this point could we have a rule that would fail in that case?
You mean not to remove the gco:nilReason
in these cases, so the user has to remove them manually?
The kept of the attribute is handle already in https://github.com/AstunTechnology/iso19139.gemini23/blob/799ceb71f6b187381495de47e72d971a7aecb4c2/src/main/plugin/iso19139.gemini23/update-fixed-info.xsl#L106
I was doing a change to remove it when the field is filled, but if prefer the user removes it manually it's another option, let me know:
<xsl:otherwise>
<xsl:message>Not Empty: <xsl:value-of select="name(..)" /></xsl:message>
<xsl:copy>
<xsl:apply-templates select="@*[name() != 'gco:nilReason']|node()"/>
</xsl:copy>
</xsl:otherwise>
Added a validation rule to check for elements with value and gco:nilReason
, so when the user fills the element, the attribute is kept (see the end of the previous comment).
The user can enable the More details
option in the editor and remove it. See attached screencast, if looks fine I'll commit the validation rule.
The other option is to remove the attribute when filled the element, as indicated also at the end of the previous comment.
@josegar74 I think the approach taking in the screencast is fine. For gmd:version
, if it is filled in and then emptied again, so we lose the original nilReason, I think that is fine. I will add a note to the html guidance in the editor to say that people need to add gco:nilReason=unknown
manually.
Note that if filled any element with gco:nilReason='inapplicable'
or @gco:nilReason='unknown'
are not removed due to:
The user can enable manually the More details
option and remove the attribute there.
Going to commit the updated validation rule then.
@josegar74 there still seems to be a problem with the gmd:explanation
element unfortunately. When harvesting from WMS this is set as <explanation gco:nilReason="inapplicable"/>
from https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/convert/OGCWxSGetCapabilitiesto19119/OGCWxSGetCapabilitiesLayer-to-19139.xsl#L411 but when the record is edited this is converted to:
<gmd:explanation gco:nilReason="unknown">
<gco:CharacterString/>
</gmd:explanation>
I'm not sure from the guidance whether unknown or inapplicable is the best case, but it should not be expanded and it fails validation. I thought that https://github.com/metadata101/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/update-fixed-info.xsl#L91 should stop this from happening or have I misunderstood what's going on?
@josegar74 Reading through the comments on this thread I can see that we are deliberately expanding elements with gco:nilReason="inapplicable" or "unknown"
here https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/inflate-metadata.xsl#L405-L412
but https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/inflate-metadata.xsl#L405-L412 says it should only be set to gco:nilReason="unknown"
if it is empty, but it's not?
Then I thought empty gco:characterString
elements should be stripped by https://github.com/AstunTechnology/iso19139.gemini23/blob/3.8.x/src/main/plugin/iso19139.gemini23/update-fixed-info.xsl#L114-L126 on validation or saving but that doesn't seem to be happening?
Have I misunderstood something?
The 2on link I think is wrong (it's the same as 1st). To be able to edit the elements in GeoNetwork:
Inflate process adds gco:CharacterString
to elements that have @gco:nilReason='inapplicable'
or @gco:nilReason='unknown'
and doesn't have gco:CharacterString
(the last should not happen, but better to have the check, to be safe).
Update-fixed-info process gets ride of the gco:CharacterString
elements that are empty if the container has @gco:nilReason='inapplicable'
or @gco:nilReason='unknown'
.
Is not working like that? I'm going to re-check, but from yesterday I didn't notice any issue.
I have done the following:
1) Create a new metadata with the Dataset template.
2) Leave empty gmd:explanation
from the Conformity report
3) Save&Close the metadata
4) Open the xml: <gmd:explanation gco:nilReason="unknown"/>
5) Edit again and set a value, save&close. Note that currently gco:nilReason="unknown"
is preserved (the user should remove it manually in the editor if wants to get rid of it, see a previous comment related to this)
<gmd:explanation gco:nilReason="unknown">
<gco:CharacterString>explanation</gco:CharacterString>
</gmd:explanation>
Let me know the tests you're doing, to check if the issue is related to other element(s).
@josegar74 if I replicate your steps, then it's fine. The issue I am having is with harvested records.
<gmd:explanation gco:nilReason="inapplicable"/>
<gmd:explanation gco:nilReason="unknown">
<gco:CharacterString/>
</gmd:explanation>
<gmd:explanation gco:nilReason="unknown"/>
Also, the pass element that goes with this starts off as <gmd:pass gco:nilReason="unknown"/>
and becomes:
<gmd:pass>
<gco:Boolean>false</gco:Boolean>
</gmd:pass>
So even though we're not trying to make any changes to the record in this process, it is being changed simply by activating editing and validation. Should we be looking at overriding the behaviour in ISO19139 that is causing this in the first place?
@archaeogeek I'll check the inflate process, seem not preserving the attribute
@josegar74 thank you, I've tested your latest commit and am getting much better results now. I will need to test with some other records as well, to be sure, but it looks good so far.
I've seen no further issues with this so will close it, thanks
I've got a sample dataset with the following xml for parentidentifier:
<gmd:parentIdentifier xmlns:gco="http://www.isotc211.org/2005/gco" gco:nilReason="inapplicable"/>
This doesn't trigger the "add parent record" button. If the element is removed entirely, then the button appears. Is it possible to either show the nilreason or trigger the button in this scenario?