Open jmckenna opened 1 year ago
Hello Jeff, I had a quick look, and it seems our issue can be solved quickly:
sitemap: https://metadata.naturalsciences.be/geonetwork/srv/api/sitemap
HTML json-ld embedded info (taken from the sitemap):
Does that work for you? Cheers
Thanks for the quick changes @ndevilleBE !
With a quick glance, the sitemap looks good, but the embedded JSON-LD gives an error in the validator, in the Distribution section, because of a missing value for the name property:
{
"@type":"DataDownload",
"contentUrl":"https://www.marineatlas.be",
"encodingFormat":"WWW:DOWNLOAD-1.0-http--download",
"name": ,
"description": "An HTTP link to download: MarineAtlas website" }
,
I was testing with this record: https://metadata.naturalsciences.be/geonetwork/srv/api/records/9f4131b6-7895-403a-a17e-6bb33befaf16?language=all
Here is my second test record, which also has an error (it is set both as "@type": "schema:Dataset" and "@type": "schema:WebAPI"): https://metadata.naturalsciences.be/geonetwork/srv/api/records/mean_wave_direction_TS?language=all
Speaking with @fils, ODIS unfortunately rejects any JSON-LD that has an error.
Hmm.
(validator used: https://validator.schema.org/ )
https://metadata.naturalsciences.be/geonetwork/srv/api/records/mean_wave_direction_TS?language=all It does now pass the schema.org test. I'm checking the other issue. Do you have a way to test all metadata at once? Cheers
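On testing all metadata at once: one possible approach is to walk the sitemap and check that every embedded JSON-LD block at least parses as JSON. The sketch below is only an illustration, not the ODIS harvester. The function names `sitemap_urls` and `jsonld_errors` are made up for this example, and it assumes the record pages embed JSON-LD in a `<script type="application/ld+json">` element. It checks JSON syntax only, so it would catch the empty `"name": ,` value but not schema.org-level problems such as a duplicated "@type".

```python
import json
import re
import xml.etree.ElementTree as ET

# Standard sitemap namespace (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: JSON-LD is embedded in <script type="application/ld+json"> blocks.
JSONLD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL found in a sitemap (or sitemap index) document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def jsonld_errors(html: str) -> list[str]:
    """Return one message per embedded JSON-LD block that is not valid JSON."""
    errors = []
    for block in JSONLD_RE.findall(html):
        try:
            json.loads(block)
        except json.JSONDecodeError as exc:
            errors.append(f"line {exc.lineno}, column {exc.colno}: {exc.msg}")
    return errors
```

Fetching is deliberately left out: each URL returned by `sitemap_urls` could be downloaded with any HTTP client and the response body passed to `jsonld_errors`.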
Hello Jeff, FYI: https://catalogue.odis.org/view/3271 Cheers
@ndevilleBE perfect, thanks!
Hello @jmckenna, We added the missing term "name" in the JSON-LD metadata information. It's in dev mode only for the time being, as this is done together with other improvements on our side. We will push it to production this week or the week after if no surprises occur. Cheers
Hello @jmckenna, Could you send me the metadata file with all the empty @id in the JSON-LD representation? I believe it is generated by GeoNetwork but I need to verify it. Thanks,
Hey @ndevilleBE
In the meeting we were examining this record, I have pasted its JSON-LD below:
{
"@context": "http://schema.org/",
"@type": "schema:Dataset",
"@id": "https://metadata.naturalsciences.be/geonetwork/srv/api/records/bmdc.be:dataset:2721",
"includedInDataCatalog": [
{
"url": "https://metadata.naturalsciences.be/geonetwork/srv/search#",
"name": ""
}
],
"inLanguage": "eng",
"name": "3D voxel model of the Belgian Continental Shelf",
"dateCreated": [
"2022-07-26T13:19:56Z"
],
"dateModified": [
"2022-07-28T12:05:11Z"
],
"datePublished": [],
"thumbnailUrl": [],
"description": "Three-dimensional voxel model of the geological subsurface of the Belgian Continental Shelf containing information on probabilities of lithological classes (2: clay, 3: silt, 5: fine sand, 6: medium sand, 7: coarse sand and 8: gravel) and stratigraphy (1: Upper Holocene Nearshore, 2: Upper Holocene Offshore, 3: Lower Holocene, 4: Pleistocene and 5: Paleogene), estimated percentages of lithoclasses (clay, silt, mud, fine sand, medium sand, coarse sand, gravel and shells), and uncertainties (borehole density, entropie, positional quality, sampling quality and vintage).",
"keywords": [
"Geology",
"Geo-Seas Udden-Wentworth scale",
"Belgian part of the North Sea",
"Belgian Exclusive Economic Zone"
],
"author": [
{
"@id": "sumomdo@naturalsciences.be",
"@type": "Organization",
"name": "Royal Belgian Institute for Natural Sciences (RBINS), Directorate Natural Environment (OD Nature), Suspended Matter and Seabed Monitoring and Modelling (SUMO)",
"email": "sumomdo@naturalsciences.be",
"contactPoint": {
"@type": "PostalAddress"
}
}
],
"contributor": [],
"creator": [],
"provider": [
{
"@id": "bmdc@naturalsciences.be",
"@type": "Organization",
"name": "Royal Belgian Institute for Natural Sciences (RBINS), Directorate Natural Environment (OD Nature), Belgian Marine Data Centre (BMDC)",
"email": "bmdc@naturalsciences.be",
"contactPoint": {
"@type": "PostalAddress",
"addressCountry": "Belgium",
"addressLocality": "Brussel",
"postalCode": "1000",
"streetAddress": "Vautierstraat 29"
}
},
{
"@id": "bmdc@naturalsciences.be",
"@type": "Organization",
"name": "Royal Belgian Institute for Natural Sciences (RBINS), Directorate Natural Environment (OD Nature), Belgian Marine Data Centre (BMDC)",
"email": "bmdc@naturalsciences.be",
"contactPoint": {
"@type": "PostalAddress",
"addressCountry": "Belgium",
"addressLocality": "Brussel",
"postalCode": "1000",
"streetAddress": "Vautierstraat 29"
}
}
],
"copyrightHolder": [],
"user": [],
"sourceOrganization": [],
"publisher": [],
"distribution": [
{
"@type": "DataDownload",
"contentUrl": "https://www.bmdc.be/NODC/ditsAttach/datasource/7296/Belspo%20TILES_BE_20191014_ALL%20VARS.asc",
"encodingFormat": "WWW:DOWNLOAD-1.0-http--download",
"name": ,
"description": "An HTTP link to download the dataset in CSV: 3D voxel model of the Belgian Continental Shelf (October 2019, all variables). BELSPO TILES Consortium"
},
{
"@type": "DataDownload",
"contentUrl": "https://www.bmdc.be/NODC/ditsAttach/datasource/7298/Belspo%20TILES_BE_DSS%20Export_2020.asc",
"encodingFormat": "WWW:DOWNLOAD-1.0-http--download",
"name": ,
"description": "An HTTP link to download the dataset in CSV: 3D voxel model of the Belgian Continental Shelf (2020, Export decision support, main variables). BELSPO TILES Consortium"
}
],
"encodingFormat": [
"text/csv"
],
"spatialCoverage": [
{
"@type": "Place",
"description": [],
"geo": [
{
"@type": "GeoShape",
"box": "50.89 1.31 52.09 3.68"
}
]
}
],
"temporalCoverage": [
"2018-01-01/"
],
"license": [
"https://creativecommons.org/publicdomain/zero/1.0/",
{
"@type": "CreativeWork",
"name": "The data may be used and redistributed for free but is not intended for legal use, since it may contain inaccuracies. Neither the data Contributor, nor any of their employees or contractors, makes any warranty, express or implied, including warranties of merchantability and fitness for a particular purpose, or assumes any legal liability for the accuracy, completeness, or usefulness, of this information."
},
{
"@type": "CreativeWork",
"name": "No limitations on public access."
}
]
}
We did check another dataset before. In this one I don't see the empty @id values that were an issue for your colleagues.
To be honest, I can't find the exact record; however, I do see that 95 records have errors when ODIS tries to harvest their JSON-LD. I wonder if you/we can first tackle removing the empty parameters when there is no value (name, description, copyrightHolder, user, sourceOrganization, publisher, etc.), and then maybe I can more easily find the record with the missing @id.
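As a rough illustration of "removing the empty parameters", here is a minimal sketch that prunes empty values from an already-parsed JSON-LD document. The helper name `prune_empty` is hypothetical, and this is not how GeoNetwork generates its output. It also cannot repair the `"name": ,` case, since that text is not parseable JSON in the first place and would need fixing where the JSON-LD is generated.

```python
import json

EMPTY_VALUES = (None, "", [], {})


def prune_empty(value):
    """Recursively drop dict keys and list items that are None, "", [] or {}."""
    if isinstance(value, dict):
        cleaned = {key: prune_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item not in EMPTY_VALUES}
    if isinstance(value, list):
        cleaned = [prune_empty(item) for item in value]
        return [item for item in cleaned if item not in EMPTY_VALUES]
    return value


# Trimmed-down excerpt of the record pasted above.
record = {
    "name": "3D voxel model of the Belgian Continental Shelf",
    "includedInDataCatalog": [
        {"url": "https://metadata.naturalsciences.be/geonetwork/srv/search#", "name": ""}
    ],
    "datePublished": [],
    "publisher": [],
    "spatialCoverage": [
        {
            "@type": "Place",
            "description": [],
            "geo": [{"@type": "GeoShape", "box": "50.89 1.31 52.09 3.68"}],
        }
    ],
}

print(json.dumps(prune_empty(record), indent=2))
```

Objects that carry only an "@type" (such as the bare "contactPoint" above) are left in place by this sketch; whether those should also be dropped is a separate decision.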
Ok no worries. I'll let you know when the new metadata version is updated without empty fields.
Good morning @jmckenna , As I mentioned to you I'll be absent for 2 months. To avoid blocking the ingestion of our metadata in your portal, I put you in contact with my colleague Thomas Vandenberghe (@tvandenberghe), who is managing the ISO XML metadata generation. He'll let you know when the new version is available with the corrections so you can run a test on everything again. Thanks
Thanks @ndevilleBE, will watch for updates from Thomas. Enjoy your break.
Hi @jmckenna. Our harvester is updated and now contains 'name'. At https://metadata.naturalsciences.be/geonetwork/srv/api/records/9f4131b6-7895-403a-a17e-6bb33befaf16 . It is just now that I see your full list of required fields, and these are still not there:
{
"contributor": [],
"copyrightHolder": [],
"datePublished": [],
"publisher": [],
"sourceOrganization": [],
"spatialCoverage": [
{
"@type": "Place",
"description": [],
"geo": [
{
"@type": "GeoShape",
"box": "51.0937 2.292 51.527 3.27217"
}
]
}
],
"user": []
}
I need to figure out how GeoNetwork populates these fields and what maps to them: 1) from GN system settings, 2) from the original ISO XML, or 3) hardcoded in the JSON-LD generation. We are planning on forking GN, which can help in figuring out 1) and 3).
Everything happens in GeoNetwork, and we have primary control over the content of all fields.
"contributor": [],
-> gmd:identificationInfo//gmd:pointOfContact/[gmd:role/gmd:CI_RoleCode/@codeListValue='processor']
-> this depends a lot on the situation and may need changes at record level
"copyrightHolder": [],
-> gmd:identificationInfo//gmd:pointOfContact/[gmd:role/gmd:CI_RoleCode/@codeListValue='owner']
-> RBINS
"datePublished": [],
-> gmd:identificationInfo//gmd:citation//gmd:date[/gmd:dateType//@codeListValue='publication']//gmd:date//text()
-> add publication date as well
"publisher": [],
-> gmd:identificationInfo//gmd:pointOfContact/[gmd:role/gmd:CI_RoleCode/@codeListValue='publisher']
-> RBINS
"sourceOrganization": [],
-> gmd:identificationInfo//gmd:pointOfContact/[gmd:role/gmd:CI_RoleCode/@codeListValue='principalInvestigator']
-> similar to author now, but depends a lot on the situation and may need changes at record level
"spatialCoverage/description": [],
-> gmd:identificationInfo//gmd:extent/[gmd:geographicElement] foreach gmd:description[count(.//text() != '') > 0]
-> when empty, filled with 'Bounding box'
"user": []
-> gmd:identificationInfo//gmd:pointOfContact/[gmd:role/gmd:CI_RoleCode/@codeListValue='user']
-> I don't see the point of us completing this. According to https://wiki.esipfed.org/ISO_19115-3_Codelists#CI_RoleCode it is someone who uses the resource. Isn't that you, ODIS?
I will make the necessary adaptations and get back to you.
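To make the role-to-property mapping above concrete, here is a small sketch that reads gmd:pointOfContact entries from an ISO 19139 record and groups organisation names by the schema.org property listed for each role code. It is only an approximation under stated assumptions: the function name `contacts_by_property` and the `ROLE_TO_PROPERTY` table are written for this example from the list above, it uses Python's ElementTree rather than GeoNetwork's own transformation, and it only looks at gmd:organisationName.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces for the gmd: and gco: prefixes.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Role code -> schema.org property, as listed in the comment above.
ROLE_TO_PROPERTY = {
    "processor": "contributor",
    "owner": "copyrightHolder",
    "publisher": "publisher",
    "principalInvestigator": "sourceOrganization",
    "user": "user",
}


def contacts_by_property(iso_xml: str) -> dict[str, list[str]]:
    """Group point-of-contact organisation names by mapped schema.org property."""
    root = ET.fromstring(iso_xml)
    result: dict[str, list[str]] = {}
    for contact in root.findall(".//gmd:identificationInfo//gmd:pointOfContact", NS):
        role = contact.find(".//gmd:role/gmd:CI_RoleCode", NS)
        name = contact.find(".//gmd:organisationName/gco:CharacterString", NS)
        if role is None or name is None or not name.text:
            continue  # skip contacts without a role code or organisation name
        prop = ROLE_TO_PROPERTY.get(role.get("codeListValue"))
        if prop:
            result.setdefault(prop, []).append(name.text.strip())
    return result
```

Properties with no matching contact are simply absent from the result, which would also avoid emitting empty arrays for them.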
@tvandenberghe thanks for your fix for the Distribution name. I will try to re-index your endpoint into ODIS and report back.
Regarding additional properties (that you mention above), those are optional, and it is up to you whether to include/expose or not. I'd say wait for my report on our re-index first, before trying to add additional properties (it could open up a new 'can of worms'). More soon...
@tvandenberghe initial harvesting results of your endpoint can now be found here
That's really cool. One issue is that https://spatial.naturalsciences.be/geoserver/idod/ows?version=1.3.0&service=WMS&request=GetCapabilities is rendered with formatted ampersands, so the URL leads nowhere. Also, our URL does not explicitly refer to a single layer but to a GetCapabilities description of a whole namespace: the URL is in gmd:onLine/gmd:URL and the layer name is included in gmd:onLine/gmd:name. Would it be possible to render the distribution info as a complex object with the name included? We likely won't be the only ones doing it like this (this way gives the cleanest rendering in GeoNetwork).
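A sketch of what that could look like, assuming "formatted ampersands" means the URL is shown with `&amp;` in place of `&`. The function name `build_distribution`, the layer name and the protocol value are placeholders for illustration, not values from the actual record or from the ODIS rendering code.

```python
def build_distribution(url: str, name: str, protocol: str) -> dict:
    """Build a distribution object that keeps the layer name next to the URL."""
    return {
        "@type": "DataDownload",
        # Undo HTML escaping of ampersands so the query string stays usable.
        "contentUrl": url.replace("&amp;", "&"),
        "encodingFormat": protocol,
        "name": name,
    }


distribution = build_distribution(
    "https://spatial.naturalsciences.be/geoserver/idod/ows"
    "?version=1.3.0&amp;service=WMS&amp;request=GetCapabilities",
    name="hypothetical_layer_name",  # placeholder for gmd:onLine/gmd:name
    protocol="OGC:WMS",              # placeholder protocol value
)
print(distribution["contentUrl"])
```

A plain string replacement is used rather than a general HTML unescape so that other characters in the query string are left untouched.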
@tvandenberghe @ndevilleBE the issue with the empty Distribution name is back, see this record:
"distribution": [
{
"@type":"DataDownload",
"contentUrl":"http://www.vliz.be/en/catalogue?module=ref&refid=41493",
"encodingFormat":"WWW:LINK-1.0-http--link",
"name": ,
"description": "An HTTP link to view information on: Analyse van de levensgemeenschappen op het Belgisch continentaal plat: Studie van de epibenthale biocoenoses en van de demersale Pisces in en rondom de baggerzones . D. Maertens" }
,
Can you take a look?
BMDC team to examine:
sitemap.xml
possible issues to examine together:
cc @ndevilleBE