Open GregoryJPavier opened 2 months ago
Hi, Bill Roberts pointed me at your query here.
I think the issue is that you have broken the triple PMD requires to link a catalog entry to the qb:DataSet (i.e. the resource that contains the observation data through the <obs> qb:dataSet <dataCube> relation).
Basically, pmdcat:datasetContents and the corresponding pmdcat:DataCube type are used to tell PMD where it can find the data to render with the datacube viewer, i.e. it's best to consider these vocabulary items as PMD-specific rendering hints, rather than terms that carry broader semantics. Put another way, pmdcat:datasetContents doesn't mean "these are the dataset contents"; it means "PMD, when a user is looking at this catalog entry, try to render a UI for this resource".
So I think you most likely want to construct data like this:
flowchart TD
CE[#data-catalog-entry] -->|another:predicate| DS[#dataset]
DS --> |dcat:distribution| QB[#qbDataSet]
CE -->|pmdcat:datasetContents|QB
QB -->|rdf:type| ClassPQB[pmdcat:DataCube]
QB -->|dcat:isDistributionOf|DS
QB -->|rdf:type| ClassQB[qb:DataSet]
QB -->|rdf:type| ClassDDS[dcat:Distribution]
Obs(obs 1..N) -->|qb:dataSet| QB
Where another:predicate is an appropriate property of your choosing; if you can't find one, you can always coin something like ons:datasetContents.
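As a rough illustration of the structure in the diagram, here is a minimal Python sketch that models the triples as plain tuples and walks from the catalog entry to the observations. The helper names (`build_graph`, `observations_for_entry`), the `#obs-N` identifiers, and the lookup logic itself are assumptions made for illustration; this is not how PMD is actually implemented, only a way to see why the pmdcat:datasetContents link has to land on the resource the observations point at.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# The "#..." identifiers are the placeholders used in the diagram above.
RDF_TYPE = "rdf:type"


def build_graph(linking_predicate="ons:datasetContents", n_obs=3):
    """Return the set of triples described in the diagram.

    linking_predicate stands in for "another:predicate"; the default is
    just the coined example mentioned in the text.
    """
    triples = {
        ("#data-catalog-entry", linking_predicate, "#dataset"),
        ("#dataset", "dcat:distribution", "#qbDataSet"),
        ("#data-catalog-entry", "pmdcat:datasetContents", "#qbDataSet"),
        ("#qbDataSet", RDF_TYPE, "pmdcat:DataCube"),
        ("#qbDataSet", "dcat:isDistributionOf", "#dataset"),
        ("#qbDataSet", RDF_TYPE, "qb:DataSet"),
        ("#qbDataSet", RDF_TYPE, "dcat:Distribution"),
    }
    # Hypothetical observation identifiers, each linked via qb:dataSet.
    for i in range(1, n_obs + 1):
        triples.add((f"#obs-{i}", "qb:dataSet", "#qbDataSet"))
    return triples


def observations_for_entry(triples, catalog_entry):
    """Follow pmdcat:datasetContents to a resource typed pmdcat:DataCube,
    then collect every observation pointing at it via qb:dataSet."""
    cubes = {
        o
        for s, p, o in triples
        if s == catalog_entry
        and p == "pmdcat:datasetContents"
        and (o, RDF_TYPE, "pmdcat:DataCube") in triples
    }
    return sorted(s for s, p, o in triples if p == "qb:dataSet" and o in cubes)
```

With the graph built as above, `observations_for_entry(build_graph(), "#data-catalog-entry")` finds all of the observations. If pmdcat:datasetContents is pointed at `#dataset` instead, the same lookup comes back empty, which mirrors the "code lists but no observations" symptom.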
Bug Report - Pmdification resulting in empty datasets on PMD
On PMD, new drafts pushed through the pipeline were appearing with all their code-lists intact, but without any actual observational data.
Turns out the culprit is the pmdification step in the pipeline, which uses the csvcubed-pmd library’s pmdify script.
Essentially, a ‘barrier’ is being created within the graph relationship of the dataset's catalog entry. The structure of this relationship has changed due to updates to csvcubed, but the pmdify script has not been brought up to parity.
Pmdify still assigns the pmdcat:DataCube type to the dcat:Dataset, but it needs to be assigned to the dcat:Distribution. Note: do not break code lists! Code lists do not use dcat:Distribution!
So, because of this change in structure, the pmdify script no longer takes everything in: it appears not to reach the cube#Dataset, and hence the observational data.
Below are visual representations of the data structure before the updates, afterwards, and the fixed version we're aiming for.
Old structure diagram
New structure diagram
Fixed structure diagram
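To make the intended change concrete, here is a minimal Python sketch of the re-targeting step, again modelling triples as plain tuples. The function name `retarget_data_cube` and the sample identifiers are hypothetical, and the sketch assumes at most one dcat:distribution per dataset; it does not reproduce the real pmdify code, which works on an actual RDF graph.

```python
RDF_TYPE = "rdf:type"
DATA_CUBE = "pmdcat:DataCube"


def retarget_data_cube(triples):
    """Move the pmdcat:DataCube type and pmdcat:datasetContents links
    from a dcat:Dataset to its dcat:Distribution.

    Resources with no dcat:distribution (for example code lists) are
    left untouched.
    """
    # Map each dataset to its distribution (assumes one per dataset).
    distribution_of = {s: o for s, p, o in triples if p == "dcat:distribution"}
    result = set()
    for s, p, o in triples:
        if p == RDF_TYPE and o == DATA_CUBE and s in distribution_of:
            # Re-assign the DataCube type to the distribution.
            result.add((distribution_of[s], RDF_TYPE, DATA_CUBE))
        elif p == "pmdcat:datasetContents" and o in distribution_of:
            # Point the catalog entry at the distribution instead.
            result.add((s, p, distribution_of[o]))
        else:
            result.add((s, p, o))
    return result
```

The design choice worth noting is that the rewrite is keyed on the presence of a dcat:distribution triple, so anything without one passes through unchanged. That is one simple way to respect the "code lists do not use dcat:Distribution" constraint.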
The Code
We believe the issue lies within the _get_catalog_entry_from_dcat_dataset function in the pmdify script. This section in particular may be of interest, as this is where the code is assigning values to the pmdcat_dataset variable, which may need to be updated.