Review entity (data granule) checks for ISO

jeanetteclark commented 3 years ago

For now, we are refining the FAIR suite so that it evaluates datasets, where a dataset is one or more data granules (entities) described and represented by a metadata record. Several entity-level checks pointed to resource (dataset) level description information in their ISO xpaths. I think that we should remove these xpaths from the checks (see table below, sorry the formatting is not ideal), but I would like to make sure that if there is a place for this entity-level information in ISO, we are capturing it.

Most entity information in ISO seems to be contained in the distribution section. Are there places in this section (or others) for the following information:

entity format (e.g. text/csv)
entity identifier (e.g. urn:uuid:...)
entity name (e.g. samples.csv)

The ISO documents that I've looked at seem to generally not contain these three pieces of information, but I don't think that necessarily means that there isn't a place for it. @tedhabermann If you were documenting a dataset containing two csv tabular data files, each with its own unique identifier and name, where would you put the above information?
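To make the question concrete, here is a minimal sketch of the dataset/entity model described above: one metadata record, two CSV entities, and the three pieces of entity information being asked about. The class names, `missing_fields`, the record id, and the second file name `sites.csv` are hypothetical and only for illustration; they are not part of the FAIR suite or of ISO.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Entity:
    """One data granule; each field is optional because a record may omit it."""
    name: Optional[str] = None        # e.g. samples.csv
    format: Optional[str] = None      # e.g. text/csv
    identifier: Optional[str] = None  # e.g. urn:uuid:...


@dataclass
class Dataset:
    """A dataset: one metadata record describing one or more entities."""
    metadata_id: str
    entities: List[Entity]


def missing_fields(entity: Entity) -> List[str]:
    """Return the names of entity fields that are absent or empty."""
    return [f.name for f in fields(entity) if not getattr(entity, f.name)]


# Hypothetical dataset with two CSV files; the second has no identifier.
dataset = Dataset(
    metadata_id="hypothetical-metadata-record",
    entities=[
        Entity(name="samples.csv", format="text/csv", identifier="urn:uuid:..."),
        Entity(name="sites.csv", format="text/csv"),
    ],
)

# Per-entity report of what an entity-level check would find missing.
report = {e.name: missing_fields(e) for e in dataset.entities}
```

An entity-level check would ideally be answered per entity, as in `report`, rather than once for the whole resource.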

Here is the table mentioned above:

check	xpath	reasoning
entity.format.nonproprietary	/*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format	this seems to point to the resource, not the entity specifically
entity.format.nonproprietary	/*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format/formatSpecificationCitation/CI_Citation/identifier/MD_Identifier/code	points to the resource, not the entity specifically
entity.format.nonproprietary	/*/identificationInfo/MD_DataIdentification/resourceFormat/MD_Format/formatSpecificationCitation/CI_Citation/title	points to the resource, not the entity specifically
entity.identifier.present	//identificationInfo//citation/CI_Citation/identifier/MD_Identifier/code//text()[normalize-space()]	resource identifier and not the entity identifier
entity.identifier.present	//identificationInfo//citation/CI_Citation/identifier/RS_Identifier/code//text()[normalize-space()]	I don't think a reference system is an entity
entity.identifierType.present	//identificationInfo//citation/CI_Citation/identifier/MD_Identifier/codeSpace//text()[normalize-space()]	resource identifier and not the entity identifier
entity.identifierType.present	//identificationInfo//citation/CI_Citation/identifier/MD_Identifier/authority//text()[normalize-space()]	resource identifier and not the entity identifier
entity.name.present	/*/contentInfo/MD_CoverageDescription/attributeDescription/RecordType	entity description, not the entity name
entity.name.present	/*/contentInfo/MI_CoverageDescription/attributeDescription/RecordType	entity description, not the entity name
entity.type.present	/*/metadataScope/MD_MetadataScope/resourceScope/MD_ScopeCode	scope of the resource, not the entity specifically
entity.type.present	/*/hierarchyLevel/MD_ScopeCode	scope of the resource, not the entity specifically
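The problem with the resource-level paths can be illustrated with a small sketch. It assumes a toy, namespace-free record (real ISO documents are namespaced) and uses a simplified version of the `entity.identifier.present` path from the table, without the `text()[normalize-space()]` predicate, which Python's `xml.etree.ElementTree` does not support. The record content and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy, namespace-free stand-in for an ISO record (hypothetical content).
# It carries one resource-level identifier and no per-entity information.
RECORD = """
<MD_Metadata>
  <identificationInfo>
    <MD_DataIdentification>
      <citation>
        <CI_Citation>
          <identifier>
            <MD_Identifier>
              <code>doi:10.0000/example</code>
            </MD_Identifier>
          </identifier>
        </CI_Citation>
      </citation>
    </MD_DataIdentification>
  </identificationInfo>
</MD_Metadata>
"""

# Simplified form of the resource-level path listed in the table.
RESOURCE_ID_PATH = (
    ".//identificationInfo//citation/CI_Citation/identifier/MD_Identifier/code"
)


def resource_identifier_codes(xml_text):
    """Return the non-empty identifier codes matched by the resource-level path."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.findall(RESOURCE_ID_PATH)
        if el.text and el.text.strip()
    ]


codes = resource_identifier_codes(RECORD)
```

Here `codes` holds a single value no matter how many data files the dataset contains, so a check built on this path would report an entity identifier as present even when no entity is individually identified.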

mbjones commented 3 years ago

Relates to https://github.com/NCEAS/metadig-checks/issues/384

jeanetteclark commented 3 years ago

An update: this is done, but it still needs to be tested.

NCEAS / fairdataone #10