Closed gothub closed 2 years ago
For background info on this check, see https://github.com/NCEAS/metadig-checks/issues/23
Response for failed check: ESS-DIVE recommends the use of non-proprietary file formats where possible. Review the [name file types included] file types included in your dataset and consider changing them to non-proprietary formats.
@emilyarobles Other repositories use this check as well. Can we make the response language agnostic so it can be used across systems?
ESS-DIVE may have other formats to add to the known formats list
@mbjones
After viewing ESS-DIVE metadata examples, I noticed that metacatui inserts the media type into the EML element `/eml/dataset/otherEntity/entityType` for data objects added via the editor, for example:
<otherEntity id="urn-uuid-86f9dbef-f12f-4802-99f7-b3d05368b5a3">
<entityName>DataONE_FAIR_Quality_Suite.csv</entityName>
<entityDescription>this is a list of d1 checks</entityDescription>
<entityType>text/csv</entityType>
</otherEntity>
The `entity.format.nonproprietary` check only inspects the EML elements selected by this XPath:
`/eml/dataset/*/physical/dataFormat/externallyDefinedFormat/formatName/`
Should this check also inspect `/eml/dataset/otherEntity/entityType`?
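To make the gap concrete, here is a minimal sketch (not the check's actual implementation) that runs both paths against the example entity above using Python's standard `xml.etree.ElementTree`. The namespace-free `<eml>` wrapper is an assumption made for illustration; real EML documents use a namespaced root element, and the paths are written relative to that root.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free EML fragment (an assumption for illustration).
SAMPLE = """
<eml>
  <dataset>
    <otherEntity id="urn-uuid-86f9dbef-f12f-4802-99f7-b3d05368b5a3">
      <entityName>DataONE_FAIR_Quality_Suite.csv</entityName>
      <entityDescription>this is a list of d1 checks</entityDescription>
      <entityType>text/csv</entityType>
    </otherEntity>
  </dataset>
</eml>
"""

root = ET.fromstring(SAMPLE)

# Path the check currently uses, written relative to the <eml> root.
format_names = [
    el.text
    for el in root.findall(
        "./dataset/*/physical/dataFormat/externallyDefinedFormat/formatName"
    )
]

# Path where the editor places the media type.
entity_types = [el.text for el in root.findall("./dataset/otherEntity/entityType")]

print(format_names)  # [] because the entity has no <physical> section
print(entity_types)  # ['text/csv']
```

For an entity like this one, the current selection finds nothing to evaluate even though a media type is present in `entityType`.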
Yeah, that check algorithm sounds possibly incomplete, although it is debatable whether `otherEntity/entityType` should be used. For certain I think:
1) `//physical/dataFormat/textFormat` are all non-proprietary text formats
2) `//physical/dataFormat/binaryRasterFormat` are all non-proprietary BIP and BIL formats
3) `//physical/dataFormat/externallyDefinedFormat` may be proprietary or non-proprietary depending on the value found in `./formatName`
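The three cases above can be sketched as a small classifier. This is only an illustration under stated assumptions: `classify_data_format` and `NONPROPRIETARY_NAMES` are hypothetical names, the format list is a placeholder rather than the check's real known-formats list, and the sample document is namespace-free for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder list; NOT the check's real known-formats list.
NONPROPRIETARY_NAMES = {"text/csv", "text/plain", "image/png"}


def classify_data_format(data_format):
    """Classify one <dataFormat> element according to the three cases above."""
    # Case 1: textFormat is always a non-proprietary text format.
    if data_format.find("textFormat") is not None:
        return "non-proprietary"
    # Case 2: binaryRasterFormat covers non-proprietary BIP and BIL formats.
    if data_format.find("binaryRasterFormat") is not None:
        return "non-proprietary"
    # Case 3: externallyDefinedFormat depends on the value of formatName.
    name = data_format.findtext("externallyDefinedFormat/formatName")
    if name is None:
        return "unknown"
    if name.strip().lower() in NONPROPRIETARY_NAMES:
        return "non-proprietary"
    return "possibly proprietary"


SAMPLE = """
<eml>
  <dataset>
    <dataTable><physical><dataFormat><textFormat/></dataFormat></physical></dataTable>
    <spatialRaster><physical><dataFormat><binaryRasterFormat/></dataFormat></physical></spatialRaster>
    <otherEntity><physical><dataFormat>
      <externallyDefinedFormat><formatName>application/vnd.ms-excel</formatName></externallyDefinedFormat>
    </dataFormat></physical></otherEntity>
  </dataset>
</eml>
"""

root = ET.fromstring(SAMPLE)
results = [classify_data_format(df) for df in root.iter("dataFormat")]
print(results)
```

Entities without a `physical` section never reach this classifier, which is the situation raised in the question above.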
EML's `otherEntity` type can have a `physical` section and therefore could be used to describe types as above. It also has the `//otherEntity/entityType` field, which is defined as:
> The entityType field contains the name of the entity's type. The entity's type is typically the name of the type of data represented in the entity, such as "photograph". This field is used only if this is an 'other' entity and you want to specify the kind of "other" entity this is.
Note that the example for this field is a value like "photograph" or "Photograph" that is uncontrolled and is meant to qualify the "otherness" of `otherEntity`. And it is optional. So, while metacatUI seems to put a value there, I think the right location for controlled entity format information is in the `//physical/dataFormat` section as described above. That said, if we've been using it consistently for mime-type info, it might be something we should discuss. It may have been used for mimeType info because `//textFormat` doesn't have a mime type field AFAICT.
@laurenwalker what are your thoughts on using `//otherEntity/entityType` for metadata assessment?
ESS-DIVE has decided to use [entity.type.nonproprietary](https://github.com/NCEAS/metadig-checks/issues/436) instead of this check.
Description
Check if each entity format is non-proprietary
Priority
Choose a priority for the FAIR suite (Required or Optional)
FAIR: Required
Issues
Procedure