NCEAS / metadig

Approaches and tools for Metadata Improvement and Guidance.
Apache License 2.0

USGSCSAS metadata schema validation? #36

Closed scgordon closed 8 years ago

scgordon commented 9 years ago

It appears that USGSCSAS is gaming the DataONE system in the same way that ORNLDAAC does, and for the same reason. USGSCSAS CSDGM...1998 XML files do not declare a namespace or schema, which prevents DataONE's system from checking the XML against the schema for the dialect they declare. This holds true for their BDP records as well. USGSCSAS also uses many nonstandard fields. Because they do not share the schema they use, it is harder to understand what they are trying to share.
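To make the problem concrete, here is a minimal sketch of how one could test whether a record declares anything a validator could use. It relies only on Python's standard `xml.etree.ElementTree`; the function names `schema_declarations` and `is_checkable` are hypothetical and are not part of DataONE or metadig.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, used for the schemaLocation attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_declarations(xml_text):
    """Return the namespace and schema hints declared on the root element.

    Hypothetical helper for illustration; it only inspects the root element.
    """
    root = ET.fromstring(xml_text)
    # ElementTree stores a namespaced tag as "{uri}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else None
    return {
        "namespace": namespace,
        "schemaLocation": root.get(f"{{{XSI}}}schemaLocation"),
        "noNamespaceSchemaLocation": root.get(f"{{{XSI}}}noNamespaceSchemaLocation"),
    }


def is_checkable(xml_text):
    """True if the record declares a namespace or a schema location."""
    return any(schema_declarations(xml_text).values())
```

A record whose root element is a bare `<metadata>` with no namespace and no `xsi` attributes returns `False` from `is_checkable`, which is the situation described above: there is nothing in the file to validate against.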

mbjones commented 9 years ago

We discussed this last week at DataONE, and concluded that USGS (and others) would like DataONE to further relax our enforcement of schemas. So, in the future I'm guessing we can expect even lower adherence to declared schemas. The way USGS omitted the schema declaration here is our current preferred approach for a metadata document that doesn't precisely follow a schema, as all bets are off once people start diverging from the schema. Our quality tests should be very clear about adherence, and try to indicate as best as we are able how people could improve their records to more closely match schemas for consistency.

scgordon commented 9 years ago

I understand that member nodes might want relaxed schema enforcement, particularly when they have done so much development on a markup language that is not publicly standardized. I think that perspective is a dangerous one when it comes to ensuring quality metadata and data packages across a sharing and discovery platform like DataONE.

Perhaps the standard that metadata is attributed to should be a function of the namespace or schema declaration in the records, and not a question that member nodes can supply the answer to. If an organization's metadata omits namespaces and schema references, then no standard should be attributed to the collection. I wonder if organizations would be more willing to provide schema validation for their collections if noncompliant organizations got lumped into the "Metadata" metadata section. Are there any member nodes not using a variant of CSDGM that want reduced schema enforcement?
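As a rough sketch of that rule: the standard is derived only from what the record itself declares, and anything without a recognizable declaration falls into the generic "Metadata" bucket. The `KNOWN_MARKERS` table and the `attributed_standard` function are hypothetical and purely illustrative; they are not how DataONE assigns formats.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Illustrative lookup only (an assumption, not a DataONE registry):
# substrings of a namespace or schema location mapped to a standard name.
KNOWN_MARKERS = {
    "eml://ecoinformatics.org/eml-": "EML",
    "http://www.isotc211.org/2005/gmd": "ISO 19139",
    "fgdc-std-001-1998": "FGDC CSDGM",
}


def attributed_standard(xml_text):
    """Name a standard from the record's own declarations, else "Metadata"."""
    root = ET.fromstring(xml_text)
    declared = []
    # A namespaced root tag looks like "{uri}localname" in ElementTree.
    if root.tag.startswith("{"):
        declared.append(root.tag[1:].split("}", 1)[0])
    for attr in ("schemaLocation", "noNamespaceSchemaLocation"):
        value = root.get(f"{{{XSI}}}{attr}")
        if value:
            declared.append(value)
    for text in declared:
        for marker, standard in KNOWN_MARKERS.items():
            if marker in text:
                return standard
    # No declaration, or none we recognize: attribute no standard.
    return "Metadata"
```

Under this sketch, what a member node says about its collection plays no part: a record with no namespace and no schema reference is always reported as plain "Metadata".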

Maybe the direction that organizations should be encouraged to go in is to keep an up-to-date schema or use a standard. It might be worthwhile to see whether reuse rates (or at least downloads) of data packages with standardized metadata are higher than those of packages with unstandardized metadata.