NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

System is assigned incorrectly in the root element #1371

Open jeanetteclark opened 4 years ago

jeanetteclark commented 4 years ago

The system attribute is automatically filled in as "knb" in the root element of the document.

For example, from this test document

<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.1"
    xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 eml.xsd"
    packageId="urn:uuid:374a77fd-d350-448f-849e-113d34bcc4f8"
    system="knb">

To Reproduce

  1. Create a test dataset on test.arcticdata.io
  2. Examine root element

Expected behavior

The system is supposed to indicate the scope over which the identifier is unique. Here is a list of identifier schemes and their systems:


amoeba commented 4 years ago

Can confirm a new document has a system of "knb" on test.arcticdata.io. This is controlled by EML211.createXML():

createXML: function() {
  ...
  eml.attr("system", "knb"); // We could make this configurable at some point
  ...

See: https://github.com/NCEAS/metacatui/blob/master/src/js/models/metadata/eml211/EML211.js#L1903

Can also confirm the editor doesn't mess with system when it's set on a document we're updating so, depending on what's changed here, we may want to tweak that too.
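A rough sketch of the behavior described above, in case it helps the discussion: an existing system value is left alone on update, and only new documents get a default. The function name resolveSystemAttribute and the configuredDefault parameter are made up for illustration and are not part of MetacatUI.

```javascript
// Hypothetical helper illustrating the behavior described above; not MetacatUI code.
// existingSystem: the system value already on the document, if any.
// configuredDefault: an assumed, repository-level setting (does not exist today).
function resolveSystemAttribute(existingSystem, configuredDefault) {
  // A document that already declares a system keeps it unchanged.
  if (typeof existingSystem === "string" && existingSystem.length > 0) {
    return existingSystem;
  }
  // A new document gets the configured default, or "knb" if none is configured.
  return configuredDefault || "knb";
}

// An updated document keeps its value; a new one picks up the default.
console.log(resolveSystemAttribute("knb", "https://dataone.org/datasets/"));
console.log(resolveSystemAttribute(undefined, "https://dataone.org/datasets/"));
```

Whether updates should keep preserving the old value is exactly the open question here, so the first branch is the part that might change.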

It strikes me that we haven't been consistent on how we set system in the past, system is a bit of a historical vestige, and I'm not sure if we have a standard approach right now. @laurenwalker, do you remember if we have one here?

laurenwalker commented 4 years ago

No, I'm not aware of a standard approach. According to the EML spec, a URL to the data management system is suggested: https://eml.ecoinformatics.org/schema/eml-resource_xsd.html#SystemType

So I think we could use the repository base URL. (We could use the AppModel baseUrl attribute)
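As a minimal sketch of that idea, assuming the base URL is available as a plain string: the helper name chooseSystem and the fallback argument are hypothetical, and the URL below is only an example value, not read from AppModel.

```javascript
// Hypothetical helper; not part of MetacatUI.
// Returns the value to write into the EML root element's system attribute.
function chooseSystem(baseUrl, fallback) {
  // Use the fallback when no usable base URL is configured.
  if (typeof baseUrl !== "string" || baseUrl.trim() === "") {
    return fallback;
  }
  return baseUrl.trim();
}

// Example value only; a real deployment would pass its own configured base URL.
const exampleSystem = chooseSystem("https://test.arcticdata.io", "knb");
console.log(exampleSystem);
```

In createXML() the hard-coded "knb" could then be swapped for whatever this returns, with "knb" kept as the fallback for repositories that have no base URL set.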

mbjones commented 4 years ago

"knb" has been our historical system since when the 'Network' in the KNB was the emphasis (i.e., as the precursor to DataONE). It's also why all of the LTER identifiers have 'knb' in them. We've recognized for a long time that a bare 'knb' word is insufficient to scope things well, and that a URI would be better. There's no agreed-upon vocabulary for system designators, and in reality the system attribute has little meaning. @jeanetteclark and I discussed changing it to something a bit more meaningful, either a DataONE URI for DataONE-scoped identifiers (e.g., in which resolve() would work), or possibly a DOI URI where that resolve service would work.

amoeba commented 4 years ago

I think I'd put some votes (if I get multiple) for any of

  1. @laurenwalker's repo-specific system value based on baseUrl
  2. <eml packageId="foo.1.1" system="dataone" ... />...</eml>
  3. <eml packageId="foo.1.1" system="dataone.org" ... />...</eml>
  4. <eml packageId="foo.1.1" system="https://dataone.org/datasets/" ... />...</eml>

mbjones commented 4 years ago

I like number 4 as it both provides a URI that leads to dataset collections, and combined with the identifier should create a resolvable URI. For Metacat systems that are not (and will not be) connected to DataONE as members, number 1 would make sense.
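To illustrate the "system plus identifier" combination, here is a small sketch. The function name datasetUri is hypothetical, and percent-encoding the identifier is an assumption on my part; whether the resulting URI actually resolves depends on the service behind the system URL.

```javascript
// Hypothetical helper; not part of MetacatUI or any DataONE library.
// Joins a URI-style system value with a packageId to form a candidate URI.
function datasetUri(system, packageId) {
  // Make sure exactly one slash separates the two parts.
  const base = system.endsWith("/") ? system : system + "/";
  // Assumption: the identifier is percent-encoded so characters such as ":" are URL-safe.
  return base + encodeURIComponent(packageId);
}

console.log(datasetUri("https://dataone.org/datasets/", "foo.1.1"));
// prints "https://dataone.org/datasets/foo.1.1"
```

With a bare system such as "knb" or "dataone" (options 2 and 3) there is nothing to join the identifier onto, which is the practical difference from option 4.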