NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

submission form should automatically fill in `publisher` information #1091

Open jeanetteclark opened 5 years ago

jeanetteclark commented 5 years ago

When a dataset is submitted to a member node, the publisher information should automatically be inserted into the EML document on submission. This will make our metadata more Accessible according to the FAIR metadata quality suite.

We should include an organization identifier in the userId field. "ROR or GRID or WIKIDATA would be good" - @mbjones

laurenwalker commented 5 years ago

What kind of publisher metadata can we automatically insert? Doesn't that need to be provided by the user? Or should we automatically set it to the Member Node name?

jeanetteclark commented 5 years ago

No, not provided by the user because the publisher is whatever member node they are publishing to.

It goes in eml/dataset/publisher as a responsibleParty, I imagine just with Arctic Data Center as the organization, and one of the above identifiers (although I cannot find what our organization identifier is in any of those systems for the life of me). @mbjones may be able to advise
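
For illustration, here is a minimal sketch of what such a `publisher` entry could look like once serialized. `publisherToEml`, `escapeXml`, and the `userIdDirectory` property are hypothetical names and not part of MetacatUI. The element order (organizationName, onlineUrl, userId) and the `directory` attribute on `userId` follow my reading of the EML ResponsibleParty type and should be checked against the schema.

```javascript
// Hypothetical helpers, not MetacatUI code: serialize a responsible party
// object into an EML <publisher> element as a string.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function publisherToEml(party) {
  const lines = ["<publisher>"];
  lines.push(
    `  <organizationName>${escapeXml(party.organizationName)}</organizationName>`
  );
  // onlineUrl is optional and is written before userId.
  if (party.onlineUrl) {
    lines.push(`  <onlineUrl>${escapeXml(party.onlineUrl)}</onlineUrl>`);
  }
  // Only write userId when the identifier system (directory) is known too.
  if (party.userId && party.userIdDirectory) {
    lines.push(
      `  <userId directory="${escapeXml(party.userIdDirectory)}">` +
        `${escapeXml(party.userId)}</userId>`
    );
  }
  lines.push("</publisher>");
  return lines.join("\n");
}

// Usage: the organization identifier is left out here because it is not
// known at this point in the discussion.
console.log(
  publisherToEml({
    organizationName: "Arctic Data Center",
    onlineUrl: "https://arcticdata.io",
  })
);
```

Leaving `userId` optional means the element can be written now and the identifier added once the organization is registered in one of those systems.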

mbjones commented 5 years ago

@laurenwalker ADC and other repositories may not be registered in those systems, but we need to get them there. I asked @gothub a few months ago to look into getting that rolling, and we ran into some timing issues with ROR. But GRID and Wikidata should be possible. Once they are in GRID, they should automatically end up in ROR.

jeanetteclark commented 4 years ago

Our Wikidata identifier is: Q77285095

jeanetteclark commented 4 years ago

I noted today that users can enter their own publisher information. It seems to me that they shouldn't be able to do this - the field is usually misinterpreted anyway. So as part of this issue I think we should consider removing that ability from the UI

mbjones commented 4 years ago

Agreed, or at least make it a config option as to whether it shows up.

robyngit commented 2 years ago

Here is a summary of the tasks required for this issue, as I understand it:

option 1: we add a new config option that is specifically used to provide the Publisher information, e.g.

/**
 * Information about the repository that will be automatically inserted into
 * new EML metadata documents as the Publisher. This object can set any of the
 * fields that are available in the Responsible Party EML type, see
 * {@link https://github.com/NCEAS/eml/blob/main/img/eml-party.png}.
 * @type {object}
 */
publisher: {
  organizationName: 'Arctic Data Center',
  userId: 'Q77285095',
  onlineUrl: 'https://arcticdata.io'
}

option 2: we could pull this information from other configuration options. organizationName = repositoryName, onlineUrl = baseUrl. We would just need to add a repositoryId for the userId, and an automaticallyFillPublisher (or similar) boolean option.
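
A rough sketch of how option 2 might work, assuming the option names mentioned above (`repositoryName`, `baseUrl`, `repositoryId`, `automaticallyFillPublisher`). The `derivePublisher` function is hypothetical and only shows the mapping. It returns the same shape as the `publisher` object in option 1.

```javascript
// Hypothetical sketch of option 2: build the publisher object from
// existing config options instead of a dedicated `publisher` option.
function derivePublisher(config) {
  if (!config.automaticallyFillPublisher) {
    return null; // feature switched off: do not pre-fill anything
  }
  const publisher = {
    organizationName: config.repositoryName,
    onlineUrl: config.baseUrl,
  };
  // repositoryId is the proposed new option; skip userId when it is unset.
  if (config.repositoryId) {
    publisher.userId = config.repositoryId;
  }
  return publisher;
}

const publisher = derivePublisher({
  automaticallyFillPublisher: true,
  repositoryName: "Arctic Data Center",
  baseUrl: "https://arcticdata.io",
  repositoryId: "Q77285095",
});
console.log(publisher);
```

The trade-off is fewer options to maintain, but repositories whose publisher name or URL differs from `repositoryName` or `baseUrl` could not override them.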

Questions

1. Do we add this information:
   a. to new EML documents only?
   b. also to existing EML documents that have no publisher when they are edited?

2. When the Publisher information is pre-filled, should we display this in the editor but make the fields un-editable? Or just keep it hidden behind the scenes?

@mbjones and @jeanetteclark, what do you think of this plan and do you have any feedback on these two questions? Thanks!

dvirlar2 commented 1 year ago

To answer your questions @robyngit:

1. Do we add this information:
   a. to new EML documents only?
   b. also to existing EML documents that have no publisher when they are edited?

2. When the Publisher information is pre-filled, should we display this in the editor but make the fields un-editable? Or just keep it hidden behind the scenes?

Answers: 1a & 1b. We add publisher information to all new EML documents, as well as to existing EML documents that did not have that info listed when they are edited. So if someone were to edit a published dataset, we're still checking that the publisher information is there when we curate the new changes. If it's not, we'll add it.
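
The check described in this answer could be sketched roughly as follows. This is only an illustration: `ensurePublisher` is a hypothetical name, and `dataset` is a plain object standing in for the EML model, whose real API in MetacatUI may differ.

```javascript
// Hypothetical sketch: add the repository's publisher to a dataset only
// when no publisher is present. Applies to new and edited documents alike.
function ensurePublisher(dataset, repositoryPublisher) {
  const existing = dataset.publisher;
  // A publisher counts as present if it names an organization, person,
  // or position.
  const hasPublisher = Boolean(
    existing &&
      (existing.organizationName ||
        existing.individualName ||
        existing.positionName)
  );
  if (hasPublisher || !repositoryPublisher) {
    return dataset; // nothing to add
  }
  // Return a copy so the caller's object is not mutated.
  return { ...dataset, publisher: { ...repositoryPublisher } };
}

const repoPublisher = {
  organizationName: "Arctic Data Center",
  onlineUrl: "https://arcticdata.io",
};

// A new document and an older document without a publisher both get it.
console.log(ensurePublisher({ title: "New dataset" }, repoPublisher));
console.log(ensurePublisher({ title: "Old dataset", publisher: {} }, repoPublisher));
```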

2. I think keeping that info hidden behind the scenes is the better route here. I can see us getting questions about why we include a non-editable response in the editor 😅 I think it would make sense(!), but I'd like to prevent those questions if possible 🙂

mbjones commented 11 months ago

Here's our current metadata entry for the ADC, listing our various identifiers and other info from our schema.org entry on our home page:

ADC schema.org entry

```json
{
  "@context": [
    "https://schema.org/"
  ],
  "@type": [
    "Service",
    "Organization",
    "ResearchProject"
  ],
  "@id": "https://arcticdata.io",
  "name": "Arctic Data Center",
  "legalName": "Arctic Data Center",
  "alternateName": "ADC",
  "logo": "https://arcticdata.io/wp-content/themes/aurora/library/images/logo_.png",
  "url": "https://arcticdata.io",
  "description": "The Arctic Data Center is the primary data and software repository for the Arctic section of NSF Polar Programs.",
  "identifier": [
    {
      "@type": "PropertyValue",
      "name": "ROR:055hrh286",
      "propertyID": "https://registry.identifiers.org/registry/ror",
      "value": "ror:055hrh286",
      "url": "https://ror.org/055hrh286"
    },
    {
      "@type": "PropertyValue",
      "name": "Re3data DOI: 10.17616/R37P98",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "doi:10.17616/R37P98",
      "url": "https://doi.org/10.17616/R37P98"
    },
    {
      "@type": "PropertyValue",
      "name": "wikidata:Q77285095",
      "propertyID": "https://registry.identifiers.org/registry/wikidata",
      "value": "wikidata:Q77285095",
      "url": "https://www.wikidata.org/wiki/Q77285095"
    },
    {
      "@type": "PropertyValue",
      "name": "grid:grid.507882.0",
      "propertyID": "https://registry.identifiers.org/registry/grid",
      "value": "grid:grid.507882.0",
      "url": "https://www.grid.ac/institutes/grid.507882.0"
    }
  ],
  "sameAs": [
    "https://ror.org/055hrh286",
    "https://www.grid.ac/institutes/grid.507882.0",
    "https://www.wikidata.org/wiki/Q77285095",
    "https://www.re3data.org/repository/r3d100011973",
    "http://doi.org/10.17616/R37P98",
    "urn:node:ARCTIC"
  ],
  "category": [
    "Arctic Research"
  ],
  "provider": {
    "@id": "https://arcticdata.io"
  },
  "contactPoint": {
    "@type": "ContactPoint",
    "name": "Support",
    "email": "support@arcticdata.io",
    "url": "https://arcticdata.io/support/",
    "contactType": "customer support"
  },
  "foundingDate": "2016-02-01",
  "funder": {
    "@type": "Organization",
    "@id": "https://doi.org/10.13039/100000087",
    "legalName": "Office of Polar Programs",
    "alternateName": "OPP",
    "url": "https://www.nsf.gov/div/index.jsp?div=OPP",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "doi:10.13039/100000087",
      "url": "https://doi.org/10.13039/100000087"
    },
    "parentOrganization": {
      "@type": "Organization",
      "@id": "https://doi.org/10.13039/100000085",
      "legalName": "Directorate for Geosciences",
      "alternateName": "NSF-GEO",
      "url": "http://www.nsf.gov",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": "10.13039/100000085",
        "url": "https://doi.org/10.13039/100000085"
      },
      "parentOrganization": {
        "@type": "Organization",
        "@id": "https://doi.org/10.13039/100000001",
        "legalName": "National Science Foundation",
        "alternateName": "NSF",
        "url": "http://www.nsf.gov",
        "identifier": {
          "@type": "PropertyValue",
          "propertyID": "https://registry.identifiers.org/registry/doi",
          "value": "10.13039/100000001",
          "url": "https://doi.org/10.13039/100000001"
        }
      }
    }
  },
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Arctic Data Center Data Catalog",
    "itemListElement": [
      {
        "@type": "DataCatalog",
        "@id": "https://arcticdata.io/catalog/data",
        "name": "Arctic Data Center Catalog",
        "audience": {
          "@type": "Audience",
          "audienceType": "public",
          "name": "General Public"
        }
      }
    ]
  },
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1021 Anacapa Street",
    "addressLocality": "Santa Barbara",
    "addressRegion": "CA",
    "postalCode": "93101",
    "addressCountry": "US"
  },
  "parentOrganization": {
    "@type": "Organization",
    "@id": "https://ror.org/0146z4r19",
    "legalName": "National Center for Ecological Analysis and Synthesis",
    "alternateName": "NCEAS",
    "url": "http://nceas.ucsb.edu",
    "identifier": {
      "@type": "PropertyValue",
      "propertyID": "https://registry.identifiers.org/registry/ror",
      "value": "ror:0146z4r19",
      "url": "https://ror.org/0146z4r19"
    },
    "parentOrganization": {
      "@type": "Organization",
      "@id": "https://ror.org/02t274463",
      "legalName": "University of California, Santa Barbara",
      "alternateName": "UCSB",
      "url": "http://ucsb.edu",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/ror",
        "value": "ror:02t274463",
        "url": "https://ror.org/02t274463"
      }
    }
  },
  "inLanguage": "en-US",
  "addressCountry": "US",
  "license": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ],
  "credentialCategory": "CoreTrustSeal",
  "termsOfService": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ],
  "ex:persistentIdentifiers": [
    "https://registry.identifiers.org/registry/doi",
    "https://registry.identifiers.org/registry/orcid",
    "https://registry.identifiers.org/registry/ror",
    "https://registry.identifiers.org/registry/rrid",
    "https://registry.identifiers.org/registry/d1id",
    "https://registry.identifiers.org/registry/ark"
  ],
  "ex:machineInteroperability": [
    "DataONE",
    "OAI-PMH",
    "DataCite",
    "REST",
    "SPARQL"
  ],
  "ex:metadata": [
    "EML",
    "ISO-19115",
    "DDI",
    "Dublin Core",
    "FGDC CSDGM",
    "METS",
    "DataCite",
    "OAI-ORE",
    "other"
  ],
  "ex:curation": "https://arcticdata.io/submit/",
  "ex:preservationPolicy": "https://arcticdata.io/preservation/",
  "ex:termsOfAccess": [
    "http://spdx.org/licenses/CC0-1.0",
    "https://spdx.org/licenses/CC-BY-4.0"
  ]
}
```

I'm guessing the ROR is the best identifier to use these days.
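
If the identifier were read from a schema.org entry like this one rather than hard-coded, a preference order could be applied along these lines. This is a sketch only: `pickIdentifier` and its default order (ROR, then GRID, then Wikidata) are assumptions for illustration, and it relies on the `value` fields using the `scheme:id` form shown in the entry.

```javascript
// Hypothetical helper: pick the first identifier whose `value` starts with
// a preferred scheme prefix such as "ror:".
function pickIdentifier(identifiers, preferredSchemes = ["ror", "grid", "wikidata"]) {
  for (const scheme of preferredSchemes) {
    const match = identifiers.find(
      (id) =>
        typeof id.value === "string" &&
        id.value.toLowerCase().startsWith(scheme + ":")
    );
    if (match) {
      return { scheme, value: match.value, url: match.url };
    }
  }
  return null; // none of the preferred schemes is present
}

// Two of the identifiers from the ADC entry, deliberately out of order.
const adcIdentifiers = [
  { value: "wikidata:Q77285095", url: "https://www.wikidata.org/wiki/Q77285095" },
  { value: "ror:055hrh286", url: "https://ror.org/055hrh286" },
];
console.log(pickIdentifier(adcIdentifiers).url); // prints "https://ror.org/055hrh286"
```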