pangaea-data-publisher / sensor-vocab


Required information (metadata properties) to verify if an institution exists or existed #4

Open huberrob opened 1 year ago

huberrob commented 1 year ago

Along with the sensor/instrument vocabulary, we will most probably also need some information about institutions/organisations that are not included in existing vocabularies. In particular, we need a short list of metadata properties that must be present in order to verify whether a given organisation/institution exists or ever existed, e.g. to indicate a manufacturer.

Proposal for metadata properties:

Mandatory: Name, Website (if any), Country, Identifier (if any)

Recommended: Acronym

Optional: Address
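As a rough illustration, the proposed properties could be captured in a small record type. This is only a sketch: the class name `OrganisationRecord`, the field names and the `missing_mandatory` helper are assumptions made for the example, not part of any existing vocabulary tooling. Properties marked "if any" are modelled as optional values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OrganisationRecord:
    """Hypothetical record holding the proposed metadata properties."""

    # Mandatory properties
    name: str
    country: str
    website: Optional[str] = None      # "if any": may be absent
    identifier: Optional[str] = None   # "if any": e.g. a ROR or ISNI identifier
    # Recommended property
    acronym: Optional[str] = None
    # Optional property
    address: Optional[str] = None

    def missing_mandatory(self) -> List[str]:
        """Return the names of mandatory properties that are empty."""
        missing = []
        if not self.name.strip():
            missing.append("name")
        if not self.country.strip():
            missing.append("country")
        return missing
```

A record with a name and a country passes the check even without a website or identifier, which matches the "if any" wording of the proposal.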

Preferably, an existing vocabulary and its associated identifiers should be used, such as isni.org or ror.org or the

nanselm commented 1 year ago

just a reference with the ROR identifier is preferred, otherwise we start a second project.... ^^

https://en.wikipedia.org/wiki/Research_Organization_Registry

huberrob commented 1 year ago

Maybe we should follow this sequence:

if a ROR exists -> take the ROR

otherwise: if an ISNI exists -> take the ISNI

otherwise: if a Wikidata entry exists -> take the Wikidata ID

otherwise: create a Wikidata entry and take this ID (and hope it will sometime end up as a ROR)
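The sequence above could be sketched roughly as follows. The function name `pick_identifier` and the dictionary keys `"ror"`, `"isni"` and `"wikidata"` are assumptions made for this example. The sketch only selects among identifiers that are already known; it does not query any registry and does not attempt the last step (creating a new entry), so it simply returns `None` when nothing is available.

```python
from typing import Dict, Optional, Tuple

# Preference order taken from the sequence above (keys are hypothetical).
PREFERENCE = ("ror", "isni", "wikidata")


def pick_identifier(known_ids: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (scheme, identifier) for the most preferred scheme present.

    Returns None if no identifier of any scheme is known; what happens
    then is left to the caller.
    """
    for scheme in PREFERENCE:
        value = known_ids.get(scheme)
        if value:  # skip missing or empty entries
            return scheme, value
    return None
```

Keeping the preference order in a single tuple means the order can be changed in one place if the sequence is revised.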

nanselm commented 1 year ago

It is not our business to create entries for institutions at Wikidata. If no ROR/ISNI/Wikidata entry exists, we should stop (i.e. make an existing entry a prerequisite for finding its way into 'our' catalogue).

huberrob commented 1 year ago

Ok, so then we leave it at name, country, website in case no entry exists in ROR. One other list of organisations which is relevant for marine things is here: http://vocab.nerc.ac.uk/collection/B75/current/

So we have:

Mandatory:

Recommended:

Optional:

dkottmeier commented 10 months ago

We also need information on the period during which a company/organisation existed under the specified name. Company names change regularly due to sales etc. Ideally, the manufacturer's name in the metadata should be consistent with the company's name at the time the instrument was manufactured, i.e. if a manufacturer called Meyer was founded in 1900 and produced an instrument in 1999, the manufacturer's name in the instrument metadata should be Meyer. If this company Meyer was sold to Mueller in 2004, the same instrument manufactured in 2005 should have the manufacturer's name Meyer as metadata. In many cases, it will be unknown exactly when an instrument was manufactured; in that case, the name should match the name written on the instrument.
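One possible way to record "from when to when" a name was valid is a list of name periods per organisation, looked up by manufacture date. The following is a minimal sketch under stated assumptions: `NamePeriod`, `name_at` and all field names are hypothetical, the end date is treated as exclusive, and falling back to the label on the instrument when no period matches is a choice made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NamePeriod:
    """A name an organisation carried during a given period (hypothetical type)."""

    name: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the name is still in use


def name_at(
    periods: List[NamePeriod],
    manufactured: Optional[date],
    label_on_instrument: str,
) -> str:
    """Return the organisation name valid at the manufacture date.

    If the manufacture date is unknown, or no recorded period covers it,
    fall back to the name written on the instrument.
    """
    if manufactured is None:
        return label_on_instrument
    for period in periods:
        started = period.valid_from <= manufactured
        not_ended = period.valid_to is None or manufactured < period.valid_to
        if started and not_ended:
            return period.name
    return label_on_instrument
```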

Many organizations have hierarchically structured organizational units. In this case, rules should exist for which unit is represented in the metadata. If companies have subsidiaries, I would suggest selecting the finest level of granularity available (i.e. the sub-company name). For research institutions, I would recommend following ROR and choosing the parent organization.

For some companies, alternative names exist, e.g. "Thermo Fisher Scientific" is often abbreviated to "Thermo Fisher" or could also be written "Thermo Fisher Scientific Inc.". Therefore, we should allow alternative labels, especially if no identifier exists.
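To illustrate how alternative labels might be used for matching, the sketch below maps every known spelling to one preferred label. The helper names (`normalise`, `build_index`, `resolve`) are invented for this example, and treating "Thermo Fisher Scientific" as the preferred label is an assumption. The normalisation is deliberately simple (case, full stops, commas, whitespace) and would not catch every variant.

```python
from typing import Dict, List, Optional


def normalise(label: str) -> str:
    """Lower-case, drop full stops and commas, and collapse whitespace."""
    cleaned = label.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def build_index(terms: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each normalised preferred or alternative label to its preferred label."""
    index: Dict[str, str] = {}
    for preferred, alternatives in terms.items():
        for label in [preferred, *alternatives]:
            index[normalise(label)] = preferred
    return index


def resolve(label: str, index: Dict[str, str]) -> Optional[str]:
    """Return the preferred label for a given spelling, or None if unknown."""
    return index.get(normalise(label))


# Example data based on the names mentioned above.
TERMS = {
    "Thermo Fisher Scientific": ["Thermo Fisher", "Thermo Fisher Scientific Inc."],
}
INDEX = build_index(TERMS)
```

With this index, `resolve("Thermo Fisher", INDEX)` returns the preferred label, while an unknown name returns `None` and would need manual review.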

In addition to that, I wonder whether it would be better if institutions/organisations were not free-text fields but were stored as independent "organisation" terms that can be added to the instrument terms, e.g. from a drop-down field. That would lead to better harmonization, i.e. all instruments deriving from the same company would carry the exact same metadata. When creating the "organisation" terms, metadata from other persistent identifiers such as ROR could potentially be harvested.
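The idea of independent "organisation" terms could look roughly like this. It is a sketch only: the types `OrganisationTerm` and `InstrumentTerm`, the `term_id` scheme and the example values are all hypothetical, and harvesting from ROR is not shown.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OrganisationTerm:
    """An organisation stored once as its own term (hypothetical type)."""

    term_id: str   # hypothetical local term identifier
    name: str
    country: str


@dataclass
class InstrumentTerm:
    """An instrument that references its manufacturer by term ID, not free text."""

    label: str
    manufacturer_id: str


def manufacturer_of(
    instrument: InstrumentTerm,
    organisations: Dict[str, OrganisationTerm],
) -> OrganisationTerm:
    """Look up the organisation term an instrument refers to."""
    return organisations[instrument.manufacturer_id]


# Two instruments pointing at the same organisation term share its metadata.
ORGANISATIONS = {
    "org:0001": OrganisationTerm("org:0001", "Example Instruments", "DE"),
}
SENSOR_A = InstrumentTerm("Example sensor A", "org:0001")
SENSOR_B = InstrumentTerm("Example sensor B", "org:0001")
```

Because both instruments hold only a reference, a correction to the organisation term would apply to every instrument that points at it.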