FAIRsFAIR / FAIRSemantics

MIT License

P-Rec. 1: Globally Unique, Persistent and Resolvable Identifiers must be used for Semantic Artefacts, their content (terms/ concepts/ classes and relations), and their versions. (D2.5) #28

Open GCoen1 opened 3 years ago

GCoen1 commented 3 years ago

Semantic artefacts are typically structured text files. They are de facto digital objects and should be unambiguously identified by globally unique, persistent and resolvable identifiers (GUPRIs). In the context of a web of FAIR data, these identifiers should be resolvable and support the retrieval of both the semantic artefact itself and its metadata (see Rec. 2 regarding metadata). As shown in fig. 1, semantic artefacts are composite digital objects requiring at least three levels of identifiers: one for the semantic artefact itself, one for its content and one for the metadata (including both the global metadata and the metadata associated with the content). The latter is described in the following recommendation (Rec. 2). Finally, semantic artefacts are living digital objects by nature, evolving over time. Each version of a semantic artefact should be uniquely identifiable, allowing access to the latest version by default while also providing access to previous versions still in use in existing information systems.

As discussed in the introduction, these identifiers should apply to the semantic artefact but also to its content. Indeed, semantic artefacts can be considered as collections of concepts and relations (datasets). Therefore, in this context, each element or component of the semantic artefact, i.e. each term, concept or class, should also have an associated GUPRI.

Finally, a unified identifier schema should be used to identify each version of a semantic artefact. This can be done using versioned URIs as proposed by OBO Foundry. Using GUPRIs for the different versions allows information systems to automatically retrieve both the latest version and older versions of the semantic artefact.
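To make the versioning idea concrete, here is a minimal sketch of how the three kinds of identifier could be built. It assumes the PURL patterns described in the OBO Foundry identifier policy (an unversioned ontology IRI, a dated release IRI and a term IRI); the function names are hypothetical, and the prefix and release date used in the usage lines are purely illustrative.

```python
from datetime import date

# Assumption: OBO Foundry PURL base, as described in its identifier policy.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def artefact_iri(prefix: str) -> str:
    """Unversioned IRI of the artefact; meant to resolve to the latest release."""
    return f"{OBO_BASE}{prefix.lower()}.owl"


def version_iri(prefix: str, released: date) -> str:
    """IRI of one dated release, so older versions stay retrievable."""
    ns = prefix.lower()
    return f"{OBO_BASE}{ns}/releases/{released.isoformat()}/{ns}.owl"


def term_iri(prefix: str, local_id: str) -> str:
    """IRI of a single term/concept/class inside the artefact."""
    return f"{OBO_BASE}{prefix}_{local_id}"


# Illustrative usage (the release date is made up for the example).
latest = artefact_iri("GO")
pinned = version_iri("GO", date(2021, 1, 1))
term = term_iri("GO", "0008150")
```

An information system that stores `pinned` keeps pointing at the release it was built against, while `latest` follows the artefact as it evolves.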

This recommendation emphasizes the need for reliable and persistent identification systems without imposing any specific technical constraints.

alko-k commented 3 years ago

@GCoen1 would you mind adding a new label version2 and marking up all new recommendations with that label please?

GCoen1 commented 3 years ago

@alko-k just added them there!

alko-k commented 3 years ago

Perfect. They look good. Thanks @GCoen1 :-)

rob-metalinkage commented 3 years ago

This issue is not self-contained - could we have links to the document and explicit examples? We could all go and chase down the OBO Foundry recommendation, but a link and/or inline example would make more sense.

That said - I'd make the general comment that there is no such thing as "its metadata". There will be a range of metadata for different purposes: most objects have properties that are metadata, as well as other "metadata objects" that make statements about the object directly, or indirectly by making statements about the containing object (dataset, ontology, ConceptScheme, etc.). In addition, there are many flavours of metadata for different purposes, and to support interoperability with different domains. For different purposes consider PROV-O vs DCAT as metadata. One domain may use a profile of DCAT (e.g. GeoDCAT-AP for Iceland) - and another may require some profile of Dublin Core, or ISO 19139 or...

So the trick here is not to create "yet another metadata schema" but instead focus on canonical methods to discover what metadata schemas (and implementation profiles) are on offer and how to access these, and to define interoperability domains based on the profiles that may be offered.

This may be done in at least three ways, all with challenges:
1) Forcing resolution of a specific MIME type to a canonical metadata form that acts as a launching page to the other available forms and describes what they are.
2) Having a canonical access API that allows either a default response that identifies its form, or a request for a specific form, plus the ability to report what is available as per option 1.
3) Forcing the provision of a service that can be interrogated about a given resource to discover the available forms of metadata (and other forms of the object).
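As a rough illustration of options 2 and 3, the sketch below models a catalogue of metadata forms per resource, with one call that reports what is available and one that returns either a default form or a requested profile. Everything here is hypothetical: the `MetadataForm` type, the function names and the `example.org` URLs are assumptions made for the example, and a real service would sit behind HTTP rather than an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetadataForm:
    profile: str     # identifier of the metadata schema or profile
    media_type: str  # serialisation offered for that profile
    location: str    # where this form can be retrieved


# Hypothetical catalogue; the first entry of each list is treated as the default.
FORMS: Dict[str, List[MetadataForm]] = {
    "https://example.org/artefact/demo": [
        MetadataForm("http://www.w3.org/ns/dcat#", "text/turtle",
                     "https://example.org/artefact/demo/meta/dcat.ttl"),
        MetadataForm("http://www.w3.org/ns/prov#", "application/ld+json",
                     "https://example.org/artefact/demo/meta/prov.jsonld"),
    ],
}


def list_forms(resource: str) -> List[MetadataForm]:
    """Option 3: report which metadata forms exist for a resource."""
    return list(FORMS.get(resource, []))


def select_form(resource: str, profile: Optional[str] = None) -> MetadataForm:
    """Option 2: return the default form, or the form for a requested profile."""
    forms = list_forms(resource)
    if not forms:
        raise LookupError(f"no metadata registered for {resource}")
    if profile is None:
        return forms[0]  # the default, which still identifies its own profile
    for form in forms:
        if form.profile == profile:
            return form
    raise LookupError(f"profile {profile} not offered for {resource}")
```

The point of the design is that the default response always identifies its own profile, so a client that did not ask for a specific form can still tell what it received.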

If another way can be imagined then we should have it explained and explore its feasibility and make a decision about what we can recommend and implement.

alko-k commented 3 years ago

P-Rec. 8: Human and machine-readable persistence policies for semantic artefacts metadata and data must be published. (D2.5) can be used here to ensure sustainability instead of a GUPRI