Rec. 4: Components of a FAIR data ecosystem

sjDCC commented 6 years ago

The realisation of FAIR data relies on, at minimum, the following essential components: policies, DMPs, identifiers, standards and repositories. There need to be registries cataloguing each component of the ecosystem and automated workflows between them.

1. Registries need to be developed and implemented for all of the FAIR components and in such a way that they know of each other’s existence and interact. Work should begin by enhancing existing registries for policies, standards and repositories to make these comprehensive, and initiate registries for DMPs and identifiers. Stakeholders: Data services; Standards bodies; Global coordination fora.
2. By default, the FAIR ecosystem as a whole and individual components should work for humans and for machines. Policies and DMPs should be machine-readable and actionable. Stakeholders: Data services; Global coordination fora; Policymakers.
3. The infrastructure components that are essential in specific contexts and fields, or for particular parts of research activity, should be clearly defined. Stakeholders: Research communities; Data stewards; Global coordination fora.
4. Testbeds need to be used to continually evaluate, evolve, and innovate the ecosystem. Stakeholders: Data services; Data stewards.
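
As a rough illustration of registries that "know of each other's existence", the sketch below models each registry as a dictionary of records that may point at records held in other registries, and reports any pointer that no registry can resolve. The record fields, the identifier scheme (`repo:1`, `std:1`, and so on) and the function name are assumptions made for this example only; they do not describe any existing registry.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    """One registry entry. All field names are illustrative assumptions."""
    record_id: str
    kind: str  # e.g. "policy", "dmp", "identifier", "standard", "repository"
    links: list[str] = field(default_factory=list)  # ids of records in any registry


def unresolved_links(registries: dict[str, dict[str, Record]]) -> list[tuple[str, str]]:
    """Return (source_id, target_id) pairs whose target is in no registry."""
    known = {rid for reg in registries.values() for rid in reg}
    return [
        (rec.record_id, target)
        for reg in registries.values()
        for rec in reg.values()
        for target in rec.links
        if target not in known
    ]


# Hypothetical content: a policy that points at a repository and at a DMP
# which has not been registered anywhere yet.
registries = {
    "repository": {"repo:1": Record("repo:1", "repository", ["std:1"])},
    "standard": {"std:1": Record("std:1", "standard")},
    "policy": {"pol:1": Record("pol:1", "policy", ["repo:1", "dmp:9"])},
}

print(unresolved_links(registries))
```

A check of this kind is one way an automated workflow between registries could flag gaps, such as a policy referring to a DMP that has no registry entry.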

hollydawnmurray commented 6 years ago

F1000 position: With regards to point 1, see https://fairsharing.org/

ghost commented 6 years ago

4TU.Centre for Research Data position: We understand the idea of, and motivation for, creating an interactive and machine-actionable eco-system; however, at present this feels quite utopian. Points 2 and 4 are achievable by data services. Regarding point 1: creating, linking, and maintaining registries for ALL FAIR components seems quite challenging. Having policies and DMPs that are machine-readable and actionable seems to be a good idea, and we can clearly envision a use case. However, this will also be difficult to put into practice at university level.

katerbow commented 6 years ago

DFG position: All the components of the FAIR data ecosystem mentioned have merit in their own right and are important aspects of the FAIR principles. In an ideal world, all of them would be implemented and would work together seamlessly. Given the heterogeneity of the landscape and of the development of these components, it would be very helpful to add a prioritisation of the components to this recommendation.

Eefkesmit commented 6 years ago

Contribution on behalf of the International Association of STM Publishers (STM): To help establish a machine-actionable eco-system for FAIR data, STM and STM publishers offer to collaborate on 4 cornerstones in arranging and keeping research data and related publications linked and retrievable. This is of paramount importance in the aim of making research data findable, accessible, interoperable and re-usable. In addition to the right metadata, persistent identifiers, standards, etc, research data also need a narrative to be intelligible and to be understood in the proper context. Therefore, the eco-system needs systematic links between data and publications. We identify the following 4 essential cornerstones to achieve this:

1. Data Availability policies and statements: design and implement standardised research data policies for scholarly publications, including Data Availability statements in published articles, preferably with the Research Data Alliance.
2. Aligning the submission of data and publications: promote and enable the use of trusted data repositories for datasets supporting publications, in conjunction with submission of manuscripts where appropriate, via recommended repository lists, services to help deposit data alongside the submission of manuscripts, and technological integrations between scholarly infrastructure, e.g. by means of API standards.
3. Universal, bi-directional linking between datasets and publications: support adoption and implementation of a SCHOLIX framework, see www.scholix.org.
4. Data Citation standards: promote and implement data citation rules and standards according to the recommendations of FORCE11, to provide credit for good data practice.
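
To make the idea of bi-directional linking concrete, here is a minimal sketch of an index in which a link registered once can be followed from either side. It is not the Scholix schema or any publisher's API; the class, its methods and the identifiers (which use a placeholder DOI prefix) are hypothetical.

```python
from collections import defaultdict


class LinkIndex:
    """Stores dataset-publication links so they can be followed both ways."""

    def __init__(self) -> None:
        self._links: dict[str, set[str]] = defaultdict(set)

    def add_link(self, publication_id: str, dataset_id: str) -> None:
        # Record the link in both directions with a single call.
        self._links[publication_id].add(dataset_id)
        self._links[dataset_id].add(publication_id)

    def linked_to(self, identifier: str) -> list[str]:
        # Sorted for a stable result; unknown identifiers yield an empty list.
        return sorted(self._links.get(identifier, set()))


index = LinkIndex()
# Placeholder identifiers, not real DOIs.
index.add_link("doi:10.1234/article-1", "doi:10.1234/dataset-a")
index.add_link("doi:10.1234/article-1", "doi:10.1234/dataset-b")

print(index.linked_to("doi:10.1234/article-1"))
print(index.linked_to("doi:10.1234/dataset-a"))
```

Storing both directions at write time means a reader starting from a dataset can reach the narrative in the publication just as easily as the reverse.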

Drosophilic commented 6 years ago

There are already a number of FAIR ecosystem components coming online, like FAIRsharing.org (disclaimer: that is the project I work on), and FAIR metric initiatives such as FAIRmetrics.org, and attempts to link such components together, such as by the NIH Data Commons. I assume this recommendation will take these existing resources into account rather than developing something anew?

While these resources are heterogeneous, it makes sense to make them compatible and interoperable and to ensure that they themselves are FAIR.

ScienceEurope commented 6 years ago

Science Europe has already published policies on data management (Framework for discipline-specific RDM Protocols) and is in contact with various scientific communities to promote their uptake. The RDM policies that Science Europe will publish towards the end of 2018 deal with DMPs and repositories, taking into account aspects such as identifiers and standards.

ferag commented 6 years ago

In my opinion, communication mechanisms, protocols, and standards should be put in place to support interaction among the different components described and to allow interoperability and machine-readable features. The dual human-machine approach is very important if the ecosystem is to be scalable.

RCN2018 commented 6 years ago

In the FAIR Data Action plan, “FAIR Data Objects” play a central role, but it is left unclear what these objects are and how they will function in the envisaged FAIR data eco-system. Similarly, the report “Prompting an EOSC in practice” mentions “Digital Objects”, which are however not limited to data. In both cases, it is only mentioned that the objects should be identified with PIDs such as Handles and DOIs, but it is otherwise left in the dark what digital object architectures should look like. The future global data fabric will need “smart” digital objects (DOs): meaningful, typed entities that can serve as stable anchors in the dynamic universe of scientific and other data. Smart DOs should carry not only static descriptive metadata, but also dynamic provenance metadata describing the continuous history of creation and processing, licensing metadata that allow the automatic authorization of authenticated users based on permission levels and smart contracts, types that determine which operations can be carried out on the data, etc. Europe should therefore set up, as soon as possible, an action in which global stakeholders work on use cases, definitions and implementations of DOs.
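
One possible reading of such a "smart" DO, sketched as a small data structure: a PID, a type that determines the permitted operations, a provenance log that grows over time, and a permission level used for access decisions. Everything here (the class, the field names, the type table, the numeric permission levels and the Handle-style PID) is an assumption for illustration; smart contracts are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical type registry: which operations each object type allows.
OPERATIONS_BY_TYPE: dict[str, set[str]] = {
    "table": {"filter", "aggregate"},
    "image": {"resize"},
}


@dataclass
class DigitalObject:
    pid: str                 # persistent identifier, e.g. a Handle or DOI
    object_type: str         # key into OPERATIONS_BY_TYPE
    min_permission: int      # lowest user permission level granted access
    provenance: list[str] = field(default_factory=list)

    def record(self, event: str) -> None:
        """Append an event to the object's processing history."""
        self.provenance.append(event)

    def may_access(self, user_level: int) -> bool:
        """Decide access from the user's permission level alone."""
        return user_level >= self.min_permission

    def allowed_operations(self) -> set[str]:
        """Operations determined by the object's type; empty if the type is unknown."""
        return OPERATIONS_BY_TYPE.get(self.object_type, set())


do = DigitalObject(pid="hdl:0000/example", object_type="table", min_permission=2)
do.record("created")
do.record("filtered by year")

print(do.allowed_operations())
print(do.may_access(1), do.may_access(3))
print(do.provenance)
```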

bertocco commented 6 years ago

INAF (astronomy) position: ecosystem components should not be named as actual implementations; we are talking about recommendations, not applications.

pkdoorn commented 6 years ago

“By default, the FAIR ecosystem as a whole and individual components should work for humans and for machines.” Ideally and ultimately, yes… but it is not realistic to make this demand universal here and now. Access licenses are complicated whenever restrictions apply, and these are very hard to make actionable for machines. A balance should also be struck on the costs versus the benefits of this.

mromanie commented 6 years ago

ESO position: Should the registries cataloguing the components be specific to disciplines, or global? The latter seems a tall order (in addition to the complications of making them work for humans and machines).

gtoneill commented 6 years ago

The recommendation title could be clearer and perhaps include or focus on realisation. Data stewards should be added to the figure as they will no doubt play a fundamental role in this realisation (and are rightly mentioned in the points several times). There are, as has been mentioned, already existing initiatives and recommendations for realising FAIR Data. The recommendations should clearly link to such existing precursors and the report should serve to unite and build upon the state-of-the-art.

MSoareses commented 6 years ago

As referenced in my comment on Rec. https://github.com/FAIR-Data-EG/Action-Plan/issues/3, particularly the example on @Scholix, and in line with @eefkesmit's comment above, publishers must be included as components of the FAIR ecosystem.

npch commented 6 years ago

SSI position:

Data Management Plans should be extended to include other linked research outputs and artefacts and be machine readable / actionable. There should also be greater clarity around other identifiers, including identifiers for people (do ORCIDs currently work when a creator is not directly employed in a research position?), organisations and projects.

aidaturrini commented 6 years ago

I found all the comments interesting. Implementation is challenging, but it is crucial, particularly when both the machine and the human level must be managed in creating Digital Objects.

mark-cox commented 6 years ago

euroCRIS position:

At both institutional and national levels, CRIS/RIM (research information) systems are being used to complement, and in some cases act as, data repositories. As noted in the response to rec. 3, these systems contain rich information regarding the landscape in which research data is generated and published, through interlinked metadata relating to e.g. projects and publications. Increasingly these systems are being made interoperable with each other and with parallel infrastructure, e.g. OpenAire, through the use of the CERIF standard. Our belief is that the FAIRness of the EOSC could be taken to another level should it be complemented by an underlying interoperable research information infrastructure, including elements relevant for finding, accessing, interpreting and reusing data. One first step may be to use these systems as an early core of interoperable data registries.

etothczifra commented 6 years ago

DARIAH-ERIC position: We strongly support the idea of establishing registries cataloguing each component of the FAIR ecosystem to make these services visible and searchable. To meet the need for a humanities-dedicated repository selector service and to help humanities researchers identify suitable repositories for depositing their research data, a prototype, the Data Deposit Recommendation Service (DDRS), has been developed within the Humanities at Scale project, an offspring of the humanities-specific research infrastructure DARIAH.

On the other hand, however, fulfilling the requirement of machine actionable DMPs may not always be feasible in the hybrid disciplinary environment of humanities where automated and manual workflows and practices are still co-present. For the same reason, a fully automated workflow between each component of the ecosystem might be quite challenging.
