rd-alliance / FAIR-data-maturity-model-WG

https://www.rd-alliance.org/group/fair-data-maturity-model-wg/case-statement/fair-data-maturity-model-wg-case-statement

What would be the essential criteria to be FAIR? #6

Closed - bahimc closed this issue 4 years ago

rwwh commented 5 years ago

The premise of this question is wrong. I've heard people say

FAIR is not a goal, it is a journey

I am very much against putting up any binary "fence" that makes a distinction between not FAIR enough and FAIR enough. This has the opposite of the effect we want to achieve: we don't help challenged fields become more FAIR ("we'll never make it over the fence, no matter how hard we try"), and/or we let the best become content with themselves ("we're already good enough").

FAIRness is something to keep striving for. Based on the options reasonably available, we put in enough effort to make the data more FAIR than it would be if we paid no attention: the low-hanging fruit. Over time, tools and standards will evolve, and we will be able to reap the FAIR fruit that hangs higher up in the tree of data re-use.

keithjeffery commented 5 years ago

@rwwh Rob, I agree with you that FAIR is a journey and that different assets will be at different points along that journey. However, the pathway is in fact multiple parallel paths (one per criterion), with some requiring the voyager to be further ahead than on other paths (i.e. at least some F before A, and so on).

All - I have been concerned about whether, to be FAIR, an asset should be F, A, I and R autonomically, or whether manual human action is enough.

I was doing some background checking and came across this page: https://www.force11.org/fairprinciples. Principles 1.2 and 3.1 in particular seem to me to be explicit that for FAIR to work (and therefore for an object to be assessed as FAIR), it must be machine-actionable.

What do WG members think of this? Should we follow Force11, take account of Force11 as one interpretation, or just 'do our own thing'?

rwwh commented 4 years ago

More recent discussions definitely prefer "machine-actionability" wherever a preference must be given. A machine that understands a resource can always make it more human-readable, whereas the opposite is not scalable.

However, my personal opinion is that complete machine-actionability is not always mandatory.

An example of how this can be discipline-specific is a comparison of the forest/trees of data resources in biology with high-energy physics. Anyone in physics interested in electron/positron collisions knows about LEP at CERN, so findability is sufficiently served by a human typing "LEP" into Google. In contrast, no biologist knows all the possibly relevant databases among the thousands of available resources (even in a very limited field, like "late-stage prostate cancer"), so machine crawling and reasoning become necessary, and that is only possible if the data are machine-FAIR.

makxdekkers commented 4 years ago

@rwwh @keithjeffery Given the way Rob describes it, would it be fair to say that the case of CERN is an exception, or at least one of a smaller set of cases? If the group feels that machine-FAIRness is required in the general case, would it be reasonable to encourage the organisations that play such a role in the exceptional cases, like CERN, also to make their data and metadata machine-FAIR?

keithjeffery commented 4 years ago

@makxdekkers @rwwh Makx, Rob - while I understand Rob's point, this works only within the particle physics community (and even then researchers would need to know which of the four major experiments using LEP - ALEPH, DELPHI, OPAL, L3 - were relevant to their research). More likely they would wish to access assets of the more powerful and more recent LHC and its experiments ATLAS, CMS, ALICE and LHCb.

What if an astronomer wants to investigate particle interactions at CERN in connection with her research on star formation? I agree with Rob's point that the trend is towards more autonomic (machine-actionable) FAIR assets.

rwwh commented 4 years ago

I think we agree. Summarizing my example: it is another indication that things are never black and white. That is a good reason to avoid giving anyone the impression that the maturity model can give a black/white answer, and a good reason, for example, to try to avoid the word "mandatory". I am a fan of "comply or explain" models. Maybe something like that could be used in the FAIR maturity model too, to replace the black/white words.

keithjeffery commented 4 years ago

@rwwh Rob - we usually agree, even if coming at the problem from different directions! Comply or explain is a good model, but an autonomic system will not understand the explanation unless it is very strictly coded. How about generally (doubtless there will be exceptions) using 'mandatory' for the autonomic options in the criteria and 'recommended' for the manual options, thus allowing human reading of computer-stored explanatory text?

makxdekkers commented 4 years ago

@keithjeffery When you say mandatory but allow exceptions, isn't that a contradiction? For that behaviour we already have 'recommended', as per RFC 2119: there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. This is close to the meaning of the 'comply or explain' model that @rwwh suggests.
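As a minimal sketch of how such RFC 2119-style priorities might be attached to indicators (the type names and the two example indicators below are hypothetical, not part of any agreed model):

```python
from dataclasses import dataclass
from enum import Enum

class Priority(Enum):
    MANDATORY = "must"        # non-compliance is not acceptable
    RECOMMENDED = "should"    # may be ignored for valid, documented reasons (RFC 2119 SHOULD)
    OPTIONAL = "may"          # truly optional (RFC 2119 MAY)

@dataclass(frozen=True)
class Indicator:
    identifier: str
    description: str
    priority: Priority

# Illustrative examples only; real indicators and priorities are for the WG to decide.
indicators = [
    Indicator("F1-PID", "Data is identified by a persistent identifier", Priority.MANDATORY),
    Indicator("I2-ONT", "Metadata uses FAIR-compliant vocabularies/ontologies", Priority.RECOMMENDED),
]
```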

@rwwh Would your suggestion, if I understand it correctly, also mean that things like persistent identifiers, provision of metadata, resolution to a digital object, and indication of a reuse licence should be only 'comply or explain'?

rwwh commented 4 years ago

@makxdekkers I think each application of "comply or explain" would require some individual attention. I won't easily accept an explanation for the lack of a persistent identifier (maybe I am just not creative enough to think of a situation where that could apply). But elsewhere we've had discussions about what to do if there is no applicable FAIR ontology yet, and there an explanation would clearly suffice. An explanation would be acceptable to me if it can show that the indicator would require unreasonable effort from the project, i.e. they can show that it is someone else's responsibility.
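To make this concrete, a 'comply or explain' check along these lines could be sketched as follows, reusing the hypothetical `Indicator`/`Priority` types above (the verdict strings are illustrative only). The point is that an explanation is only even considered for non-mandatory indicators, and still needs a human judgement:

```python
from typing import Optional

def evaluate(indicator: Indicator, satisfied: bool,
             explanation: Optional[str] = None) -> str:
    """Verdict for a single indicator under 'comply or explain'."""
    if satisfied:
        return "compliant"
    if indicator.priority is Priority.MANDATORY:
        # e.g. a missing persistent identifier: no explanation accepted
        return "non-compliant"
    if explanation is not None:
        # e.g. "no applicable FAIR ontology exists yet for this domain";
        # whether the explanation is reasonable remains a human judgement
        return "explained (needs human review)"
    return "non-compliant"
```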

keithjeffery commented 4 years ago

@makxdekkers Makx - to clarify: when I said exceptions, I meant that some criteria may not be suitable for 'mandatory' even in their 'autonomic' variant, not that an individual indicator could be mandatory with exceptions. Apologies for the lack of clarity. Thus we would have to go through the indicators and, for the autonomic ones, decide whether they are mandatory or recommended.

makxdekkers commented 4 years ago

@rwwh What we have been trying to do with the 'prioritisation' of the indicators is to distinguish the cases where no-one would easily accept non-compliance from the cases where there may be reasonable arguments for non-compliance. Indeed, there are good arguments for non-compliance with indicators like the one about FAIR-compliant ontologies, and those would therefore not be 'mandatory' (or 'essential', or 'we-would-not-easily-accept-non-compliance'). Any indicators classified as 'mandatory' could be handled by @keithjeffery's 'autonomic system'. All others will require some level of human consideration to look at the explanation of non-compliance, which makes fully 'autonomic' evaluation more difficult.

makxdekkers commented 4 years ago

@keithjeffery Yes, going through the indicators and fixing their 'priorities' is what we are hoping to achieve in the next couple of weeks. Maybe we should set up a poll for each indicator so that WG members can choose between 'mandatory', 'recommended' and 'optional'?

rwwh commented 4 years ago

I am really afraid that if we offer any system that can give an automatic yes/no compliance statement, achieving that minimal level of compliance will become the de facto goal.

makxdekkers commented 4 years ago

@rwwh Rob, the indicators that allow autonomic evaluation would indeed define a minimal level of compliance, but whether or not that becomes the goal is a decision to be taken at the political level. Our proposal on the 'scoring' would declare that satisfying only the minimum set of indicators (the ones that are mandatory) constitutes a 'basic' FAIRness level, and that satisfying more of the recommended and optional indicators moves an asset up the ladder of levels. A particular community could decide that the basic level is enough, but others could decide they need higher levels, for example by making indicators that are recommended in the general case mandatory for that community. The idea is that the set of indicators allows various ways of evaluating that fit the needs of a particular community, while including a minimum set that is applicable across communities.
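Purely as an illustration of such a ladder (the level names and rules below are invented, and build on the hypothetical types sketched earlier; a community could tighten them by promoting recommended indicators to mandatory), the scoring might look like:

```python
def fairness_level(results: dict[Indicator, bool]) -> str:
    """Map per-indicator pass/fail results onto an illustrative ladder of levels."""
    def all_passed(priority: Priority) -> bool:
        return all(ok for ind, ok in results.items() if ind.priority is priority)

    if not all_passed(Priority.MANDATORY):
        return "below basic"   # the cross-community minimum set is not yet met
    if all_passed(Priority.RECOMMENDED) and all_passed(Priority.OPTIONAL):
        return "advanced"
    if all_passed(Priority.RECOMMENDED):
        return "intermediate"
    return "basic"             # mandatory indicators only
```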