carpentries-incubator / fair-bio-practice

FAIR in (biological) practice
https://carpentries-incubator.github.io/fair-bio-practice/

Explain difference between FAIR data/science vs open data/science #14

Open JaroCamphuijsen opened 3 years ago

JaroCamphuijsen commented 3 years ago

About https://carpentries-incubator.github.io/fair-bio-practice/01-wellcome/index.html

In my experience, people often confuse FAIR data and science with open data and science. It would be good to clarify this from the start in the introduction and go into the details afterwards in the dedicated sections. An exercise to clarify the difference would also help. We can take some inspiration and materials from the lesson on "FAIR data for climate sciences" that I helped develop: https://escience-academy.github.io/Lesson-FAIR-Data-Climate/introduction/index.html

I think that lesson would be a good source of material, also for other episodes.

tzielins commented 3 years ago

Thank you very much for pointing to "FAIR data for climate sciences"; we can definitely use it as inspiration, especially the exercises.

As for FAIR vs open data: we still have not decided how much focus to give to the distinction. Nowadays FAIR is becoming a synonym for data sharing and reuse, as it is catchier and less frightening than "open" because it does not immediately imply openness. Open data without FAIR is of little practical use; at the same time, the reuse component of FAIR implies sharing and thus openness.

In the context of biology, FAIR in the sense of machine-actionability is currently completely unachievable. There is a lack of standards for metadata or even data formats. At the same time, due to the lack of suitable tooling, using structural metadata and ontologies is too much of a burden for biologists to adopt in practice.

Uploading data as a zip file to Zenodo, even with a very detailed README file, is not what the FAIR guidelines were about. But for much experimental biological data it is all that can be done at the moment.

But the framing of the whole course has not been decided yet, so this issue remains open. On the one hand we want it to be practical, with hands-on experience with tools; on the other it should illustrate the journey towards being FAIR and general openness.

JaroCamphuijsen commented 3 years ago

> Thank you very much for pointing to "FAIR data for climate sciences"; we can definitely use it as inspiration, especially the exercises.

Yes, I think so too; there is no need to reinvent the wheel.

> As for FAIR vs open data: we still have not decided how much focus to give to the distinction. Nowadays FAIR is becoming a synonym for data sharing and reuse, as it is catchier and less frightening than "open" because it does not immediately imply openness. Open data without FAIR is of little practical use; at the same time, the reuse component of FAIR implies sharing and thus openness.

My experience in life science and biology is that a lot of the data produced and used is actually not open. I mainly work in the field of cell culture biology and tissue engineering, which also includes many commercial parties such as pharmaceutical companies and startups. I therefore think it is very useful to make the distinction between FAIR and open. It would be very good to try to get everyone to make their data FAIR, which is a win-win situation for everyone. In academic research we additionally like to be open. Data can be open without being FAIR (it is available somewhere on the web for free, but without proper metadata, not adhering to standards, etc.) and vice versa (it is well documented, has a DOI, etc., but to access it you need to pay a fee or be a member of some group). I think making this clear and explaining the difference can help remove the fear of FAIR and/or open access, because the two are truly very different and each has its own value.

> In the context of biology, FAIR in the sense of machine-actionability is currently completely unachievable. There is a lack of standards for metadata or even data formats. At the same time, due to the lack of suitable tooling, using structural metadata and ontologies is too much of a burden for biologists to adopt in practice.

While I think this is certainly true for some fields of biology, in others standards are being developed. As in other fields, archiving data in a FAIR way with proper metadata etc. is indeed a chicken-and-egg problem. I think climate science is an example of a field that is very advanced in this FAIRification process, while biology (again, certain fields more than others) is still in the process or only just starting. There are even some "Implementation Networks" currently active in several subfields of biology: https://www.go-fair.org/implementation-networks/overview/

> Uploading data as a zip file to Zenodo, even with a very detailed README file, is not what the FAIR guidelines were about. But for much experimental biological data it is all that can be done at the moment.

FAIR is indeed not meant as a list from which you can "pick the things you like"; however, FAIRification of a field is something that is done incrementally. And while it is still in progress, something is always better than nothing!

> But the framing of the whole course has not been decided yet, so this issue remains open. On the one hand we want it to be practical, with hands-on experience with tools; on the other it should illustrate the journey towards being FAIR and general openness.

I think this is exactly what we also wanted to achieve with the FAIR data for climate sciences lesson, which is why we followed the structure and practical tips of this awesome Danish website: https://howtofair.dk/ (it is more practical and easily digestible than the actual FAIR principles on go-fair.org).