carpentries-incubator / fair-bio-practice

FAIR in (biological) practice
https://carpentries-incubator.github.io/fair-bio-practice/

Metadata intro is not an intro #16

Closed tzielins closed 3 years ago

tzielins commented 3 years ago

Me not like it :)

Producing metadata is not an intro... If we start producing metadata now, what are we going to do in the other part? And why have a "silly exercise" now if we have a better one later?

Let's remember we are going to have a proper episode for producing metadata.

At the same time, there are important concepts here, such as annotating with PermId and standards such as MIAME.

Obj1: Know what metadata is
Obj2: Know how to provide metadata
Obj3: Know how FAIR also applies to metadata

Obj1 & Obj2. We are actually missing an example of metadata. Remember that at a workshop students do not look at the text, so we need examples for them to look at while the instructor explains the concept of metadata and its types.

I propose to have a readme-like text (half a page) that describes a data file, and a second example: a data table with some embedded metadata.

But let's make them beefy: a half-page readme, so it has details. Same for the data table. It should be obvious that it takes time to produce.

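Exercises:

Identify the 3 types of metadata in the examples.
Think about which parts of the metadata in the examples could be treated as data, or the reverse (e.g. if there are two strain names, strain is probably data, not metadata).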

Obj3. The doppelganger is funny, but making a record with just an ORCID ....

What about showing a publication record that uses ORCID, for example https://wellcomeopenresearch.org/articles/5-96/v2, and asking students to click on an author's ORCID, which takes them to the author's ORCID page and their own work, not their doppelgangers'?

If we have time, we could change this example to another Wellcome paper with an author who has a common name like John Smith, or an Asian name, as those authors have a real problem with lots of doppelgangers (a quick search did not give me a nice example).

Metadata standards need more attention. I have previously used https://fairsharing.org/standards/ to find some; in the linked DCC I could not even find MIAME.

Maybe an exercise to find two specific standards? Or to work out what issues the standards help to address.

We are not going to cover standards in the actual follow-up episodes, as we are "type" agnostic; also, those standards are a pain in reality. So this is the only episode in which we are going to talk about standards.

tzielins commented 3 years ago

I have been trying to find a metadata standard that can be shown and explained quickly, but no luck. So maybe we can indeed only say that they exist. But in that case we should at least list the famous ones, like MIAME, SBML, SBOL.

tzielins commented 3 years ago

For the metadata/readme example, maybe Edward W's papers have some. When I browsed his paper searching for missing software, I remember seeing nice data access sections, so maybe it also has a nice readme for those, or nicely formatted data files.

aromanowski commented 3 years ago

You certainly had a different idea than I had. I went for something simple thinking about the fact that then it is the student's job to look for the one they would use for their data, and also taking into account the graph we saw during the training course:

So then, in the more advanced metadata episode, we would look at something more difficult.

Compare the episode I created with the one from FAIR for climate science: https://escience-academy.github.io/Lesson-FAIR-Data-Climate/metadata/index.html They also made it as simple as possible.

I can certainly incorporate the changes you suggested but am worried that might make it a longer session. I will give it a go and then let's see how it works out!

aromanowski commented 3 years ago

How about for this bit:

I propose to have a readme-like text (half a page) that describes a data file, and a second example: a data table with some embedded metadata.

But let's make them beefy: a half-page readme, so it has details. Same for the data table. It should be obvious that it takes time to produce.

Exercises:

Identify the 3 types of metadata in the examples.
Think about which parts of the metadata in the examples could be treated as data, or the reverse (e.g. if there are two strain names, strain is probably data, not metadata).

we use one of our OMERO images with the associated metadata?

Like: image

Metadata:

Image ID:   3485
Owner:  Maria Eugenia Goya

Acquisition Date:   2018-12-12 17:53:55
Import Date:    2020-04-30 22:38:59
Dimensions (XY):    1344 x 1024
Pixels Type:    uint16
Pixels Size (XYZ) (µm): 0.16 x 0.16 x 1.00
Z-sections/Timepoints:  56 x 1
Channels:   TL DIC, TagYFP
ROI Count:  0

Tags: time course; day 10; adults; food switching; E. coli OP50; NL5901; C. elegans

Description:
Each image contains ~30 (or more) Z-sections, 1 µm apart.
The TagYFP channel is used to follow the alpha-synuclein particles.
The TL DIC channel is used to image the whole nematode head.
These images were used to construct Figure 2B of the Cell Reports paper (https://doi.org/10.1016/j.celrep.2019.12.078).
For full protocol description on image acquisition see https://doi.org/10.1016/j.celrep.2019.12.078.
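
For the "identify the 3 types" exercise, a record like the one above could also be grouped by metadata type in a few lines of Python. This is only a minimal sketch: the type names (descriptive, administrative, structural) and the assignment of each field to a type are assumptions made for illustration, not something agreed in this thread, and `parse_record`, `group_by_type` and `FIELD_TYPES` are hypothetical names.

```python
from collections import defaultdict

# The key/value part of the image record, copied from the example above.
RECORD = """\
Image ID: 3485
Owner: Maria Eugenia Goya
Acquisition Date: 2018-12-12 17:53:55
Import Date: 2020-04-30 22:38:59
Dimensions (XY): 1344 x 1024
Pixels Type: uint16
Pixels Size (XYZ) (µm): 0.16 x 0.16 x 1.00
Z-sections/Timepoints: 56 x 1
Channels: TL DIC, TagYFP
ROI Count: 0
Tags: time course; day 10; adults; food switching; E. coli OP50; NL5901; C. elegans
"""

# ASSUMPTION: this field-to-type mapping is illustrative and debatable
# (e.g. "Acquisition Date" could be argued either way).
FIELD_TYPES = {
    "Image ID": "administrative",
    "Owner": "administrative",
    "Acquisition Date": "administrative",
    "Import Date": "administrative",
    "Dimensions (XY)": "structural",
    "Pixels Type": "structural",
    "Pixels Size (XYZ) (µm)": "structural",
    "Z-sections/Timepoints": "structural",
    "Channels": "structural",
    "ROI Count": "structural",
    "Tags": "descriptive",
}


def parse_record(text):
    """Split each 'key: value' line at the first colon into a dict entry."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            record[key.strip()] = value.strip()
    return record


def group_by_type(record):
    """Group field names by their assumed metadata type."""
    groups = defaultdict(list)
    for field in record:
        groups[FIELD_TYPES.get(field, "unclassified")].append(field)
    return dict(groups)


if __name__ == "__main__":
    for kind, fields in group_by_type(parse_record(RECORD)).items():
        print(f"{kind}: {', '.join(fields)}")
```

Fields that students disagree about can simply be moved between types in `FIELD_TYPES`, which is the same discussion the exercise is meant to provoke.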

Boehmin commented 3 years ago

Andrew, I really like this particular example with metadata from microscope images, very relatable! To be honest, the Data Climate metadata lesson is not very good and does not actually give good examples of WHAT metadata is in the context of science. Of course, metadata is data about data, but any primary school child could google that by now. We need to consider that people come in with different background knowledge, maybe none at all, and some might have a little knowledge of web design, where they associate metadata with data about a website. Previous knowledge from other subjects needs to be unlearned, or re-associated with what metadata means in this context, as mentioned in this morning's meeting. E.g. some people might think metadata is a file called metadata.txt that needs to be attached to a paper, and wonder what information to put in there.

I like where the current lesson is going at the moment!