HumanCellAtlas / metadata-schema

This repo is for the metadata schemas associated with the HCA
Apache License 2.0

Prepare agenda and presentation for the first community Metadata call #1154

Closed zperova closed 4 years ago

zperova commented 4 years ago

Prepare the presentation for the first community Metadata call on 18 November 2019. Agenda for the call will be sent out on 4 November 2019.

Acceptance criteria:

zperova commented 4 years ago

@mhalushka the title and paragraph are here: https://docs.google.com/document/d/1JvsY0OiS-Jf9ZhZSWchMe0CsReSZ3ikazUWtJQvOSkY/edit

mhalushka commented 4 years ago

Hi. I added the information. But can we chat about this some? I don't know if what's in the paragraph is the only place this should be headed. The bigger picture is how we engage the wider community (particularly tissue harvesters) to get all of the useful metadata fields collected. Then the community can decide on the importance and obtainability of the fields. I want to make sure that topic gets covered as well. What you wrote seems to be going deeper into the specifics of the terms, which, in my opinion, is a separate issue at this time. Thanks!

zperova commented 4 years ago

@mhalushka yes, we should chat about it; this is the first draft, not the final product. I will send an email to decide on a suitable time.

lauraclarke commented 4 years ago

@mhalushka what is your goal for the outcome of this first meeting?

mhalushka commented 4 years ago

@lauraclarke I'd like to talk about the wider goal of having the community help select and rank (for importance) the metadata they want/can get. And my goal would be to discuss the current communication gap between this group and the wider community and how to bridge that. Thanks!

zperova commented 4 years ago

Call with @mhalushka scheduled for Friday 1 November to discuss outcomes of the first meeting

mhalushka commented 4 years ago

@zperova I have revised the Google Doc with my ideas based on our conversation. Feel free to continue editing.

zperova commented 4 years ago

Thanks @mhalushka. To summarize the discussed plan:

mhalushka commented 4 years ago

@zperova Yes. And I would add at the end:

lauraclarke commented 4 years ago

A couple of questions: are you expecting members of the community to turn up with a list on the call? If not, how are you intending to collect the list?

How are you going to reach out to the community either before the call or at another time?

What is your plan for considering how useful the fields are in terms of downstream consumers?

Do you plan to collect use cases as well as fields?

mhalushka commented 4 years ago

@lauraclarke

  1. Not expecting the community to have a list at the time of the call. They can send a list to Zina or me.
  2. We can also post something on #general on HCA slack after the call.
  3. We need to set up a voting system for the fields. Survey Monkey or similar can work. Two votes on each field: Importance (A, B or C) and Ease of Obtaining (A, B or C).
  4. I'm not sure what you mean by use cases here. Can you elaborate? Thanks!

lauraclarke commented 4 years ago

@mhalushka by use cases I mean why a field is valuable in the context of building a cell atlas: what do we think the downstream use of that field will be?

I want us to be able to capture a diversity of fields, but I want to make sure we understand the value of those fields in the context of building the cell atlas, rather than just compiling a list of every field a tissue collection centre ever captures about their donors and biopsy/autopsy samples.

These use cases can be diverse. Some metadata will be more important from a discovery perspective (can people find the data they want for their study?), and other fields will be more important from an analysis perspective (can they combine different data, understand how to analyse it, etc.?).

I just want to ensure that, as well as understanding how easy or difficult a field is to collect, we also understand how useful a field is.

zperova commented 4 years ago

@lauraclarke @mhalushka I think the way to get use cases from the contributors will be to ask specific questions about the fields supplied at the time of collection (what information does this field capture, and what is its importance for the experimental scientist, for the data analyst, for the computational scientist?). This will require clever phrasing from our side and diligence from the contributors filling in the master list of the fields. There are further details in the document linked above.

mhalushka commented 4 years ago

@lauraclarke @zperova Thank you both. I now understand and agree that this is important. It's a little tricky because sometimes you don't know how data will be helpful until it is collected, and sometimes data isn't collected that we wish we had. But we should have possible use cases (or at least general fields) of how each data point would be used. We can add them to the voting system. For example, something like this might work:

Data point: Postmortem Interval / Use case: Technical quality of RNA
Vote Importance: A/B/C / Vote Ease of Obtaining: A/B/C

Data point: Organ location / Use case: Understand the source of the material
Vote Importance: A/B/C / Vote Ease of Obtaining: A/B/C
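
To make the proposed ballot concrete, here is a minimal Python sketch of how the two A/B/C votes per field could be recorded and tallied. It is only an illustration of the format suggested above, not an agreed design: the names `FieldBallot`, `vote`, and `summary` are hypothetical, and the votes cast at the end are made-up placeholders, not real survey results.

```python
from collections import Counter
from dataclasses import dataclass, field

# The three grades proposed in the thread for both questions.
GRADES = ("A", "B", "C")


@dataclass
class FieldBallot:
    """One metadata field, its use case, and the votes cast on it (hypothetical structure)."""
    data_point: str
    use_case: str
    importance: Counter = field(default_factory=Counter)
    ease: Counter = field(default_factory=Counter)

    def vote(self, importance: str, ease: str) -> None:
        """Record one respondent's two votes: Importance and Ease of Obtaining."""
        if importance not in GRADES or ease not in GRADES:
            raise ValueError(f"votes must be one of {GRADES}")
        self.importance[importance] += 1
        self.ease[ease] += 1

    def summary(self) -> dict:
        """Return vote counts per grade for both questions."""
        return {
            "data_point": self.data_point,
            "use_case": self.use_case,
            "importance": {g: self.importance[g] for g in GRADES},
            "ease": {g: self.ease[g] for g in GRADES},
        }


# The two example fields from the comment above.
ballots = [
    FieldBallot("Postmortem Interval", "Technical quality of RNA"),
    FieldBallot("Organ location", "Understand the source of the material"),
]

# Placeholder votes, invented purely to exercise the sketch.
ballots[0].vote(importance="A", ease="B")
ballots[0].vote(importance="A", ease="C")
ballots[1].vote(importance="B", ease="A")

for ballot in ballots:
    print(ballot.summary())
```

Keeping the use case next to each data point means respondents vote on a field with its stated purpose in view, which is the point raised earlier in the thread. A survey tool such as Survey Monkey would serve the same role without any code.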