HumanCellAtlas / ontology

Investigate appropriate ontology for preservation and storage methods #55

Closed mshadbolt closed 2 years ago

mshadbolt commented 4 years ago

In our metadata schema, we have the module preservation_storage.json that has preservation method and storage method as enums. Ideally these should be ontologies but the EFO doesn't appear to have any terms for these methods.

I'd like to find out whether we should be adding these terms to the EFO or importing them from another ontology. A few searches found that the ERO ontology seems to have some appropriate terms but they may not have enough detail for our purposes.

Once we figure out which ontology to use and appropriate terms are added to our HCAO then we can move forward with ontologising the terms in the metadata schema.

paolaroncaglia commented 3 years ago

Hi @mshadbolt ,

Do you have a feeling for how many terms you'd need? EFO or HCAO don't import ERO terms atm, but we could create new EFO terms and xref ERO - that would also allow us to make the terms as specific as you'd need. @zoependlington could create a ROBOT template for you to provide info if that's helpful. Or let me know if you'd prefer to discuss this in person.

Thanks, Paola

mshadbolt commented 3 years ago

Hi @paolaroncaglia I am not sure what a ROBOT template is.

Currently we have them as enums with the following values:

"ambient temperature",
"cut slide",
"fresh",
"frozen at -70C",
"frozen at -80C",
"frozen at -150C",
"frozen in liquid nitrogen",
"frozen in vapor phase",
"paraffin block",
"RNAlater at 4C",
"RNAlater at 25C",
"RNAlater at -20C"

"cryopreservation in liquid nitrogen (dead tissue)",
"cryopreservation in dry ice (dead tissue)",
"cryopreservation of live cells in liquid nitrogen",
"cryopreservation, other",
"formalin fixed, unbuffered",
"formalin fixed, buffered",
"formalin fixed and paraffin embedded",
"hypothermic preservation media at 2-8C",
"fresh"

paolaroncaglia commented 3 years ago

Hi @mshadbolt , Thanks for clarifying. We could look at your list in terms of description of samples, rather than sample conservation/storage methods. Then the 'specimen' branch in EFO may be useful. I'm attaching a screenshot, see e.g. the existing terms 'RNAlater specimen', 'fresh specimen', 'frozen specimen'. Most terms are imported from OBI, but some are EFO. OBI itself doesn't have more specific terms than those, but we could create EFO ones if you need detailed descriptions, e.g. EFO:NEW 'frozen at -70C' as a subclass of 'frozen specimen'.

[Screenshot: the 'specimen' branch in EFO (Screen Shot 2020-11-05 at 17 15 25)]

This may need more extended discussion, but it could be a start. Let us know what you think.

P.S. A ROBOT template is a special spreadsheet with pre-defined column headers where you can list ontology terms you need and input the usual information on them such as parents, definition etc. The template can then be used to add the terms to the ontology automatically rather than manually. It’s useful when you have several-to-many terms that need adding, as it’s quicker than editing the ontology manually.
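As an illustration of the spreadsheet described above, here is a minimal sketch that writes a tab-separated ROBOT template for one of the terms suggested earlier in this thread. It rests on assumptions: the second-row template strings (`ID`, `LABEL`, `SC %`, `A IAO:0000115`) follow my reading of the ROBOT documentation and should be checked against it, `EFO:NEW` is the placeholder ID used in this thread, and the definition text is a placeholder, not an agreed definition.

```python
import csv
import io

# Row 1: human-readable column names (free text, chosen for this sketch).
HEADER = ["ID", "Label", "Parent", "Definition"]

# Row 2: ROBOT template strings (assumed from the ROBOT documentation):
# ID = term identifier, LABEL = rdfs:label, "SC %" = subclass-of parent,
# "A IAO:0000115" = string annotation using the 'definition' property.
TEMPLATE = ["ID", "LABEL", "SC %", "A IAO:0000115"]

# Example row taken from the suggestion in this thread. The ID and the
# definition are placeholders, not real EFO content.
ROWS = [
    ["EFO:NEW", "frozen at -70C", "frozen specimen", "PLACEHOLDER definition"],
]


def build_template(rows):
    """Return the template as a TSV string: header, template strings, then terms."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerow(TEMPLATE)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_template(ROWS), end="")
```

The resulting file would then be passed to ROBOT's `template` command by the ontology editors; the exact invocation depends on the EFO build setup, which is not described in this thread.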

paolaroncaglia commented 3 years ago

Hi @mshadbolt ,

I'm reviewing tickets in this tracker, in view of my meeting with David and Zoë this afternoon. Please let me know if there's any update on this request, and/or if you have any comments on my suggestion here. If you'd prefer to discuss in person, we could add to the agenda for our next monthly curators meeting (Dec 9th).

Thanks, Paola

paolaroncaglia commented 3 years ago

Hi @mshadbolt ,

Following up on my previous comment. If this discussion is still active/needed, I suspect that it may be more easily carried out in person, so feel free to add to the agenda for our next monthly curators meeting (Dec 9th). If the issue has been addressed or the discussion is no longer active, I guess the ticket may be closed. Meanwhile, I'll move it to the "Backlog" queue. Thanks!

(Update, Marion couldn't make the Dec meeting, so let's move to Jan.)

Paola

dosumis commented 3 years ago

Zoe's call - but I think it makes sense to add EFO terms that subclass OBI.

mshadbolt commented 3 years ago

Hi @paolaroncaglia sorry for not replying to this sooner.

I think a ROBOT template would be great to be able to add the terms we need to EFO.

I should point out that it isn't a huge priority for us right now, so if you want to de-prioritise it, I can come back and enquire again when we are making updates to the schema.

paolaroncaglia commented 3 years ago

> Zoe's call - but I think it makes sense to add EFO terms that subclass OBI.

Update from today's curators call: @zoependlington agrees that it'd make sense to add EFO terms to describe storage/preservation methods (in terms of sample descriptions) as I suggested above. Summary plan (thanks @mshadbolt for confirming that it's not a huge priority):

paolaroncaglia commented 3 years ago

@mshadbolt I reviewed the HCA enums in https://github.com/HumanCellAtlas/ontology/issues/55#issuecomment-722423929 and I drafted a spreadsheet to map them to existing terms or to suggested new terms. Could you please take a look and let me know if you have any concern? Thanks.

dosumis commented 3 years ago

@rays22 to review.

dosumis commented 3 years ago

Note FBBI has some related terms - https://www.ebi.ac.uk/ols/ontologies/fbbi/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FFBbi_00000001&viewMode=All&siblings=false

Might at least be worth xrefs to.

paolaroncaglia commented 3 years ago

Hi @rays22 (cc @mshadbolt ), Just a heads up that if you'd like those preservation/storage terms to be available in HCAO, they'll need to be created (and released) in EFO first. The next EFO release is scheduled for April 15th, and I'll need some advance warning to make sure that all terms are added in time. When you have a chance, could you please review my spreadsheet here https://docs.google.com/spreadsheets/d/1Wnok_nXEOXLjoGUu4lfeSMFZ5n1rysWu5Z1djrQaYBA/edit#gid=0 and comment if any of my suggested terms do not represent your existing metadata schema/module appropriately. Thanks.

paolaroncaglia commented 3 years ago

Hi @dosumis ,

> Note FBBI has some related terms - https://www.ebi.ac.uk/ols/ontologies/fbbi/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FFBbi_00000001&viewMode=All&siblings=false
> Might at least be worth xrefs to.

Some of those FBBI terms are not defined, but I'll add the following mappings unless you have any concern:

  1. HCA 'cut slide' = FBbi:00000026 'sectioned tissue'
  2. HCA 'paraffin block' = FBbi:00000020 'tissue in paraffin embedment'
  3. HCA 'cryopreservation, other' = FBbi:00000013 'cryofixed tissue'
  4. HCA 'formalin fixed, unbuffered', 'formalin fixed, buffered' and 'formalin fixed and paraffin embedded' are all more granular than FBbi:00000010 'formaldehyde fixed tissue', so that would be a broad mapping which I'd prefer to avoid.
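The mappings listed above can be written down as data, which makes the exact-versus-broad distinction explicit. This is only a sketch: reading '=' as `skos:exactMatch` and the formalin cases as `skos:broadMatch` is my interpretation of the comment, and the `Mapping` type and `xrefs_to_add` helper are hypothetical names introduced for illustration.

```python
from typing import Dict, List, NamedTuple


class Mapping(NamedTuple):
    hca_value: str   # enum value from the HCA metadata schema
    predicate: str   # SKOS mapping predicate (assumed interpretation)
    fbbi_id: str
    fbbi_label: str


MAPPINGS: List[Mapping] = [
    Mapping("cut slide", "skos:exactMatch", "FBbi:00000026", "sectioned tissue"),
    Mapping("paraffin block", "skos:exactMatch", "FBbi:00000020", "tissue in paraffin embedment"),
    Mapping("cryopreservation, other", "skos:exactMatch", "FBbi:00000013", "cryofixed tissue"),
    # The HCA values below are more granular than the FBbi term, so the FBbi
    # term would only be a broader match.
    Mapping("formalin fixed, unbuffered", "skos:broadMatch", "FBbi:00000010", "formaldehyde fixed tissue"),
    Mapping("formalin fixed, buffered", "skos:broadMatch", "FBbi:00000010", "formaldehyde fixed tissue"),
    Mapping("formalin fixed and paraffin embedded", "skos:broadMatch", "FBbi:00000010", "formaldehyde fixed tissue"),
]


def xrefs_to_add(mappings: List[Mapping]) -> Dict[str, str]:
    """Keep exact matches only; broad mappings are skipped, as preferred above."""
    return {m.hca_value: m.fbbi_id for m in mappings if m.predicate == "skos:exactMatch"}


if __name__ == "__main__":
    for hca_value, fbbi_id in xrefs_to_add(MAPPINGS).items():
        print(f"{hca_value} -> {fbbi_id}")
```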

Thanks, Paola

rays22 commented 3 years ago

It appears to me that some of the categories overlap. It is not always clear to me if we need separate terms for preservation and storage.

metadata-schema/json_schema/module/biomaterial/preservation_storage.json
"description": "Information relating to how a biomaterial was preserved and/or stored over a period of time."

For example, do we need separate terms for

  1. frozen in liquid nitrogen
  2. cryopreservation in liquid nitrogen (dead tissue)
  3. cryopreservation of live cells in liquid nitrogen?

Is there a consensus in the field on what dead tissue would mean?

Other than maybe simplifying some of the categories, the suggested terms and parent terms in the draft look sensible to me.

paolaroncaglia commented 3 years ago

Thanks @rays22 . In answer to your question "It is not always clear to me if we need separate terms for preservation and storage", to clarify, the labels and parents I suggest are in columns B and E respectively in this spreadsheet. I've "rephrased" all terms with a focus on the sample (specimen) description, rather than the preservation/storage method that the sample is subjected to. So the descriptions you list above would be represented as 'liquid nitrogen-frozen specimen' (where the focus is on how the sample was frozen, to differentiate it from e.g. 'vapor phase-frozen specimen'), and as 'frozen specimen preserved in liquid nitrogen (dead tissue)' and 'frozen specimen preserved in liquid nitrogen (live cells)', where the focus is on what the specimen consists of (dead tissue vs. live cells). Personally I'm happy to have only the broader term 'liquid nitrogen-frozen specimen'; I included all terms in the spreadsheet that were in the json schema, on the grounds that if they were there they'd all be needed, but let me know if this is no longer the case. Thanks again for your feedback.

Paola

paolaroncaglia commented 3 years ago

Update: at curators mtg today, Ray mentioned that those terms are not urgent for HCA and would require changing their metadata schema. SCA do not get that type of info often, so they wouldn't be in a hurry to create those terms, though they might use some of them if/when available. Consensus was that, if new terms are added, they could be broad rather than narrow (e.g. cryopreservation, without indication of temperature that could be captured otherwise). I'll move to Backlog and unassign; others may "revive" this ticket as/when necessary. Thanks.

Update 11/2/2022: there's been no follow-up or interest in this area in the last 10 months: ok to close?
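One way to picture the "broad term, temperature captured otherwise" idea from the consensus above is to split the existing enum strings into a broad label and a separate temperature value. This is a hypothetical sketch, not a proposed schema change: the function name and the regular expression are my own, and it only handles single temperatures written as "at <number>C".

```python
import re
from typing import Optional, Tuple

# Matches values such as "frozen at -70C" or "RNAlater at 4C".
# Ranges such as "2-8C" deliberately do not match and are left untouched.
_TEMP_SUFFIX = re.compile(r"^(?P<label>.+?) at (?P<temp>-?\d+)C$")


def split_temperature(value: str) -> Tuple[str, Optional[int]]:
    """Split an enum value into (broad label, temperature in Celsius or None)."""
    match = _TEMP_SUFFIX.match(value)
    if match is None:
        return value, None
    return match.group("label"), int(match.group("temp"))


if __name__ == "__main__":
    for value in ["frozen at -70C", "RNAlater at 4C", "fresh",
                  "hypothermic preservation media at 2-8C"]:
        print(value, "->", split_temperature(value))
```

With this split, "frozen at -70C", "frozen at -80C" and "frozen at -150C" would all share the broad label "frozen", and the temperature would live in a separate field.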

dosumis commented 2 years ago

@ESapenaVentura will look into this and decide whether to pursue.

ESapenaVentura commented 2 years ago

I have reviewed the conversation. It seems like we had a pretty well laid-out path to pursue this, but it's going to take a lot of effort on the ontologists' side (many new terms) and on the HCA side (schema update + field migration for all entities in the HCA).

The value/cost is very low, at least for HCA. I don't know if SCEA would be interested in creating the terms anyways.

Nonetheless, what was discussed above is very valuable. So, even though I think we can close this ticket, I am going to open a ticket in the metadata-schema repository and link to this one, so we can use all the research already done above when/if we prioritise this issue.

paolaroncaglia commented 2 years ago

@ESapenaVentura thank you for following up. I'll close this ticket now; its content won't be lost, as your new ticket in the metadata-schema repo points to it.