theathorn closed 3 years ago
Slack from Tony Burdett:
I will preface my answer by saying some of it depends on what makes sense to users (wranglers make good proxy users!) and what the schema currently says.
The schema currently says:
• Values for organ can be any UBERON material anatomical entity (UBERON_0000465)
• Values for organ part can also be any UBERON material anatomical entity
• Values for model organ can also be any UBERON material anatomical entity
Now, the wranglers will have done a decent job of sorting this mess out, but there will be errors. So the right answer is really purely dependent on the values the various wranglers have put in these fields over the years, and how messy it has gotten. Therefore what follows is a guess…
I suggest we rename organ -> anatomical entity and then leave organ part and model organ the same. We should probably do an audit of the data to check it isn’t too messy. In the meantime, I don’t think anyone will quibble with this layout too much (but I may be wrong).
Email from Tony Burdett to John Randell:
Following up on my previous email, KT, Trevor and I have discussed this internally and done some investigation with our teams. I wanted to let you know a little bit about our plans in case you had any concerns or wanted to provide any additional feedback.
Our immediate response to your request will be to change the labelling of “organ” to “anatomical entity”. This more accurately reflects the content we’ve collected, whilst causing minimal immediate disruption. It’s also something we can enact quickly - we expect a two-week turnaround time - rather than a much more substantive change that may take some time.
Secondly, there are some longer term enhancements needed. Previously I flagged a couple of issues (“what is an organ?” and “consistent labelling”), and our investigation found that there are actually only a very small number of issues relating to labelling - three or four cases like “Pancreas” and “pancreas” in the organ fields. Vastly more are due to a vague definition of organ, either:
• Recording anatomical terms inaccurately as organs due to the lack of a better place to record the detail (e.g. “embryo”)
• Recording anatomical terms that are technically accurate according to the UBERON ontology, but which look odd when compared to common usage (e.g. “molar tooth”)
These bigger issues require a number of interventions to our data model and curation of the data itself, and may take some time to work through.
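To illustrate the labelling side of such an audit, here is a minimal Python sketch that groups organ values differing only in case or whitespace, such as “Pancreas” and “pancreas”. The function name `find_label_variants` and the sample values are hypothetical, chosen for illustration; this is not the tooling the team actually used, and it does not address the larger question of what counts as an organ.

```python
from collections import defaultdict


def find_label_variants(labels):
    """Group labels that differ only by case or surrounding/repeated whitespace.

    Returns a dict mapping each normalised label to the sorted list of
    distinct raw spellings, keeping only labels with more than one spelling.
    """
    groups = defaultdict(set)
    for label in labels:
        # Collapse whitespace and ignore case to build the comparison key.
        key = " ".join(label.split()).casefold()
        groups[key].add(label)
    return {key: sorted(raw) for key, raw in groups.items() if len(raw) > 1}


# Hypothetical sample of values from an "organ" field.
sample = ["Pancreas", "pancreas", "embryo", "molar tooth", "pancreas"]
variants = find_label_variants(sample)
print(variants)
```

Values such as “embryo” or “molar tooth” pass this check untouched, which matches the point above: consistent spelling is the small problem, and the definition of organ is the large one.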
We considered removing the “organ” data altogether until we had resolved these secondary issues, but we felt the DCP benefits from showing anatomical information, and if we simply removed it whilst we restructured organs correctly, it would be difficult to judge how long it might take to restore. I have also been in discussion with Ellen Todres about how to better align DCP standards around anatomy to the terminology used by the bionetworks, and in parallel I had a conversation with Katy Borner of HubMap about their common coordinate framework. I think there’s useful momentum around anatomy and coordinate frameworks right now, and the anatomical metadata in the DCP is far richer than most other initiatives have collected. Presenting the data as clearly as possible - even if in places it is somewhat ambiguous - is a very useful conversation starter to gather feedback. We do agree it is important not to be visibly scientifically inaccurate though (!), which is why we opted for the relabelling approach.
Please let us know if you have any concerns about this approach, and please feed these plans back to others in the executive office if it is useful to align further.