Open nicolevasilevsky opened 11 months ago
Thanks @nicolevasilevsky for relaying this.
I just did a quick investigation to see how we're currently treating these types.
The types, defined by their prefix characters (*
, etc) are declared here in this short file omim_type.py
.
There is some branching logic that uses these types here. Some things I noticed which perhaps are in conflict with what Ada said:
*
and +
Maybe this isn't a conflict. I just didn't see the comment from Ada mention *
.
So I don't know if she would think they should be handled differently or the same, but currently they are are handled the same (at least in the file linked above):
elif omim_type == OmimType.GENE.value or omim_type == OmimType.HAS_AFFECTED_FEATURE.value: # * or +
omim_type = OmimType.GENE.value
graph.add((omim_uri, RDFS.label, Literal(abbrev)))
graph.add((omim_uri, RDFS.subClassOf, SO['0000704']))
graph.add((omim_uri, BIOLINK['category'], BIOLINK['Gene']))
^
and NULL
^
, but I don't know if that's a mistake or intentional.else: # ^ or NULL (no prefix character)
graph.add((omim_uri, RDFS.label, Literal(cleanup_label(label))))
I'm not sure how much of a concern this is; I marked as low urgency.
This needs to be raised at a tech call and has moderate urgency
Chris recommends writing unit tests in ingest and collect one example of each of these. This will be a comparison.
All ingest code should have unit tests with examples.
@nicolevasilevsky Crap, I realized now that we didn't discuss this one at the last meeting I don't think.
- There are some entries in OMIM that should be organized into PS, but have not been yet.
I don't know what the issue is here. Do you? Or do you think we need to discuss again at a meeting at some point?
I think this is something that should be discussed on a tech call. I don't remember all of the context.
Overview
From Ada: There are two big caveats for OMIM: Be concerned about (and likely exclude) null entries (those without a #, +, or % in front of them. There are some entries in OMIM that should be organized into PS, but have not been yet.
Sub-task list
Sub-task details
1. Unit tests for prefix types
Be concerned about (and likely exclude) null entries (those without a #, +, or % in front of them Chris recommends writing unit tests in ingest and collect one example of each of these. This will be a comparison.