saramsey closed 3 years ago
Okay, I started investigating this, and figured I would start by seeing if I could find the concept in any of the UMLS build files. I did so by running find . -name "umls*" -type f -exec grep C0178572 {} +;
from the kg2-build
directory, which returned the following:
./umls-mth.json: "obj" : "http://purl.bioontology.org/ontology/MTH/C0178572"
./umls-mth.ttl: <http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0178572> ;
All of the links lead to "the page you are looking for wasn't found". Not super sure where to go from here.
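For anyone who would rather script the search than run find/grep by hand, here is a rough Python equivalent. It is only a sketch: the function name find_cui_mentions is made up for illustration, and it assumes the build files are plain text that can be read line by line.

```python
from pathlib import Path


def find_cui_mentions(build_dir, cui, pattern="umls*"):
    """Yield (path, line_number, line) for each line mentioning `cui`.

    Only regular files under `build_dir` whose names match `pattern`
    are scanned, mirroring `find . -name "umls*" -type f -exec grep ...`.
    """
    for path in sorted(Path(build_dir).rglob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a file has odd bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if cui in line:
                    yield path, line_number, line.rstrip("\n")
```

Calling list(find_cui_mentions("kg2-build", "C0178572")) from the home directory should list the same matching lines as the grep, with line numbers added.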
@saramsey could you walk me through how and where you found the info mentioned in this comment?
OK, here is what KG2.3.5 knows about UMLS:C0178572
:
{
"iri": "https://identifiers.org/umls:C0178572",
"category_label": "named_thing",
"deprecated": "False",
"provided_by": "umls_source:MTH",
"id": "UMLS:C0178572",
"category": "biolink:NamedThing",
"update_date": "2020"
}
which I got by running this Cypher query:
match (n {id: 'UMLS:C0178572'}) return n;
Note that there is no name
field, which is kind of weird (I guess it is null
so not displayed by Neo4j).
But going to the hyperlink https://identifiers.org/umls:C0178572, I see that this node does have a name according to Linked Life Data: its name is court.
So, @kvarforl can you please check umls-mth.ttl
to verify that UMLS concept C0178572
has no name field, at least in that TTL file? You can just paste the TTL record for that concept, here in the issue.
hmm okay, here are the contents of the only occurrence of C0178572
, from running grep -C 10 "C0178572" kg2-build/umls-mth.ttl
on kg2steve:
<http://purl.bioontology.org/ontology/MTH/C0022433> a owl:Class ;
skos:prefLabel """Principles of law and justice"""@en ;
skos:notation """C0022433"""^^xsd:string ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0016556> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0680513> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0016557> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0086530> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0178572> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0178675> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0014649> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0013277> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0220868> ;
<http://purl.bioontology.org/ontology/MTH/RO> <http://purl.bioontology.org/ontology/MTH/C0362060> ;
UMLS:has_cui """C0022433"""^^xsd:string ;
UMLS:has_tui """T064"""^^xsd:string ;
UMLS:has_sty <http://purl.bioontology.org/ontology/STY/T064> ;
.
Just rejoining this thread after a month (sorry). So, it looks like the UMLS Metathesaurus file umls-mth.ttl may be incomplete, since it doesn't appear to define the concept http://purl.bioontology.org/ontology/MTH/C0178572 but it does clearly cross-reference it. You could check the umls-mth.ttl file to see if it appears to be truncated (IIRC, it should end with a bunch of turtle statements about semantic types, if it is complete). Another option is to check the UMLS MySQL database to see if there is a row in the MRCONSO table corresponding to UMLS concept C0178572.
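The "referenced but never defined" situation can be checked mechanically. The sketch below is a line-based heuristic, not a real Turtle parser: it assumes class definitions always start a line with "<IRI> a owl:Class", as in the excerpt above, and the helper name classify_iri is hypothetical.

```python
import re

# Assumption: a class definition begins a line with "<IRI> a owl:Class".
DEFINITION_RE = re.compile(r"^\s*<([^>]+)>\s+a\s+owl:Class\b")
IRI_RE = re.compile(r"<([^>]+)>")


def classify_iri(turtle_lines, iri):
    """Return (is_defined, mention_count) for `iri`.

    is_defined is True if some line declares `iri` as an owl:Class;
    mention_count counts every other occurrence of `iri` in angle brackets.
    """
    is_defined = False
    mention_count = 0
    for line in turtle_lines:
        match = DEFINITION_RE.match(line)
        if match and match.group(1) == iri:
            is_defined = True
            continue
        mention_count += IRI_RE.findall(line).count(iri)
    return is_defined, mention_count


MTH = "http://purl.bioontology.org/ontology/MTH/"
sample = [
    "<" + MTH + "C0022433> a owl:Class ;",
    'skos:prefLabel """Principles of law and justice"""@en ;',
    "<" + MTH + "RO> <" + MTH + "C0178572> ;",
    ".",
]
print(classify_iri(sample, MTH + "C0178572"))  # (False, 1)
```

On the excerpt, C0178572 comes back as mentioned once and never defined, which matches what the grep output suggests. Passing an open file handle for umls-mth.ttl instead of the sample list should work the same way.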
hmm okay, the tail of umls-mth.ttl
looks like this:
ubuntu@ip-172-31-59-26:~$ tail kg2-build/umls-mth.ttl
<http://purl.bioontology.org/ontology/STY/T025> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T021> .
<http://purl.bioontology.org/ontology/STY/T091> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T090> .
<http://purl.bioontology.org/ontology/STY/T203> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T074> .
<http://purl.bioontology.org/ontology/STY/T042> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T039> .
<http://purl.bioontology.org/ontology/STY/T020> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T190> .
<http://purl.bioontology.org/ontology/STY/T102> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T077> .
<http://purl.bioontology.org/ontology/STY/T129> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T123> .
<http://purl.bioontology.org/ontology/STY/T049> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T046> .
<http://purl.bioontology.org/ontology/STY/T046> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T038> .
<http://purl.bioontology.org/ontology/STY/T204> rdfs:subClassOf <http://purl.bioontology.org/ontology/STY/T001> .
which, to my mostly untrained eye, looks like a bunch of turtle statements about semantic types.
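If this truncation check needs repeating after future builds, something like the following could replace eyeballing the tail. It is a heuristic sketch only: it assumes a complete file ends with STY rdfs:subClassOf statements shaped like the ones above, and tail_looks_complete is a made-up name.

```python
import re
from collections import deque

# Assumption: a complete umls-mth.ttl ends with lines of this shape.
STY_SUBCLASS_RE = re.compile(
    r"^<http://purl\.bioontology\.org/ontology/STY/T\d+>\s+rdfs:subClassOf\s+"
    r"<http://purl\.bioontology\.org/ontology/STY/T\d+>\s*\.$"
)


def tail_looks_complete(lines, count=10):
    """True if the last `count` non-blank lines are STY subClassOf statements."""
    # deque with maxlen keeps only the final `count` lines in memory.
    tail = deque((line.strip() for line in lines if line.strip()), maxlen=count)
    return bool(tail) and all(STY_SUBCLASS_RE.match(line) for line in tail)
```

It accepts any iterable of lines, so an open file handle works. A True result only says the ending looks normal; it says nothing about records missing from the middle of the file.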
OK, in that case I think the next step is to search the table MRCONSO
to see if CUI C0178572
is in there.
https://www.ncbi.nlm.nih.gov/books/NBK9685/table/ch03.T.concept_names_and_sources_file_mr/
In the MRCONSO
table in the umls
MySQL database on kg2lindsey.rtx.ai
, we have:
mysql> select * from MRCONSO where CUI='C0178572';
+----------+-----+----+----------+-----+-----------+--------+-----------+------------+------------+-----------+-----+-----+------------+---------+-----+----------+------+
| CUI | LAT | TS | LUI | STT | SUI | ISPREF | AUI | SAUI | SCUI | SDUI | SAB | TTY | CODE | STR | SRL | SUPPRESS | CVF |
+----------+-----+----+----------+-----+-----------+--------+-----------+------------+------------+-----------+-----+-----+------------+---------+-----+----------+------+
| C0178572 | ENG | P | L0215094 | PF | S0288834 | N | A0318718 | NULL | NULL | 2724-8820 | CSP | PT | 2724-8820 | court | 0 | N | 256 |
| C0178572 | ENG | P | L0215094 | PF | S0288834 | Y | A18577800 | 0000060808 | 0000018320 | NULL | CHV | PT | 0000018320 | court | 0 | N | 256 |
| C0178572 | ENG | P | L0215094 | VO | S11872390 | Y | A18596387 | 0000060809 | 0000018320 | NULL | CHV | SY | 0000018320 | courted | 0 | N | 256 |
| C0178572 | ENG | P | L0215094 | VO | S11872392 | Y | A18596388 | 0000060810 | 0000018320 | NULL | CHV | SY | 0000018320 | courts | 0 | N | 256 |
| C0178572 | ENG | P | L0215094 | VO | S1220140 | Y | A7564539 | 12220 | NULL | NULL | PSY | ET | 12220 | Courts | 3 | N | NULL |
+----------+-----+----+----------+-----+-----------+--------+-----------+------------+------------+-----------+-----+-----+------------+---------+-----+----------+------+
The columns of the MRCONSO table are explained here: https://www.ncbi.nlm.nih.gov/books/NBK9685/table/ch03.T.concept_names_and_sources_file_mr/
Looks like the CUI C0178572
came from UMLS sources CHV
, CSP
, and PSY
. What are those sources?
CHV is the Consumer Health Vocabulary, which we do not currently include in KG2.
PSY is the Psychological Index Terms, which we do not currently include in KG2.
CSP is the CRISP Thesaurus, which we do not currently include in KG2.
So, of the three UMLS sources that have terms that map to CUI C0178572, none are in KG2. That would seem to explain why C0178572 is not in KG2. Of these, it seems the most reasonable might be to add PSY to KG2.
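The same reasoning can be written down as a small check: group the MRCONSO rows by source (the SAB column) and compare against the sources the build includes. This is a sketch under assumptions. The rows are copied from the query output above, but the included set is a placeholder, since the actual contents of umls.conf are not shown in this thread.

```python
from collections import defaultdict

# (SAB, TTY, STR) values copied from the MRCONSO rows for C0178572.
rows = [
    ("CSP", "PT", "court"),
    ("CHV", "PT", "court"),
    ("CHV", "SY", "courted"),
    ("CHV", "SY", "courts"),
    ("PSY", "ET", "Courts"),
]


def names_by_source(rows):
    """Map each source abbreviation (SAB) to the names it contributes."""
    grouped = defaultdict(list)
    for sab, _tty, name in rows:
        grouped[sab].append(name)
    return dict(grouped)


def sources_missing_from_build(rows, included_sources):
    """Return the sources that name the concept but are not in the build."""
    return sorted(set(names_by_source(rows)) - set(included_sources))


# Placeholder: stands in for whatever sources umls.conf actually enables.
included = {"MTH"}
print(sources_missing_from_build(rows, included))  # ['CHV', 'CSP', 'PSY']
```

If every source that names a CUI is missing from the build, the concept can still be cross-referenced by included sources while having no name of its own, which is the symptom seen here.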
After adding PSY to umls.conf and rerunning umls2rdf.py on kg2lindsey.rtx.ai, the following TTL block shows up in the newly generated file umls-psy.ttl:
<http://purl.bioontology.org/ontology/PSY/12220> a owl:Class ;
skos:prefLabel """Courts"""@en ;
skos:notation """12220"""^^xsd:string ;
<http://purl.bioontology.org/ontology/PSY/use> <http://purl.bioontology.org/ontology/PSY/00840> ;
<http://purl.bioontology.org/ontology/PSY/PYR> """1973"""^^xsd:string ;
UMLS:has_cui """C0178572"""^^xsd:string ;
UMLS:has_tui """T092"""^^xsd:string ;
UMLS:has_sty <http://purl.bioontology.org/ontology/STY/T092> ;
which would seem to define CUI C0178572
as expected.
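To confirm this without reading the TTL by eye, a line-based sketch like the one below can pull out the preferred label for every class block carrying a given CUI. It is not a Turtle parser: it assumes each block starts with "<IRI> a owl:Class" and lists skos:prefLabel before UMLS:has_cui, as in the block above. The name labels_for_cui is hypothetical.

```python
import re

SUBJECT_RE = re.compile(r"^\s*<([^>]+)>\s+a\s+owl:Class\b")
PREF_LABEL_RE = re.compile(r'skos:prefLabel\s+"""(.*?)"""')
HAS_CUI_RE = re.compile(r'UMLS:has_cui\s+"""(.*?)"""')


def labels_for_cui(turtle_lines, cui):
    """Return [(class_iri, pref_label)] for blocks whose UMLS:has_cui is `cui`."""
    results = []
    subject = label = None
    for line in turtle_lines:
        start = SUBJECT_RE.match(line)
        if start:
            # A new class block begins; forget the previous block's label.
            subject, label = start.group(1), None
            continue
        if subject is None:
            continue
        found_label = PREF_LABEL_RE.search(line)
        if found_label:
            label = found_label.group(1)
        found_cui = HAS_CUI_RE.search(line)
        if found_cui and found_cui.group(1) == cui:
            results.append((subject, label))
    return results


psy_block = [
    "<http://purl.bioontology.org/ontology/PSY/12220> a owl:Class ;",
    'skos:prefLabel """Courts"""@en ;',
    'skos:notation """12220"""^^xsd:string ;',
    'UMLS:has_cui """C0178572"""^^xsd:string ;',
    'UMLS:has_tui """T092"""^^xsd:string ;',
]
print(labels_for_cui(psy_block, "C0178572"))
# [('http://purl.bioontology.org/ontology/PSY/12220', 'Courts')]
```

An empty result for a CUI across all the generated TTL files would point to the same kind of missing-source gap found here.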
OK, I think this should be fixed now. Testing needed.
This appears to be fixed in KG2.5.2. From Neo4j:
{
"iri": "https://identifiers.org/umls:C0178572",
"category_label": "agent",
"deprecated": "False",
"name": "Courts",
"provided_by": "identifiers_org_registry:umls",
"id": "UMLS:C0178572",
"category": "biolink:Agent",
"update_date": "2004"
}
See:
https://github.com/RTXteam/RTX/issues/1127#issuecomment-730517337