monarch-initiative / mondo-ingest

Coordinating the mondo-ingest with external sources
https://monarch-initiative.github.io/mondo-ingest/

Weird `" "` in `ICD10CM:I70-I79` label in `mondo.sssom.tsv` #517

Open joeflack4 opened 1 month ago

joeflack4 commented 1 month ago

Overview

I just noticed a weird extra `" "` in the middle of the label for `ICD10CM:I70-I79` in the `mondo.sssom.tsv` that I recently built in mondo-ingest.

Excerpt from `mondo.sssom.tsv` (the affected row is the middle one):

"subject_id subject_label   predicate_id    object_id   object_label    mapping_justification"
"MONDO:0005385  vascular disorder   skos:exactMatch ICD10CM:I00-I99 Diseases of the circulatory system (I00-I99)    semapv:UnspecifiedMatching"
"MONDO:0005385  vascular disorder   skos:exactMatch ICD10CM:I70-I79 Diseases of arteries"   " arterioles and capillaries (I70-I79)  semapv:UnspecifiedMatching"
"MONDO:0005385  vascular disorder   skos:exactMatch NCIT:C35117 Vascular Disorder   semapv:UnspecifiedMatching"

Investigation

No idea what's causing this. I looked through the files in both the mondo-ingest and mondo repos where the label for `ICD10CM:I70-I79` appears, but the extra `" "` only shows up in the `mondo.sssom.tsv` built in mondo-ingest. I think the problem has appeared the last 2 times I've built `mondo.sssom.tsv` there. The `mondo.sssom.tsv` in the mondo repo doesn't have this issue; it says it was last updated on May 9 17:38, though the git blame for the row says it hasn't changed since 3/24/2023.
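One quick check that might help narrow this down is whether the stray `" "` means the label got split into two columns. The sketch below flags rows whose tab-separated field count differs from the header. It is only a minimal illustration: `find_ragged_rows` and `SAMPLE` are made-up names, and the sample assumes the quote pair sits on either side of an extra tab, which I haven't confirmed in the real file.

```python
def find_ragged_rows(tsv_text):
    """Return (line_number, fields) for data rows whose field count differs from the header.

    Lines starting with '#' are skipped, since SSSOM TSV files carry their
    metadata in a commented block before the header row.
    """
    header = None
    ragged = []
    for line_no, line in enumerate(tsv_text.splitlines(), start=1):
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if header is None:
            header = fields  # first non-comment line is the column header
            continue
        if len(fields) != len(header):
            ragged.append((line_no, fields))
    return ragged


# Hypothetical sample modeled on the excerpt above. The extra tab inside the
# I70-I79 label is an assumption made for illustration.
SAMPLE = "\n".join([
    "# (metadata comment line)",
    "\t".join(["subject_id", "subject_label", "predicate_id",
               "object_id", "object_label", "mapping_justification"]),
    "\t".join(["MONDO:0005385", "vascular disorder", "skos:exactMatch",
               "ICD10CM:I00-I99", "Diseases of the circulatory system (I00-I99)",
               "semapv:UnspecifiedMatching"]),
    "\t".join(["MONDO:0005385", "vascular disorder", "skos:exactMatch",
               "ICD10CM:I70-I79", 'Diseases of arteries"',
               '" arterioles and capillaries (I70-I79)',
               "semapv:UnspecifiedMatching"]),
    "\t".join(["MONDO:0005385", "vascular disorder", "skos:exactMatch",
               "NCIT:C35117", "Vascular Disorder",
               "semapv:UnspecifiedMatching"]),
])

for line_no, fields in find_ragged_rows(SAMPLE):
    print(line_no, len(fields), fields)
```

To run it against a real build, read the file contents (for example with `open(path, encoding="utf-8").read()`) and pass the text in. If nothing is flagged, the `" "` is probably inside a single field rather than spanning two columns.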

joeflack4 commented 1 month ago

@matentzn @twhetzel FYI super low priority but weird.