ncihtan / data-models

Schema.org Data Models for HTAN
MIT License

Near-duplicate valid values by title/lower-case differences #176

Open adamjtaylor opened 1 year ago

adamjtaylor commented 1 year ago

91 valid values appear with more than one capitalisation: the variants listed below become identical when changed to lowercase.

These may need to be harmonised on a case by case basis.

Opening as a single issue here to decide on a coherent approach.

lowercase,original,length
alpha-1 antitrypsin,"['Alpha-1 Antitrypsin', 'Alpha-1 antitrypsin']",2
am,"['Am', 'am']",2
ascending colon,"['Ascending Colon', 'Ascending colon']",2
biliary disorder,"['Biliary Disorder', 'Biliary disorder']",2
blood draw,"['Blood Draw', 'Blood draw']",2
bone marrow,"['Bone Marrow', 'Bone marrow']",2
brain stem,"['Brain Stem', 'Brain stem']",2
broad ligament,"['Broad Ligament', 'Broad ligament']",2
carotid body,"['Carotid Body', 'Carotid body']",2
cervix uteri,"['Cervix Uteri', 'Cervix uteri']",2
descending colon,"['Descending Colon', 'Descending colon']",2
doc,"['DOC', 'doc']",2
dysphagia,"['Dysphagia', 'dysphagia']",2
dyspnea,"['Dyspnea', 'dyspnea']",2
ewing sarcoma,"['Ewing Sarcoma', 'Ewing sarcoma']",2
fallopian tube,"['Fallopian Tube', 'Fallopian tube']",2
false,"['False', 'false']",2
familial adenomatous polyposis,"['Familial Adenomatous Polyposis', 'Familial adenomatous polyposis']",2
fanconi anemia,"['Fanconi Anemia', 'Fanconi anemia']",2
female,"['Female', 'female']",2
frontal lobe,"['Frontal Lobe', 'Frontal lobe']",2
fundus of stomach,"['Fundus of Stomach', 'Fundus of stomach']",2
gorlin syndrome,"['Gorlin Syndrome', 'Gorlin syndrome']",2
hard palate,"['Hard Palate', 'Hard palate']",2
head face or neck nos,"['Head Face Or Neck Nos', 'Head face or neck NOS']",2
high risk,"['High Risk', 'High risk']",2
intermediate risk,"['Intermediate Risk', 'Intermediate risk']",2
json,"['JSON', 'json']",2
kaposi sarcoma,"['Kaposi Sarcoma', 'Kaposi sarcoma']",2
lacrimal gland,"['Lacrimal Gland', 'Lacrimal gland']",2
li-fraumeni syndrome,"['Li-Fraumeni Syndrome', 'Li-Fraumeni syndrome']",2
low risk,"['Low Risk', 'Low risk']",2
lymph node nos,"['Lymph Node NOS', 'Lymph node NOS']",2
lynch syndrome,"['Lynch Syndrome', 'Lynch syndrome']",2
m,"['M', 'm']",2
male,"['Male', 'male']",2
md,"['Md', 'md']",2
mr-minimal/marginal response,"['MR-Minimal/Marginal Response', 'MR-Minimal/Marginal response']",2
multiple myeloma,"['Multiple Myeloma', 'Multiple myeloma']",2
na,"['NA', 'Na']",2
nasal cavity,"['Nasal Cavity', 'Nasal cavity']",2
no,"['NO', 'No', 'no']",3
no source documentation,"['No Source Documentation', 'No source documentation']",2
not allowed to collect,"['Not Allowed To Collect', 'not allowed to collect']",2
not applicable,"['Not Applicable', 'Not applicable']",2
not done,"['Not Done', 'Not done']",2
not performed,"['Not Performed', 'Not performed']",2
not specified,"['Not Specified', 'Not specified']",2
other,"['Other', 'other']",2
paraspinal ganglion,"['Paraspinal Ganglion', 'Paraspinal ganglion']",2
parotid gland,"['Parotid Gland', 'Parotid gland']",2
pineal gland,"['Pineal Gland', 'Pineal gland']",2
pituitary gland,"['Pituitary Gland', 'Pituitary gland']",2
pneumonia,"['Pneumonia', 'pneumonia']",2
productive cough,"['Productive Cough', 'productive cough']",2
prostate gland,"['Prostate Gland', 'Prostate gland']",2
rectosigmoid junction,"['Rectosigmoid Junction', 'Rectosigmoid junction']",2
round ligament,"['Round Ligament', 'Round ligament']",2
rubinstein-taybi syndrome,"['Rubinstein-Taybi Syndrome', 'Rubinstein-Taybi syndrome']",2
sigmoid colon,"['Sigmoid Colon', 'Sigmoid colon']",2
skin nos,"['Skin NOS', 'skin NOS']",2
skin of abdomen,"['Skin of abdomen', 'skin of abdomen']",2
skin of back,"['Skin of back', 'skin of back']",2
skin of breast,"['Skin of breast', 'skin of breast']",2
skin of chest,"['Skin of chest', 'skin of chest']",2
skin of ear,"['Skin of ear', 'skin of ear']",2
skin of eye lid,"['Skin of eye lid', 'skin of eye lid']",2
skin of lip,"['Skin of lip', 'skin of lip']",2
skin of lower limb and hip,"['Skin of lower limb and hip', 'skin of lower limb and hip']",2
skin of neck,"['Skin of neck', 'skin of neck']",2
skin of nose,"['Skin of nose', 'skin of nose']",2
skin of other parts of face,"['Skin of other parts of face', 'skin of other parts of face']",2
skin of palm,"['Skin of palm', 'skin of palm']",2
skin of penis,"['Skin of penis', 'skin of penis']",2
skin of scalp,"['Skin of scalp', 'skin of scalp']",2
skin of scrotum,"['Skin of scrotum', 'skin of scrotum']",2
skin of sole,"['Skin of sole', 'skin of sole']",2
skin of upper limb and shoulder,"['Skin of upper limb and shoulder', 'skin of upper limb and shoulder']",2
skin of vulva,"['Skin of vulva', 'skin of vulva']",2
sleep apnea,"['Sleep Apnea', 'Sleep apnea']",2
spinal cord,"['Spinal Cord', 'Spinal cord']",2
thyroid gland,"['Thyroid Gland', 'Thyroid gland']",2
transverse colon,"['Transverse Colon', 'Transverse colon']",2
true,"['True', 'true']",2
turcot syndrome,"['Turcot Syndrome', 'Turcot syndrome']",2
undescended testis,"['Undescended Testis', 'Undescended testis']",2
unknown,"['UNKNOWN', 'Unknown', 'unknown']",3
unspecified,"['Unspecified', 'unspecified']",2
vas deferens,"['Vas Deferens', 'Vas deferens']",2
wheezing,"['Wheezing', 'wheezing']",2
yes,"['YES', 'Yes', 'yes']",3
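
For reference, a listing like the one above can be produced by grouping values on their lowercase form. This is only a minimal sketch, not the script that generated the table: the function name `find_case_collisions` is hypothetical, and it assumes the valid values have already been pulled out of the data model into a flat list of strings.

```python
from collections import defaultdict


def find_case_collisions(valid_values):
    """Group values that become identical once lowercased.

    Returns {lowercase form: sorted list of distinct original spellings},
    keeping only the groups that have more than one spelling.
    """
    groups = defaultdict(set)
    for value in valid_values:
        groups[value.lower()].add(value)
    return {
        key: sorted(spellings)
        for key, spellings in sorted(groups.items())
        if len(spellings) > 1
    }


# A few values from the table above, plus one made-up value ("Blood")
# that has no case variant and is therefore not reported.
sample = ["Bone Marrow", "Bone marrow", "YES", "Yes", "yes", "Blood"]
for lowercase, originals in find_case_collisions(sample).items():
    # Same column order as the listing: lowercase, original, length
    print(f"{lowercase},{originals},{len(originals)}")
```

Using a set per group means a value repeated under several attributes is only counted once per distinct spelling.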
aclayton555 commented 1 year ago

Discussed on 2023.03.08 at mid-sprint: this is not something we are going to go back and fix. Instead, we will establish a rule to use going forward. Suggested that we use "Thyroid gland" style (only the first word capitalised) going forward. In an ideal world, Elvira would pursue "thyroid gland" style (all lower case, unless something specific requires capitals, e.g. "PCR").
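
A rough way to flag new values that break the "Thyroid gland" rule could look like the sketch below. It is a heuristic under stated assumptions, not an agreed implementation: the name `violates_sentence_case` is hypothetical, all-uppercase tokens such as "NOS" are treated as acronyms and allowed, and a legitimate proper noun after the first word would be flagged and need manual review.

```python
def violates_sentence_case(value):
    """Heuristically check a value against the "Thyroid gland" style.

    Flags a value if it starts with a lowercase letter, or if any word
    after the first is title-cased (capital letter followed by lowercase).
    All-uppercase tokens such as "NOS" are allowed as acronyms.
    """
    words = value.split()
    if not words:
        return False
    first_char = words[0][0]
    if first_char.isalpha() and first_char.islower():
        return True
    for word in words[1:]:
        # "Gland" is flagged; "gland" and "NOS" are not.
        if word[0].isupper() and word[1:].islower():
            return True
    return False


for candidate in ["Thyroid gland", "Thyroid Gland", "Lymph node NOS", "skin of back"]:
    print(candidate, violates_sentence_case(candidate))
```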

Side note: Brynn developed a script that checks that valid values are used consistently whenever new ones are added.

Action: submit a feature request to give users feedback when an error is thrown because of case sensitivity. Right now, if a value fails validation only because of its capitalisation, the cause is not obvious to the user.
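
To illustrate the kind of feedback meant here, a validator could fall back to a case-insensitive comparison when an exact match fails and report the closest spelling. This is a standalone sketch only: `case_mismatch_hint` is a hypothetical helper, not part of any existing validation tool, and the message wording is an assumption.

```python
def case_mismatch_hint(submitted, valid_values):
    """Return a hint if `submitted` fails only because of capitalisation.

    Returns None when the value is valid as-is, or when no valid value
    matches it even case-insensitively.
    """
    if submitted in valid_values:
        return None
    matches = sorted(
        v for v in set(valid_values) if v.lower() == submitted.lower()
    )
    if not matches:
        return None
    options = ", ".join(repr(m) for m in matches)
    return (
        f"{submitted!r} is not a valid value, "
        f"but it differs only by case from: {options}"
    )


print(case_mismatch_hint("thyroid gland", ["Thyroid gland", "Pineal gland"]))
```

Returning None for both "valid" and "no near match" keeps the helper focused on the one case where the extra hint is useful.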

aclayton555 commented 1 year ago

Revisit this in the context of Kristen's review of clinical attributes. There are also ongoing developments with FAIR (Mialy) to implement a check, so the timing is good.

adamjtaylor commented 11 months ago

Re-backlog: we will make these changes when other clinical data model changes are made.