mgurley closed 5 years ago
We could. And we don't have to map to 0. We just don't map, which means the ETL will put in 0.
@dimshitc No ingesting of any flavor of 'Unknown'.
Wait. We would ingest, but not map to anything, and not make it standard. Right?
Right, the ETL should be able to distinguish between a real number and a code for unknown. Gleason's score = 5 means "5" goes to value_as_number; Gleason's score = 998 (No prostatectomy/autopsy performed) means value_as_concept_id = 0. So we need to have the list of the source concepts indicating Unknown. Just non-standard, without mapping.
I think that currently a flavor of null (no value) is stored as NULL, not 0.
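The rule discussed above could be sketched roughly as follows. This is only an illustration, not the actual ETL: `UNKNOWN_SOURCE_CODES`, `MeasurementValue`, and `split_value` are hypothetical names, only the 998 code comes from the thread, and leaving value_as_concept_id empty for real numbers is an assumption. Because the thread has not settled whether an unmapped value ends up as 0 or NULL, that choice is a parameter.

```python
from typing import NamedTuple, Optional

# Hypothetical list of source codes meaning "unknown / not performed" for
# Gleason's score. Only 998 is taken from the thread; a real list would
# have to be curated per source variable.
UNKNOWN_SOURCE_CODES = {"998"}


class MeasurementValue(NamedTuple):
    value_as_number: Optional[float]
    value_as_concept_id: Optional[int]
    value_source_value: str


def split_value(source_value: str,
                unmapped_concept_id: Optional[int] = 0) -> MeasurementValue:
    """Route a raw source value to value_as_number or value_as_concept_id.

    unmapped_concept_id is what the ETL writes when the source code has no
    standard mapping (0 per one comment, NULL/None per another).
    """
    code = source_value.strip()
    if code in UNKNOWN_SOURCE_CODES:
        # Code for unknown: no number, and no standard concept to map to.
        return MeasurementValue(None, unmapped_concept_id, code)
    # Real number. Assumption: value_as_concept_id is left empty here.
    return MeasurementValue(float(code), None, code)
```

With the default, `split_value("998")` carries value_as_concept_id 0; passing `unmapped_concept_id=None` gives the NULL behavior instead.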
There seem to be quite a few flavors of unknown that are still standard concepts. I think we should revisit the decision to make flavors of unknown non-standard. Trying to determine all that should be made non-standard has enormous curation overhead. Here is just a sampling of numeric concepts. Many should indeed be standard concepts, but many others should not.
value_as_concept_id | concept_name | concept_code | standard_concept |
---|---|---|---|
35934209 | No involved regional nodes | melanoma_nasal_cavity@2880@000 | S |
35933329 | No surgical specimen from primary site | sinus_maxillary@2865@998 | S |
35939229 | "Ascites present, determined to be non-malignant" | ovary@2920@991 | S |
35934159 | Test not done (test not ordered and not performed) | breast@2866@998 | S |
35936976 | "Described as ""greater than 5 mm""" | colon@2930@996 | S |
35933006 | No para-aortic lymph nodes examined | corpus_carcinoma@2920@098 | S |
35919073 | "Tumor Deposits identified, number unknown" | 3934@X2 | S |
35936528 | "Described as ""less than 45 cm"" or ""greater than 40 cm"" or ""between 40 and 45 cm""" | esophagus_gejunction@2910@995 | S |
35941045 | No sentinel nodes were biopsied | 835@98 | S |
35940004 | "Regional lymph node(s) involved, size not stated;Unknown if regional lymph nodes involved;Not documented in patient record" | oropharynx@2880@999 | S |
35929615 | "Pelvic lymph nodes surgically removed, but number of nodes unknown/not stated and not documented as a sampling or dissection" | corpus_carcinoma@2910@098 | S |
35923183 | "Biopsy cores examined, number unknown" | prostate@2867@991 | S |
35926018 | No regional lymph nodes involved | bladder@2890@000 | S |
35933485 | "No resection of primary site ;Surgical procedure did not remove enough tissue to measure the CRM;(Examples include: polypectomy only, excision of tumor only or excisional biopsy only)" | rectum@2930@998 | S |
35919025 | Stated as 91-100% | 3826@R99 | S |
35927687 | "Poorly differentiated tumor present, percent not stated" | penis@2865@990 | S |
35923562 | No histologic specimen from primary site | pancreas_other@2900@998 | S |
35936175 | No pelvic nodes examined | corpus_sarcoma@2900@098 | S |
35935647 | No lung metastasis resected | bone@2910@000 | S |
35935629 | "Positive nodes, number unspecified" | breast@2900@097 | S |
35925275 | Test not done (test not ordered and not performed) | melanoma_skin@2930@998 | S |
35937834 | Ascites not assessed | ovary@2920@998 | S |
35940097 | 9.80 millimeters or larger ;(Includes cases converted from codes 981-989 during conversion to V0200) | melanoma_skin@2880@980 | S |
35921422 | Test not done (test not ordered and not performed) | breast@2877@998 | S |
35934941 | Test not done (test not ordered and not performed) | pancreas_body_tail@2890@998 | S |
35925130 | Microinvasion; microscopic focus or foci only and no depth given;Not documented in patient record;Unknown; depth not stated | melanoma_skin@2880@999 | S |
35924599 | No involved regional nodes | palate_soft@2880@000 | S |
35929661 | "Margins clear, distance from tumor not stated;CRM negative, NOS" | rectum@2930@991 | S |
35919794 | "Margins clear, distance from tumor not stated;Circumferential or radial resection margin negative, NOS;No residual tumor identified on specimen" | 3823@XX.1 | S |
35927141 | Test not done (test not ordered and not performed) | pancreas_head@2890@998 | S |
35926762 | No ascites present | ovary@2920@995 | S |
35919717 | "Positive nodes, number unspecified" | 3882@X5 | S |
35924035 | No para-aortic nodes examined. | corpus_sarcoma@2930@000 | S |
35921592 | All pelvic nodes examined negative. | corpus_sarcoma@2900@000 | S |
35937584 | Test not done (test not ordered and not performed) | small_intestine@2900@998 | S |
35929763 | All nodes examined negative for cancer involvement;All nodes examined negative for extracapsular tumor | esophagus_gejunction@2900@000 | S |
35937192 | "Regional lymph node(s) involved, size not stated;Unknown if regional lymph nodes involved;Not documented in patient record" | parotid_gland@2880@999 | S |
35926587 | No pelvic nodes examined | corpus_carcinoma@2900@098 | S |
35935968 | All para-aortic lymph nodes examined negative | corpus_carcinoma@2920@000 | S |
35920317 | Test not done (test not ordered and not performed) | rectum@2900@998 | S |
35928156 | "Stated as ""less than 1 mitosis/square mm"";Stated as ""nonmitogenic""" | melanoma_skin@2861@990 | S |
35938930 | Ratio of less than 1.00 | breast@2864@991 | S |
35932760 | No needle core biopsy performed | prostate@2867@998 | S |
35923385 | No needle core biopsy performed | prostate@2866@998 | S |
35936197 | "No resection of primary site ;Surgical procedure did not remove enough tissue to measure the CRM;(Examples include: polypectomy only, excision of tumor only or excisional biopsy only)" | colon@2930@998 | S |
35921947 | 98.0 or greater U/ml | bile_ducts_intrahepat@2866@980 | S |
35926869 | 98.0 or greater ng/ml | rectum@2900@980 | S |
35930032 | No involved regional nodes | salivary_gland_other@2880@000 | S |
35931178 | No histopathologic examination of lymph nodes | esophagus_gejunction@2900@998 | S |
35921452 | No histologic specimen from primary site | pancreas_head@2900@998 | S |
35937902 | 98.0 or greater U/ml | pancreas_head@2880@980 | S |
35937853 | No axillary nodes examined | breast@2900@098 | S |
35926779 | No histologic examination of primary site;AND/OR;No neoadjuvant chemotherapy | bone@2900@998 | S |
35926474 | No histologic examination of primary site;Test not done (test not ordered and not performed) | cns_other@2890@998 | S |
35935843 | No regional lymph node(s) involved | penis@2870@000 | S |
35933389 | Test not done (test not ordered and not performed) | breast@2864@998 | S |
35924616 | No regional lymph node(s) involved | kidney_parenchyma@2861@000 | S |
35930350 | No histologic examination of primary site. | melanoma_skin@2861@998 | S |
35919126 | "Not applicable, invasive case" | 3903@XX6 | S |
35927137 | No histologic examination of primary site;Test not done (test not ordered and not performed) | brain@2890@998 | S |
35919737 | Stated as 91-100% | 3914@R99 | S |
35942004 | "Sentinel lymph nodes were biopsied, but the number is unknown" | 834@98 | S |
35931557 | 98.0 ng/ml or greater | prostate@2880@980 | S |
35930309 | No involved regional nodes | sinus_maxillary@2880@000 | S |
35922673 | "Positive nodes, not stated if extracapsular tumor present" | esophagus_gejunction@2900@990 | S |
35919590 | "PR negative, or stated as less than 1%" | 3914@000 | S |
35919105 | No tumor deposits | 3934@00 | S |
35935407 | No prostatectomy/autopsy performed | prostate@2864@998 | S |
35923109 | No surgical resection of primary site | colon@2910@998 | S |
35931111 | "Regional lymph node(s) involved, size not stated;Unknown if regional lymph nodes involved;Not documented in patient record" | salivary_gland_other@2880@999 | S |
35938317 | "Described as ""less than 45 cm"" or ""greater than 40 cm"" or ""between 40 and 45 cm""" | esophagus_gejunction@2920@995 | S |
35919049 | Stated as 1-10% | 3914@R10 | S |
35919519 | "ER negative, or stated as less than 1%" | 3826@000 | S |
35929079 | No involved regional nodes | nasal_cavity@2880@000 | S |
35920861 | "TD identified, number unknown" | rectum@2910@990 | S |
35934701 | 0.0 mitoses per 10 high-power fields (HPF) (40x field);0.0 mitoses per 2 square millimeters (mm);Mitoses absent;No mitoses present | net_small_intestine@2930@000 | S |
35921147 | No pelvic lymph nodes examined | corpus_carcinoma@2910@000 | S |
35919087 | "Not applicable, in situ case" | 3904@XX6 | S |
35925936 | No pelvic lymph nodes examined | corpus_sarcoma@2910@000 | S |
35941132 | It is unknown whether sentinel nodes are positive; not applicable; not stated in patient record | 835@99 | S |
35922956 | No surgical resection of primary site | rectum@2910@998 | S |
35919455 | No mass/tumor found | 754@000 | S |
35919382 | All ipsilateral axillary nodes examined negative | 3882@00 | S |
35919582 | Positive aspiration or needle core biopsy of lymph node(s) | 3882@X6 | S |
35922662 | No involved regional nodes | parotid_gland@2880@000 | S |
35929093 | 11 or more mitoses per square mm | melanoma_skin@2861@011 | S |
35931549 | "TD identified, number unknown" | colon@2910@990 | S |
35932575 | Test not done (test not ordered and not performed) | pancreas_other@2890@998 | S |
35935200 | Test not done (test not ordered and not performed) | stomach@2869@998 | S |
35936908 | 98.0 or greater ng/ml | colon@2900@980 | S |
35930658 | No histopathologic examination of lymph nodes | esophagus@2900@998 | S |
35921171 | 0 mitoses per square millimeter (mm);Mitoses absent;No mitoses present | melanoma_skin@2861@000 | S |
35933938 | No histologic specimen from primary site | pancreas_body_tail@2900@998 | S |
35922253 | "Described as ""less than 1 centimeter (cm)"";;Stated as T1b with no other information on tumor size" | breast@2800@991 | S |
35937985 | "Lung metastasis resected, number unknown" | bone@2910@099 | S |
35936489 | Test not done (test not ordered and not performed) | stomach@2868@998 | S |
35941398 | "Positive sentinel nodes are documented, but the number is unspecified; For breast ONLY: SLN and RLND occurred during the same procedure" | 835@97 | S |
35926531 | No para-aortic nodes examined. | corpus_carcinoma@2930@000 | S |
35922647 | No mass/tumor found | melanoma_skin@2880@000 | S |
35928512 | No needle core biopsy/TURP performed | prostate@2862@998 | S |
35936543 | Renal parenchymal invasion not present/not identified | kidney_renal_pelvis@2890@000 | S |
35938496 | No involved regional nodes | tongue_base@2880@000 | S |
35919937 | "Described as ""less than 2 cm,"" or ""greater than 1 cm,"" or ""between 1 cm and 2 cm"";;Stated as T1 [NOS] or T1c [NOS] with no other information on tumor size" | breast@2800@992 | S |
35941854 | It is unknown whether sentinel nodes were examined; not applicable; not stated in patient record | 834@99 | S |
35936415 | No residual tumor identified on specimen | rectum@2930@990 | S |
35935470 | "Biopsy cores positive, number unknown" | prostate@2866@991 | S |
35939622 | No para-aortic lymph nodes examined | corpus_sarcoma@2920@098 | S |
35926715 | Test not done (test not ordered and not performed) | melanoma_skin@2920@998 | S |
35919854 | "No resection of primary site ;Surgical procedure did not remove enough tissue to measure the circumferential or radial resection margin;(Examples include: polypectomy only, endoscopic mucosal resection (EMR), excisional biopsy only, transanal disk excisi" | 3823@XX.7 | S |
35926455 | 98.0 or greater U/ml | pancreas_body_tail@2880@980 | S |
35928846 | "Margin IS involved with tumor;Circumferential resection margin (CRM) positive;Described as ""less than 1 millimeter (mm)""" | rectum@2930@000 | S |
35923874 | "Described as ""less than 3 cm"" or ""greater than 2 cm"" or ""between 2 cm and 3 cm""" | kidney_parenchyma@2861@993 | S |
35934036 | No surgical resection of primary site | kidney_renal_pelvis@2890@998 | S |
35939791 | "Margins clear, distance from tumor not stated;CRM negative, NOS" | colon@2930@991 | S |
35934919 | "CONVERTED AND CODE REUSED V0203 Prior to V0203 code defined as ""Test ordered, results not in chart"". Cases converted to code 997 with V0203 and code 998 redefined as ""Test not done (test not ordered and not performed)"". Test not done (test not ordered an" | net_small_intestine@2865@998 | S |
35938506 | All pelvic nodes examined negative. | corpus_carcinoma@2900@000 | S |
35921457 | All ipsilateral axillary nodes examined negative | breast@2900@000 | S |
35929268 | No mass/tumor found | breast@2800@000 | S |
35922738 | None | rectum@2910@000 | S |
35938569 | "CONVERTED AND CODE REUSED V0203;Prior to V0203 code defined as ""Test ordered, results not in chart"". Cases converted to code 997 with V0203 and code 998 redefined as ""Test not done (test not ordered and not performed)"".;;Test not done (test not ordered a" | net_small_intestine@2866@998 | S |
35934664 | No residual tumor identified on specimen | colon@2930@990 | S |
35925693 | "Regional lymph node(s) involved, size not stated;Unknown if regional lymph node(s) involved;Not documented in patient record" | kidney_parenchyma@2861@999 | S |
35937635 | "Described as ""greater than 5 cm"";;Stated as T3 with no other information on tumor size" | anus@2800@996 | S |
35925788 | None | colon@2910@000 | S |
35922350 | Test not done (test not ordered and not performed) | colon@2900@998 | S |
35934881 | 98.0 or greater U/ml | pancreas_other@2880@980 | S |
35919704 | Stated as 71-80% | 3914@R80 | S |
Well, there are flavors of null, where the information is not available, like 35933329 "No surgical specimen from primary site" or 35934159 "Test not done (test not ordered and not performed)". We don't need them. They add no information. Every day of my life that particular "Test" was not ordered or performed, thank God. Then we have the results of the TNM assessment, like 35940004 "Regional lymph node(s) involved, size not stated" or 35926018 "No regional lymph nodes involved". 35929615 "Pelvic lymph nodes surgically removed, but number of nodes unknown/not stated and not documented as a sampling or dissection" probably should be mapped to a surgery procedure, rather than to a value. And then there are those which are not flavors of null at all, like 35919025 "Stated as 91-100%" or 35937902 "98.0 or greater U/ml". We said we will take care of them later, and right now having them as Standard won't hurt.
No?
Right, and 35933329 "No surgical specimen from primary site" and 35934159 "Test not done (test not ordered and not performed)" are currently standard concepts. So based on our existing intentions, they should not be standard. My point is that correcting these and possibly many more could be a great curation burden. I am fine with sticking to our plan, but if so, we likely have a lot of cleanup to do.
How much?
@cgreich If I could automate it, it would not be a burden. I just noticed some that seemed wrong. The full set will likely need a second-pass curation.
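A keyword screen cannot replace that second pass, but it might narrow the list a curator has to read. The sketch below is a rough heuristic under stated assumptions: the patterns in `UNKNOWN_PATTERNS` and the function `flag_candidates` are hypothetical, and the sample rows are copied from the table above.

```python
import re

# Hypothetical keyword patterns suggesting a "flavor of unknown".
# These are guesses for illustration, not an agreed curation rule.
UNKNOWN_PATTERNS = [
    r"\btest not done\b",
    r"\bnot documented\b",
    r"\bunknown\b",
    r"\bnot stated\b",
    r"\bnot assessed\b",
    r"\bno (surgical|histologic) specimen\b",
]
_UNKNOWN_RE = re.compile("|".join(UNKNOWN_PATTERNS), re.IGNORECASE)


def flag_candidates(concepts):
    """Return standard concepts whose name matches an 'unknown' pattern.

    concepts: iterable of (concept_id, concept_name, standard_concept).
    The result is a list of candidates for manual review, not a verdict.
    """
    return [c for c in concepts
            if c[2] == "S" and _UNKNOWN_RE.search(c[1])]


# Sample rows taken from the table in this thread.
sample = [
    (35933329, "No surgical specimen from primary site", "S"),
    (35934159, "Test not done (test not ordered and not performed)", "S"),
    (35919025, "Stated as 91-100%", "S"),
    (35926018, "No regional lymph nodes involved", "S"),
    (35940004, "Regional lymph node(s) involved, size not stated;"
               "Unknown if regional lymph nodes involved;"
               "Not documented in patient record", "S"),
]
flagged_ids = [c[0] for c in flag_candidates(sample)]
```

On these five rows the screen flags 35933329, 35934159, and 35940004. The last one was described earlier in the thread as a TNM assessment result, which shows why the output is only a starting point for manual review.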
We decided that all flavors of unknown beyond Treatment concepts should be made non-standard. Treatment unknowns should be standard in the Observation domain.
Specific problems will need to be handled in separate issues.