ncihtan / data-models

Schema.org Data Models for HTAN

Morphology metadata and codes #334

Open aclayton555 opened 8 months ago

aclayton555 commented 8 months ago

This ticket originates from CDS observations about how morphology metadata maps between the CDS metadata template and the HTAN Data Model.

CDS Metadata Template:

CDS morphology: The coded result of analyzing the microscopic anatomy of normal and abnormal cells and tissues of the specimen by examining a thin slice (section) under a light (optical) or electron microscope. The code represents the histology of the disease using the third edition of the International Classification of Diseases for Oncology.

HTAN Data Model:

HTAN Morphology: The third edition of the International Classification of Diseases for Oncology, published in 2000 used principally in tumor and cancer registries for coding the site (topography) and the histology (morphology) of neoplasms. The study of the structure of the cells and their arrangement to constitute tissues and, finally, the association among these to form organs. In pathology, the microscopic process of identifying normal and abnormal morphologic characteristics in tissues, by employing various cytochemical and immunocytochemical stains. A system of numbered categories for representation of data.

HTAN Histologic Morphology Code: Histologic Morphology Code, based on ICD-O-3. Any valid ICD-O-3 morphology code. See https://seer.cancer.gov/icd-o-3/ and https://seer.cancer.gov/icd-o-3/sitetype.icdo3.20200629.xlsx. Examples: 80510

Upon reviewing HTAN metadata, CDS notes: both coded values and string values were received for both HTAN attributes (e.g. Adenocarcinoma, NOS and 8380/3 were each found in both Morphology and Histologic Morphology Code).
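To gauge how widespread the mixing is, the submitted values could be sorted into codes and free-text labels. The following is a minimal sketch, not an existing HTAN or CDS tool: the function name and the pattern are assumptions, and the pattern deliberately accepts both the slash form (`8380/3`) and the slash-less form shown in the data model's example (`80510`).

```python
import re

# Assumed pattern: four histology digits, an optional slash, one behavior digit.
# Accepts "8380/3" as well as the slash-less "80510" style.
CODE_PATTERN = re.compile(r"\d{4}/?\d")


def classify_morphology_value(value: str) -> str:
    """Return 'code' if the value looks like an ICD-O-3 morphology code, else 'label'."""
    return "code" if CODE_PATTERN.fullmatch(value.strip()) else "label"


# Example values taken from the observations above.
values = ["Adenocarcinoma, NOS", "8380/3", "80510"]
kinds = [classify_morphology_value(v) for v in values]
```

Running this over each attribute column separately would show how many entries in Morphology are actually codes and how many entries in Histologic Morphology Code are actually labels.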

This is a known issue acknowledged and previously discussed by the HTAN DCC. There are no plans to fix this within the scope of HTAN 1.0, but it is something that should be considered in the overall data model strategy for the next phase of this project.

One opportunity is to request a feature from FAIR Data that improves the user experience of navigating and inputting morphology-related valid values. Ideally, valid values would be offered in human-readable form (e.g. Comedo, Clinging), and selecting one would populate the corresponding code. Alternatively, the valid values could be structured as "human-readable description (code)." We should also consider what level of validation could be built into this.
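Both options can be sketched roughly as below. This is only an illustration under stated assumptions: the two-entry lookup table is a hypothetical excerpt that would in practice be loaded from the ICD-O-3 list and verified against it, and the function names are not part of any FAIR Data or HTAN API.

```python
import re

# Hypothetical excerpt; a real table would be loaded from the ICD-O-3 list
# and every label/code pair verified against it.
LABEL_TO_CODE = {
    "Adenocarcinoma, NOS": "8140/3",
    "Endometrioid adenocarcinoma, NOS": "8380/3",
}

# Assumed combined format: "human-readable description (code)".
COMBINED = re.compile(r"(?P<label>.+) \((?P<code>\d{4}/\d)\)")


def code_for_label(label: str) -> str:
    """Option 1: selecting a label populates the corresponding code."""
    return LABEL_TO_CODE[label]


def combined_value(label: str) -> str:
    """Option 2: build a single 'description (code)' valid value."""
    return f"{label} ({LABEL_TO_CODE[label]})"


def validate_combined(value: str) -> bool:
    """Check both the format and that the label and code belong together."""
    match = COMBINED.fullmatch(value)
    if match is None:
        return False
    return LABEL_TO_CODE.get(match.group("label")) == match.group("code")
```

With a table like this, validation can go beyond format checking: a well-formed value whose label and code do not belong together is rejected as well.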

For now, putting this in the backlog. Consider it for an upcoming sprint and escalate a feature request accordingly.