ncihtan / data-models

Schema.org Data Models for HTAN
MIT License

Implement precancer/clinical data updates and testing #306

Closed aclayton555 closed 2 months ago

aclayton555 commented 10 months ago

Working with Kristen Anton to start implementing data model changes (maybe in separate branch). Consider proposed updates presented here: https://docs.google.com/presentation/d/1yWnmNKhEe9uXDMzdLa77DDPyQqh-54Lkq2ywSCztgjA/edit?usp=sharing

aclayton555 commented 10 months ago

Pre-cancer atlases: BU, Duke, HMS, Stanford

aclayton555 commented 10 months ago

@aclayton555 set up time with Kristen to confirm approach and timelines for implementing changes in renewal, and how we want to implement these. Think about what we want to write into the renewal. Recommend that Kristen also connect with CRDC.

adamjtaylor commented 9 months ago

We met with Kristen on 2 Nov. Backlogged until we have concrete plans for data model implementation.

aclayton555 commented 9 months ago

Notes from discussion with Kristen here: https://docs.google.com/document/d/18Krq5x-5dGcPxC9X_B24S5JMv0iXILF5_qwpvpUk4_U/edit#bookmark=id.8n4luqvw09n4

aclayton555 commented 4 months ago

Also revisit updates of optional -> required fields, as outlined in item 2 of https://github.com/ncihtan/data-models/issues/355

kristenanton commented 3 months ago

Propose a new (simple) manifest called 'Clinical Data Update' that will allow DCC to capture two important bits of information about cases: vital status, and precancer designation/description.

  1. Capture of Vital Status. The manifest would contain the data element 'Vital Status' with permissible values 'Alive, Dead, unknown, Not Reported.' This data element is currently included in the Clinical Tier 1 Demographics manifest (of which we expect only one record per participant). If the vital status of the patient is 'Dead', we will require the following data: 'Days to Death, Cause of Death, Cause of Death Source.' These data elements are also currently included in the Clinical Tier 1 Demographics manifest. If the vital status is 'Alive, unknown, Not Reported' we will require a new data element: 'Days to Vital Status Reference' defined as 'Number of days between the date used for index and the reference date for designation of vital status.' The reference date for this update is important to researchers.

  2. Identification/Annotation of precancer cases. The manifest would contain the data element 'Precancer Case' with permissible values 'Yes, No, Not Reported, unknown.' This is a new data element (it does not exist in the current model). If 'Precancer Case' is 'Yes', we will require the following data: 'Days to Precancer Case Designation' (defined as 'Number of days between the date used for index and the reference date for designation of precancer status.') and 'Precancerous Condition Type.' 'Days to Precancer Case Designation' is a new data element. 'Precancerous Condition Type' is currently included in the Clinical Tier 1 Diagnosis manifest. If 'Precancer Case' does not equal 'Yes', no additional data is required. NOTE: We may want to allow the premalignant lesion to be coded. This may be done using the WHO Classification of Tumours, 5th edition (akin to ICD-O codes); however, this approach is more complicated than simply using the three data elements listed above.

This information is contained in the attached Excel spreadsheet: UpdateToClinicalData_Template_Proposed.xlsx
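The conditional requirements in the proposal above can be sketched as a small check. This is only an illustration, not how the HTAN data model or its validation tooling implements the rules: it assumes each case is a plain dict keyed by the data element names from the proposal, and the function name `missing_required_fields` is hypothetical.

```python
from typing import Dict, List

# Vital status values that, per the proposal, require a reference date offset.
NON_DEAD_STATUSES = {"Alive", "unknown", "Not Reported"}


def missing_required_fields(record: Dict[str, object]) -> List[str]:
    """Return the conditionally required data elements that are absent or blank."""
    required: List[str] = []

    vital_status = record.get("Vital Status")
    if vital_status == "Dead":
        required += ["Days to Death", "Cause of Death", "Cause of Death Source"]
    elif vital_status in NON_DEAD_STATUSES:
        required.append("Days to Vital Status Reference")

    # Only 'Yes' triggers the extra precancer elements; any other value needs nothing more.
    if record.get("Precancer Case") == "Yes":
        required += ["Days to Precancer Case Designation", "Precancerous Condition Type"]

    missing = []
    for field in required:
        value = record.get(field)
        # A value of 0 days is valid, so test for None/blank rather than truthiness.
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing
```

For example, a record with 'Vital Status' of 'Dead' and no other fields would report all three death-related elements as missing, while an 'Alive' record with 'Days to Vital Status Reference' of 0 would pass.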

kristenanton commented 3 months ago

Update to 'Precancerous Condition Type' permissible value list is logged in ticket #334

aclayton555 commented 3 months ago

I think the above comment was meant to be linked to: https://github.com/ncihtan/data-models/issues/392

kristenanton commented 2 months ago

@adamjtaylor I would suggest we not list specific WHO Tumour Classification v5 codes at this time (I am not sure I can find a comprehensive list - and the full list is by subscription only).
I like your idea of validating the format (which is identical to ICD-O-3 coding, e.g., 8010/2) to reduce errors. Thank you
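A format-only check along these lines could look like the sketch below. It assumes the code is four digits, a slash, and a single behaviour digit, as in the 8010/2 example; it deliberately does not check that the code exists in any classification, and the helper name `has_icdo3_format` is hypothetical.

```python
import re

# Four-digit morphology code, a slash, then a one-digit behaviour code (e.g. 8010/2).
MORPHOLOGY_CODE = re.compile(r"[0-9]{4}/[0-9]")


def has_icdo3_format(code: str) -> bool:
    """Check only the shape of the code, ignoring surrounding whitespace."""
    return MORPHOLOGY_CODE.fullmatch(code.strip()) is not None
```

A stricter variant could restrict the final digit to the behaviour codes in use, at the cost of needing an update if that set changes.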

adamjtaylor commented 2 months ago

@kristenanton

Regarding:

"If the vital status is 'Alive, unknown, Not Reported' we will require a new data element: 'Days to Vital Status Reference' defined as 'Number of days between the date used for index and the reference date for designation of vital status.' The reference date for this update is important to researchers."

I am not able to implement this for unknown or Not Reported because of namespace clashes in the data model that will lead to unexpected behaviour in other components using these values.

I am hoping to be able to implement it for Alive.