cancerDHC / example-data

This repository is intended to act as a store of example data files from across the NCI Cancer Research Data Commons (CRDC) nodes in a number of formats.
MIT License

Add demographic fields to gdc-head-and-mouth data #43

Closed jooho-lee-kim closed 2 years ago

jooho-lee-kim commented 2 years ago

This PR updates the gdc-head-and-mouth.json file to include demographic information. The query statement in Head and Mouth Cancer Datasets.ipynb has been revised accordingly.

Closes #24

jooho-lee-kim commented 2 years ago

Some data fields have been updated in the PDC data.

The changes I've noticed in the downloaded PDC file (JSON format) are:

It would be better to update the PDC data as well, since the data in the repo was downloaded 7 months ago.

gaurav commented 2 years ago

> Some data fields have been updated in the PDC data.

Good catch! Apart from the examples you provided, I notice that some values have changed as well: for example, with case df4e907e-8f98-11ea-b1fd-0aad30af8a83, "days_to_last_follow_up" and "days_to_last_known_disease_status" both changed from 772 days to 1112.00 days, and "progression_or_recurrence" changed from "No" to "Yes". I guess PDC case information continues to be updated!
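For anyone who wants to check for this kind of drift themselves, here is a minimal sketch of a field-by-field comparison between an old and a new copy of a case record. It assumes each case is a flat dictionary; the actual PDC export may nest these fields, and the string representation of the values below is an assumption. `diff_record` is a hypothetical helper, not something in this repository.

```python
def diff_record(old, new):
    """Return {field: (old_value, new_value)} for every field whose value differs.

    Fields present in only one of the two records are reported with None
    on the side where they are missing.
    """
    changes = {}
    for field in sorted(set(old) | set(new)):
        before = old.get(field)
        after = new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes


# Illustrative records built from the changes described above.
# The flat layout and the value types are assumptions.
old_case = {
    "case_id": "df4e907e-8f98-11ea-b1fd-0aad30af8a83",
    "days_to_last_follow_up": "772",
    "days_to_last_known_disease_status": "772",
    "progression_or_recurrence": "No",
}
new_case = {
    "case_id": "df4e907e-8f98-11ea-b1fd-0aad30af8a83",
    "days_to_last_follow_up": "1112.00",
    "days_to_last_known_disease_status": "1112.00",
    "progression_or_recurrence": "Yes",
}

changes = diff_record(old_case, new_case)
for field, (before, after) in changes.items():
    print(f"{field}: {before!r} -> {after!r}")
```

Running something like this over every case (matched by ID) in the old and new downloads would list value changes alongside any added or removed fields.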

> It would be better to update the PDC data as well, since the data in the repo was downloaded 7 months ago.

I completely agree! I've updated that file in 36d4b85 (but feel free to regenerate or delete that if you notice anything wrong with this file). Otherwise, I think this PR is good to go!

gaurav commented 2 years ago