cancerDHC / example-data

This repository is intended to act as a store of example data files from across the NCI Cancer Research Data Commons (CRDC) nodes in a number of formats.
MIT License

Add support for extracting the full sample/portion/analyte hierarchy from GDC #6

Open gaurav opened 3 years ago

gaurav commented 3 years ago

We currently use a single GDC query that returns information only about the case itself, along with lists of identifiers for its samples, portions, analytes, aliquots and slides. Retrieving the data associated with those identifiers requires additional queries. We should include this information in the downloaded data and demonstrate how to use it (or, more usefully, build a transformation library that retrieves all of this GDC data and exports it as CRDC-H instance data).
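One possible shape for this is sketched below. It is only a sketch under stated assumptions, not the approach this repository uses: it assumes the public GDC `cases` endpoint accepts an `expand` parameter listing the nested entity groups, that filtering on `submitter_id` selects the case, and that results arrive under `data.hits`. The names `build_case_query`, `fetch_case` and `walk_hierarchy` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Iterator, Tuple

# Assumed endpoint of the public GDC API.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"

# Nested entity groups to expand (assumed to be valid `expand` values).
EXPAND = [
    "samples",
    "samples.portions",
    "samples.portions.slides",
    "samples.portions.analytes",
    "samples.portions.analytes.aliquots",
]

# Keys under which child entities are expected in the returned JSON.
CHILD_KEYS = ("samples", "portions", "analytes", "aliquots", "slides")


def build_case_query(submitter_id: str) -> Dict[str, str]:
    """Build query parameters asking for one case with its full hierarchy."""
    filters = {
        "op": "in",
        "content": {"field": "submitter_id", "value": [submitter_id]},
    }
    return {
        "filters": json.dumps(filters),
        "expand": ",".join(EXPAND),
        "format": "json",
    }


def fetch_case(submitter_id: str) -> Dict[str, Any]:
    """Fetch a single case (network call; assumes hits are under data.hits)."""
    url = GDC_CASES_ENDPOINT + "?" + urllib.parse.urlencode(
        build_case_query(submitter_id)
    )
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["data"]["hits"][0]


def walk_hierarchy(
    entity: Dict[str, Any], kind: str = "case", path: Tuple[str, ...] = ()
) -> Iterator[Tuple[str, Tuple[str, ...]]]:
    """Yield (kind, path of submitter IDs) for an entity and all descendants."""
    here = path + (entity.get("submitter_id", "?"),)
    yield kind, here
    for key in CHILD_KEYS:
        for child in entity.get(key, []):
            # Drop the plural "s" to name the child kind, e.g. "samples" -> "sample".
            yield from walk_hierarchy(child, key[:-1], here)
```

Calling `walk_hierarchy(fetch_case("TCGA-CV-7261"))` would then list every sample, portion, analyte, aliquot and slide together with the chain of parents above it, which is the information a CRDC-H export would need to preserve.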

As an example, GDC case TCGA-HNSC / TCGA-CV-7261 reports the following items in the data we have obtained from their service: