Open RobertJCarroll opened 2 years ago
This query grabs the relevant files: https://kf-api-fhir-service.kidsfirstdrc.org/DocumentReference?type:text=Gene%20Expression&security-label=U
The lack of a vocabulary means it might not be capturing everything, though. There are some Gene Expression Quantification results also, but they look to be restricted access only.
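For anyone wanting to reproduce the query above from a script, here is a minimal sketch. It assumes the server returns standard FHIR search Bundles with a `next` link for paging. The `headers` argument is a hypothetical hook for whatever authentication the server needs, and the helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import Request, urlopen

BASE = "https://kf-api-fhir-service.kidsfirstdrc.org"


def build_query(base=BASE):
    """Rebuild the DocumentReference search URL quoted in this issue."""
    params = {"type:text": "Gene Expression", "security-label": "U"}
    # safe=":" keeps the ":text" modifier readable; spaces become %20.
    return f"{base}/DocumentReference?" + urlencode(params, quote_via=quote, safe=":")


def fetch_json(url, headers=None):
    """Fetch one page as JSON. `headers` is a hypothetical auth hook."""
    req = Request(url, headers={"Accept": "application/fhir+json", **(headers or {})})
    with urlopen(req) as resp:
        return json.load(resp)


def iter_resources(url, fetch=fetch_json):
    """Yield every resource in a search result, following Bundle 'next' links."""
    while url:
        bundle = fetch(url)
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # A Bundle without a 'next' link is the last page.
        url = next(
            (link["url"] for link in bundle.get("link", []) if link.get("relation") == "next"),
            None,
        )
```

Passing `fetch` in as a parameter means the paging logic can be tried against canned Bundles without touching the server.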
Here is the breakdown by study of the above resources:
- _tag=SD_8Y99QZJJ (PBTA-PNOC): 64
- _tag=SD_DYPMEHHF (KF-NBL): 672

For Kids First study PBTA-PNOC (ResearchStudy/48656, SD_8Y99QZJJ, Pediatric Brain Tumor Atlas: PNOC):
For a single example patient, Patient/48592, there are 61 files.
Accessible file count by type: {'tbi': 5, 'maf': 5, 'vcf': 5}
Inaccessible file count by type: {'tbi': 12, 'vcf': 11, 'maf': 10, 'bam': 7, 'cram': 2, 'crai': 2, 'bai': 1, 'gvcf': 1}
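A per-study breakdown like the one above can also be tallied locally once the DocumentReferences have been downloaded. This is only a sketch: it assumes the study ID appears as a code in each resource's `meta.tag`, which is what the `_tag` search parameter matches on. The function name is hypothetical.

```python
from collections import Counter

# Study IDs and short names as listed in this issue.
STUDIES = {"SD_8Y99QZJJ": "PBTA-PNOC", "SD_DYPMEHHF": "KF-NBL"}


def count_by_study(resources, studies=STUDIES):
    """Count resources per study ID, reading the codes in meta.tag (assumed layout)."""
    counts = Counter()
    for res in resources:
        codes = {tag.get("code") for tag in res.get("meta", {}).get("tag", [])}
        # Only count tags that are known study IDs.
        for study_id in studies.keys() & codes:
            counts[study_id] += 1
    return counts
```

An alternative would be to ask the server for the totals directly by adding `_tag=<study id>` and `_summary=count` to the search and reading `Bundle.total`.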
I believe the `rsem.genes.results.gz` files are the ones we need for this.
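If the `rsem.genes.results.gz` files do turn out to be the right ones, a filter along these lines could pick them out of the downloaded DocumentReferences. It assumes the file name is visible at the end of `content[].attachment.title` or `content[].attachment.url`. I have not confirmed which field the server populates, so the sketch checks both.

```python
SUFFIX = "rsem.genes.results.gz"


def attachment_names(doc_ref):
    """Yield candidate file names from a DocumentReference's attachments."""
    for content in doc_ref.get("content", []):
        attachment = content.get("attachment", {})
        # Assumption: the file name ends either the title or the url.
        for field in ("title", "url"):
            value = attachment.get(field)
            if value:
                yield value


def is_gene_expression_summary(doc_ref, suffix=SUFFIX):
    """True if any attachment name ends with the RSEM gene results suffix."""
    return any(name.endswith(suffix) for name in attachment_names(doc_ref))
```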
Save JSON containing all DocumentReferences for KF Gene Expression Summary files into this bucket: https://console.cloud.google.com/storage/browser/fc-be286b9f-3acf-4168-af6e-592df509391d/DocumentReference
gs://fc-be286b9f-3acf-4168-af6e-592df509391d/DocumentReference
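A possible way to produce that JSON is to write everything to one local file first and then copy it into the bucket. The file name `DocumentReference.json` and the function name are placeholders, not something agreed on in this issue.

```python
import json
from pathlib import Path


def save_document_references(doc_refs, out_path):
    """Write all DocumentReferences to a single JSON file and return its path."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # One JSON array holding every resource, indented for readability.
    path.write_text(json.dumps(list(doc_refs), indent=2))
    return path
```

The resulting file could then be uploaded with something like `gsutil cp DocumentReference.json gs://fc-be286b9f-3acf-4168-af6e-592df509391d/DocumentReference/`.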