Closed hermidalc closed 3 months ago
Hi, there are only 25 cancer types with WGS data available in TCGA.
If you go to GDC, open the Cohort Builder web tool, and build a query with:
PROGRAM <- TCGA
EXPERIMENTAL STRATEGY <- WGS
DATA TYPE <- Aligned Reads
This results in 8,840 cases out of 11,428 total TCGA cases. Switch to "Table View" and download the case TSV for these 8,840 cases.
Count the unique cases per project. There are 33 cancers with WGS data:
> cases_df <- read.delim("cases.tsv")
> data.frame(table(cases_df$project.project_id))
        Var1 Freq
1   TCGA-ACC   74
2  TCGA-BLCA  411
3  TCGA-BRCA  952
4  TCGA-CESC  271
5  TCGA-CHOL   49
6  TCGA-COAD  371
7  TCGA-DLBC   42
8  TCGA-ESCA  118
9   TCGA-GBM  347
10 TCGA-HNSC  482
11 TCGA-KICH   86
12 TCGA-KIRC  124
13 TCGA-KIRP  216
14 TCGA-LAML   50
15  TCGA-LGG  461
16 TCGA-LIHC  324
17 TCGA-LUAD  464
18 TCGA-LUSC  337
19 TCGA-MESO   73
20   TCGA-OV  363
21 TCGA-PAAD  173
22 TCGA-PCPG  165
23 TCGA-PRAD  414
24 TCGA-READ  143
25 TCGA-SARC  223
26 TCGA-SKCM  223
27 TCGA-STAD  436
28 TCGA-TGCT  253
29 TCGA-THCA  477
30 TCGA-THYM  111
31 TCGA-UCEC  482
32  TCGA-UCS   50
33  TCGA-UVM   75
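For anyone repeating this check, here is a minimal sketch of the same per-project count that also guards against a case appearing on more than one row. The helper name `count_cases_per_project` and the `case_id` column are assumptions for illustration and are not taken from the GDC export; only `project.project_id` comes from the commands above. The toy data frame stands in for `cases.tsv` and its values are made up.

```r
# Hypothetical helper: count unique cases per project in a GDC case table.
# `case_col` is an assumed column name; `project_col` matches the column used above.
count_cases_per_project <- function(cases_df,
                                    project_col = "project.project_id",
                                    case_col = "case_id") {
  # Drop repeated (project, case) rows so each case is counted once.
  unique_cases <- unique(cases_df[, c(project_col, case_col)])
  counts <- as.data.frame(table(unique_cases[[project_col]]),
                          stringsAsFactors = FALSE)
  names(counts) <- c("project", "n_cases")
  counts
}

# Toy data standing in for cases.tsv (values are made up for illustration).
toy_cases <- data.frame(
  project.project_id = c("TCGA-ACC", "TCGA-ACC", "TCGA-ACC", "TCGA-UVM"),
  case_id = c("case-1", "case-1", "case-2", "case-3"),
  stringsAsFactors = FALSE
)

toy_counts <- count_cases_per_project(toy_cases)
print(toy_counts)
cat("projects:", nrow(toy_counts), "cases:", sum(toy_counts$n_cases), "\n")
```

With the real export, something like `count_cases_per_project(read.delim("cases.tsv"))` should work once `case_col` is set to whatever the case identifier column is actually called in the file. The number of rows returned is the number of projects, and the sum of `n_cases` should match the case total shown in the Cohort Builder.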
Thank you again @yge15 for your help answering questions, I very much appreciate it. Posting for others wondering the same thing: as you said, TCGA added many WGS samples in Dec 2023 and Mar 2024.
Hi - given this is to be a comprehensive analysis of TCGA microbial abundances, why are there only 25 cancers analyzed and not all 33 cancers?