Closed: alisman closed this 1 month ago
There's also a discrepancy in the NA counts for most of the bar charts.
Legacy:
CH:
Looks like this happens only when we combine studies.
For example, OS_MONTHS_INIT_DIAGNOSIS is only in mbc_genie_2020, and for that study the NA count is zero in both the legacy and the CH implementation. (https://genie-public-beta.cbioportal.org/study/summary?id=mbc_genie_2020 vs https://genie-public-beta.cbioportal.org/study/summary?id=mbc_genie_2020?legacy=1)
I think for combined studies we ignore samples from other studies when counting NAs, because they don't have OS_MONTHS_INIT_DIAGNOSIS clinical data. It looks like we only take mbc_genie_2020 samples into account and ignore the rest.
The legacy implementation, on the other hand, takes all samples from all studies into account when counting NAs.
https://genie-public-beta.cbioportal.org/study/summary?id=brca_akt1_genie_2019%2Claml_tcga_pan_can_atlas_2018%2Cacc_tcga_pan_can_atlas_2018%2Cblca_tcga_pan_can_atlas_2018%2Clgg_tcga_pan_can_atlas_2018%2Cbrca_tcga_pan_can_atlas_2018%2Ccesc_tcga_pan_can_atlas_2018%2Cchol_tcga_pan_can_atlas_2018%2Ccoadread_tcga_pan_can_atlas_2018%2Cglioma_dfci_2020%2Cdlbc_tcga_pan_can_atlas_2018%2Cerbb2_genie_public%2Cesca_tcga_pan_can_atlas_2018%2Ccrc_public_genie_bpc%2Cnsclc_public_genie_bpc%2Cgenie_public%2Cgbm_tcga_pan_can_atlas_2018%2Chnsc_tcga_pan_can_atlas_2018%2Ckich_tcga_pan_can_atlas_2018%2Ckirc_tcga_pan_can_atlas_2018%2Ckirp_tcga_pan_can_atlas_2018%2Clihc_tcga_pan_can_atlas_2018%2Cluad_tcga_pan_can_atlas_2018%2Clusc_tcga_pan_can_atlas_2018%2Cmeso_tcga_pan_can_atlas_2018%2Cmbc_genie_2020%2Cov_tcga_pan_can_atlas_2018%2Cpaad_tcga_pan_can_atlas_2018%2Cpcpg_tcga_pan_can_atlas_2018%2Cprad_tcga_pan_can_atlas_2018%2Csarc_tcga_pan_can_atlas_2018%2Cskcm_tcga_pan_can_atlas_2018%2Cstad_tcga_pan_can_atlas_2018%2Ctgct_tcga_pan_can_atlas_2018%2Cthym_tcga_pan_can_atlas_2018%2Cthca_tcga_pan_can_atlas_2018%2Cucs_tcga_pan_can_atlas_2018%2Cucec_tcga_pan_can_atlas_2018%2Cuvm_tcga_pan_can_atlas_2018
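To make the suspected difference concrete, here is a minimal Python sketch of the two counting strategies described above. It is not the actual legacy or CH code: the function names, the sample IDs, and the attribute values are all hypothetical, and "studies that have the attribute" is approximated as "studies with at least one recorded value".

```python
from typing import Dict, List, Tuple

# A sample is identified by (study_id, sample_id). Sample IDs are made up.
SampleKey = Tuple[str, str]


def na_count_all_samples(
    samples: List[SampleKey], values: Dict[SampleKey, float]
) -> int:
    """Legacy-style count: every selected sample without a value is NA."""
    return sum(1 for sample in samples if sample not in values)


def na_count_attribute_studies_only(
    samples: List[SampleKey], values: Dict[SampleKey, float]
) -> int:
    """Suspected CH behavior: samples from studies that have no data for
    the attribute at all are ignored instead of being counted as NA."""
    studies_with_attribute = {study_id for study_id, _ in values}
    return sum(
        1
        for sample in samples
        if sample[0] in studies_with_attribute and sample not in values
    )


# Hypothetical combined-study selection: two samples from mbc_genie_2020
# and two from a study without OS_MONTHS_INIT_DIAGNOSIS.
samples = [
    ("mbc_genie_2020", "S1"),
    ("mbc_genie_2020", "S2"),
    ("brca_tcga_pan_can_atlas_2018", "T1"),
    ("brca_tcga_pan_can_atlas_2018", "T2"),
]

# Hypothetical OS_MONTHS_INIT_DIAGNOSIS values, present only for mbc_genie_2020.
values = {
    ("mbc_genie_2020", "S1"): 12.0,
    ("mbc_genie_2020", "S2"): 30.5,
}

legacy_na = na_count_all_samples(samples, values)
ch_na = na_count_attribute_studies_only(samples, values)
print(f"legacy-style NA count: {legacy_na}, suspected CH NA count: {ch_na}")
```

With only mbc_genie_2020 selected, both functions return the same count, which matches the single-study observation; the counts only diverge once samples from studies lacking the attribute are added to the selection.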