Open sourcesync opened 1 year ago
I found it in this paper (see the quote below, if anyone else is interested). You can close this issue.
METABRIC: The Molecular Taxonomy of Breast Cancer International Consortium
(METABRIC) is a clinical dataset which consists of gene expressions used to determine different subgroups of breast cancer. We consider the data for 1,904 patients
with each patient having 9 covariates - 4 gene indicators (MKI67, EGFR, PGR, and
ERBB2) and 5 clinical features (hormone treatment indicator, radiotherapy indicator,
chemotherapy indicator, ER-positive indicator, age at diagnosis). Furthermore, out
of the total 1,904 patients, 801 (42.06%) are right-censored, and the rest are deceased
(event). We obtained the DAG as depicted in Fig. 3 using a modified DAG-GNN
algorithm.
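
For anyone who wants to sanity-check the numbers in the quote, here is a minimal sketch in plain Python. The variable names are my own (hypothetical), and the snake_case labels for the clinical features are just shorthand for the descriptions in the quote; nothing here is taken from the package itself, and the quote does not say in which order the columns appear in the packaged dataset.

```python
# Covariates as listed in the quoted paper (the names below are illustrative labels).
gene_indicators = ["MKI67", "EGFR", "PGR", "ERBB2"]
clinical_features = [
    "hormone_treatment",   # hormone treatment indicator
    "radiotherapy",        # radiotherapy indicator
    "chemotherapy",        # chemotherapy indicator
    "er_positive",         # ER-positive indicator
    "age_at_diagnosis",    # age at diagnosis
]
covariates = gene_indicators + clinical_features

# Patient counts as stated in the quote.
n_patients = 1904
n_censored = 801

# Everyone who is not right-censored is counted as an observed event (deceased).
n_events = n_patients - n_censored
censored_fraction = n_censored / n_patients

print(f"covariates: {len(covariates)}")
print(f"events: {n_events}")
print(f"censored fraction: {censored_fraction:.4f}")
```

So the quoted description amounts to 4 + 5 = 9 covariates, with 1,904 - 801 = 1,103 observed events.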
Apologies for the newbie question. It seems that the original METABRIC dataset has many more "factors" than the 8 covariates in your dataset.
Which "factors" did you choose for the version of the dataset available in this package?
Thanks.