waldronlab / curatedTCGAData

Curated Data From The Cancer Genome Atlas (TCGA) as MultiAssayExperiment Objects
https://bioconductor.org/packages/curatedTCGAData

assays of SummarizedExperiments coming as DataFrame (why not matrix?) #31

Closed lgeistlinger closed 3 years ago

lgeistlinger commented 4 years ago
> ov.ctd <- curatedTCGAData(diseaseCode="OV", assays="mRNAArray*", dry.run=FALSE)
> ov.ctd
A MultiAssayExperiment object of 4 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 4: 
 [1] OV_mRNAArray_huex-20160128: SummarizedExperiment with 18632 rows and 575 columns 
 [2] OV_mRNAArray_TX_g4502a_1-20160128: SummarizedExperiment with 17814 rows and 546 columns 
 [3] OV_mRNAArray_TX_g4502a-20160128: SummarizedExperiment with 17814 rows and 31 columns 
 [4] OV_mRNAArray_TX_ht_hg_u133a-20160128: SummarizedExperiment with 12042 rows and 524 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
> class(assay(ov.ctd[[1]]))
[1] "DataFrame"
attr(,"package")
[1] "S4Vectors"
> assay(ov.ctd[[1]])[1:5,1:5]
DataFrame with 5 rows and 5 columns
         TCGA-04-1331-01A-01R-0435-03 TCGA-04-1332-01A-01R-0435-03
                            <numeric>                    <numeric>
C9orf152              5.1293974840562             4.03686128489381
ELMO2                 7.3984111313982             7.89992552704923
RPS11                10.4792357981549             10.6598180297217
CREB3L1              6.59814293845376             7.12543317829795
PNMA1                8.68896823329707             8.81371426777574
         TCGA-04-1335-01A-01R-0435-03 TCGA-04-1336-01A-01R-0435-03
                            <numeric>                    <numeric>
C9orf152             6.66831554256062             5.44171019721024
ELMO2                7.50944968310899             6.67400849392619
RPS11                10.2811624348218             10.4212441826243
CREB3L1              6.79374925015387             5.62442668141407
PNMA1                 8.0662892719799             8.59987516862962
         TCGA-04-1337-01A-01R-0435-03
                            <numeric>
C9orf152             3.48762601212778
ELMO2                 8.6307994685934
RPS11                10.2659952571456
CREB3L1              7.11216823651922
PNMA1                8.37578982646659

I would expect those as a numeric matrix. What's the rationale behind providing them as a DataFrame?

LiNk-NY commented 3 years ago

Resolved in data release version 2.0.0 (commit 5da7bea5d6e2b316f50f8ba957d0dc98e0c596c8).
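
For anyone still working with a data release older than 2.0.0, one possible workaround is to coerce the assay to a matrix yourself. The sketch below is only an illustration: the toy `df`, its gene and sample names, and its values are made up and stand in for `assay(ov.ctd[[1]])`, and it assumes all columns are numeric, as in the output shown above.

```r
library(S4Vectors)

# Toy stand-in for a DataFrame-valued assay (hypothetical names and values)
df <- DataFrame(
    sampleA = c(5.1, 7.4, 10.5),
    sampleB = c(4.0, 7.9, 10.7),
    row.names = c("geneA", "geneB", "geneC")
)

# Coerce to a base matrix; row and column names carry over
m <- as.matrix(df)
class(m)
dim(m)

# With the object from this issue, the equivalent call would be:
# m <- as.matrix(assay(ov.ctd[[1]]))
```

If any column were non-numeric, the coercion would yield a character matrix instead, so checking `is.numeric(m)` afterwards is a cheap safeguard.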