bhklab / PharmacoGx

R package to analyze large-scale pharmacogenomic datasets.
http://pharmacodb.pmgenomics.ca
GNU General Public License v3.0

verification error from checkPsetStructure(pSet) #142

Open gabora opened 1 year ago

gabora commented 1 year ago

Hi

I get errors during the verification of my PSet object, which I am not sure how to resolve. Please find below a dummy example that reproduces the error.

Note that my install already incorporates my pending PRs #140 and #137, as well as bhklab/CoreGX#168.

Thanks for the help. Best, Attila

library(dplyr)

# setup some dummy data
sampleid = c("ASPC_DMSO_5_24...72",
             "ASPC_DMSO_0_24...73",
             "ASPC_REGORAFENIB_0_24...74",
             "ASPC_REGORAFENIB_0_24...75",
             "ASPC_REGORAFENIB_0.345_24...76")
entrezid = c("100287102", "102466751", "100302278", "645520", "79501" )
treatmentid = c("DMSO","DMSO", "REGORAFENIB","REGORAFENIB","REGORAFENIB")
cellid = rep("ASPC",5)

rna <- round(runif(25,0,1000)) %>% matrix(nrow= 5)
colnames(rna) <- sampleid
rownames(rna) <- entrezid

columndata = data.frame(sampleid = sampleid,
                        cell_line = rep("ASPC",5),
                        treatmentid = treatmentid,
                        dose = c(0, 0, 10, 100, 1000))
metadata = data.frame(sampleid = sampleid, geo_accession = rep("none",5))

SE <- SummarizedExperiment::SummarizedExperiment(assays = list(exprs = rna),
                                                 colData = columndata, metadata = metadata)

## Create sample 
sample = columndata
rownames(sample) = columndata$sampleid

## Create treatment 

treatment = data.frame(sampleid = sampleid, treatmentid  = treatmentid) 

## PSet
pset = PharmacoGx::PharmacoSet(name = "Toy",
                               molecularProfiles = list(exprs = SE),
                               sample = sample,
                               treatment = treatment,
                               #curationSample = curationSample,
                               #curationTissue = curationTissue,
                               #perturbationN = treatment,
                               #curationTreatment = curationTreatment,
                               datasetType = "perturbation",
                               verify = TRUE
)

results in

Error in if (metadata(profile)$annotation == "rna" || metadata(profile)$annotation == : missing value where TRUE/FALSE needed

I guess the issue is that the metadata should have a column called annotation, but adding one also leads to an error:

metadata$annotation = "rna"
SE <- SummarizedExperiment::SummarizedExperiment(assays = list(exprs = rna),
                                                 colData = columndata, metadata = metadata)
pset = PharmacoGx::PharmacoSet(name = "Toy",
                               molecularProfiles = list(exprs = SE),
                               sample = sample,
                               treatment = treatment,
                               #curationSample = curationSample,
                               #curationTissue = curationTissue,
                               perturbationN = treatment,
                               #curationTreatment = curationTreatment,
                               datasetType = "perturbation",
                               verify = TRUE
)

Error in if (S4Vectors::metadata(profile)$annotation == "rna" | S4Vectors::metadata(profile)$annotation == : the condition has length > 1
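One possible reading of the two errors, sketched in base R only. It assumes (this is not confirmed from the PharmacoGx source) that the check compares `metadata(profile)$annotation` against a single string, and that a data.frame passed as `metadata` ends up stored as a plain list of its columns. The names `md1`, `md2`, `md3` and `cmp_*` are illustrative and not part of any package.

```r
sampleid <- paste0("sample", 1:5)

# Case 1: data.frame metadata without an `annotation` column.
md1 <- as.list(data.frame(sampleid = sampleid, geo_accession = rep("none", 5)))
ann_missing <- md1$annotation          # NULL: no such element exists
cmp_missing <- ann_missing == "rna"    # zero-length logical, not TRUE/FALSE

# Case 2: `annotation` added as a data.frame column is recycled to one value per row.
md2_df <- data.frame(sampleid = sampleid, geo_accession = rep("none", 5))
md2_df$annotation <- "rna"
md2 <- as.list(md2_df)
cmp_column <- md2$annotation == "rna"  # one comparison result per row

# Case 3: metadata as a plain list holding a single string.
md3 <- list(annotation = "rna")
cmp_scalar <- md3$annotation == "rna"  # a single TRUE, usable inside if()

# An if() condition needs exactly one non-missing logical value.
lengths_seen <- c(missing = length(cmp_missing),
                  column  = length(cmp_column),
                  scalar  = length(cmp_scalar))
print(lengths_seen)
```

If this reading is right, only the third form gives `if()` a length-one condition, so building the SummarizedExperiment with `metadata = list(annotation = "rna")` may be worth trying. Whether that satisfies the rest of checkPsetStructure() is untested here.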

# Output of sessionInfo()

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_1.1.0

loaded via a namespace (and not attached):
[1] lsa_0.73.3 bitops_1.0-7 matrixStats_0.63.0 RColorBrewer_1.1-3 GenomeInfoDb_1.34.9 BumpyMatrix_1.6.0
[7] SnowballC_0.7.0 tools_4.2.0 backports_1.4.1 utf8_1.2.3 R6_2.5.1 DT_0.27
[13] sm_2.2-5.7.1 KernSmooth_2.23-20 BiocGenerics_0.44.0 colorspace_2.1-0 tidyselect_1.2.0 compiler_4.2.0
[19] cli_3.6.0 Biobase_2.58.0 shinyjs_2.1.0 DelayedArray_0.24.0 slam_0.1-50 caTools_1.18.2
[25] scales_1.2.1 bench_1.1.2 checkmate_2.1.0 relations_0.6-12 stringr_1.5.0 digest_0.6.31
[31] CoreGx_2.3.2 XVector_0.38.0 pkgconfig_2.0.3 htmltools_0.5.4 PharmacoGx_3.3.2 plotrix_3.8-2
[37] MatrixGenerics_1.10.0 fastmap_1.1.1 limma_3.54.1 maps_3.4.1 htmlwidgets_1.6.1 rlang_1.0.6
[43] rstudioapi_0.14 shiny_1.7.4 visNetwork_2.1.2 generics_0.1.3 jsonlite_1.8.4 BiocParallel_1.32.5
[49] gtools_3.9.4 RCurl_1.98-1.10 magrittr_2.0.3 GenomeInfoDbData_1.2.9 Matrix_1.5-3 Rcpp_1.0.10
[55] celestial_1.4.6 munsell_0.5.0 S4Vectors_0.36.1 fansi_1.0.4 lifecycle_1.0.3 stringi_1.7.12
[61] piano_2.14.0 MASS_7.3-58.1 SummarizedExperiment_1.28.0 zlibbioc_1.44.0 plyr_1.8.8 gplots_3.1.3
[67] grid_4.2.0 parallel_4.2.0 promises_1.2.0.1 shinydashboard_0.7.2 crayon_1.5.2 lattice_0.20-45
[73] cowplot_1.1.1 mapproj_1.2.11 pillar_1.8.1 fgsea_1.24.0 igraph_1.4.1 GenomicRanges_1.50.2
[79] boot_1.3-28.1 reshape2_1.4.4 codetools_0.2-18 marray_1.76.0 stats4_4.2.0 fastmatch_1.1-3
[85] NISTunits_1.0.1 glue_1.6.2 downloader_0.4 data.table_1.14.8 MultiAssayExperiment_1.24.0 vctrs_0.5.2
[91] httpuv_1.6.9 gtable_0.3.1 RANN_2.6.1 ggplot2_3.4.1 mime_0.12 xtable_1.8-4
[97] coop_0.6-3 pracma_2.4.2 later_1.3.0 tibble_3.1.8 IRanges_2.32.0 sets_1.0-22
[103] cluster_2.1.4 ellipsis_0.3.2 magicaxis_2.2.14