Closed lovenicole closed 3 years ago
Hi @lovenicole,
Could you post your sessionInfo()
here as well so I know what your R environment looks like. I will look into your issue as soon as I receive it.
Best, Chris
I really appreciate your reply. here's my session information.
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] PharmacoGx_2.0.9 CoreGx_1.0.2
loaded via a namespace (and not attached):
[1] lsa_0.73.2 bitops_1.0-6 matrixStats_0.57.0 RColorBrewer_1.1-2 GenomeInfoDb_1.24.2
[6] SnowballC_0.7.0 tools_4.0.3 R6_2.5.0 DT_0.16 KernSmooth_2.23-18
[11] sm_2.2-5.6 BiocGenerics_0.34.0 colorspace_1.4-1 tidyselect_1.1.0 gridExtra_2.3
[16] compiler_4.0.3 Biobase_2.48.0 shinyjs_2.0.0 DelayedArray_0.14.1 slam_0.1-47
[21] caTools_1.18.0 scales_1.1.1 relations_0.6-9 stringr_1.4.0 digest_0.6.27
[26] XVector_0.28.0 pkgconfig_2.0.3 htmltools_0.5.0 plotrix_3.7-8 fastmap_1.0.1
[31] limma_3.44.3 maps_3.3.0 htmlwidgets_1.5.2 rlang_0.4.8 rstudioapi_0.11
[36] shiny_1.5.0 visNetwork_2.0.9 generics_0.0.2 jsonlite_1.7.1 BiocParallel_1.22.0
[41] gtools_3.8.2 dplyr_1.0.2 RCurl_1.98-1.2 magrittr_1.5 GenomeInfoDbData_1.2.3
[46] Matrix_1.2-18 Rcpp_1.0.5 celestial_1.4.6 munsell_0.5.0 S4Vectors_0.26.1
[51] lifecycle_0.2.0 piano_2.4.0 stringi_1.5.3 MASS_7.3-53 SummarizedExperiment_1.18.2
[56] zlibbioc_1.34.0 gplots_3.1.0 plyr_1.8.6 grid_4.0.3 parallel_4.0.3
[61] promises_1.1.1 shinydashboard_0.7.1 crayon_1.3.4 lattice_0.20-41 mapproj_1.2.7
[66] pillar_1.4.6 fgsea_1.14.0 tcltk_4.0.3 igraph_1.2.6 GenomicRanges_1.40.0
[71] reshape2_1.4.4 marray_1.66.0 stats4_4.0.3 fastmatch_1.1-0 NISTunits_1.0.1
[76] glue_1.4.2 downloader_0.4 data.table_1.13.2 BiocManager_1.30.10 vctrs_0.3.4
[81] httpuv_1.5.4 testthat_2.3.2 gtable_0.3.0 RANN_2.6.1 purrr_0.3.4
[86] ggplot2_3.3.2 mime_0.9 xtable_1.8-4 pracma_2.2.9 later_1.1.0.1
[91] tibble_3.0.4 IRanges_2.22.2 sets_1.0-18 cluster_2.1.0 ellipsis_0.3.1
[96] magicaxis_2.0.10
Actually, I solved this problem. Maybe it was because i needed to add more details to the molecularProfiles
slot like this, `molecularProfiles = list(rna = phase_I_eset_cp_2)'.
so i'm doing the next step. but i'm faced with another problem when using drugPerturbationSig
function.
When I run this function like this drug.perturbation <- drugPerturbationSig(CMAP_2020_cp, mDataType = "rna", verbose = F)
, then the error come out, saying
Error in names(ubatch) <- paste("batch", ubatch, sep = "") : 'names' attribute [1] must be the same length as the vector [0]
Two things I'm suspicious about are the drug
slot when creating pharmacoset
and the batchid
setting.
First, the pdf file I attached before says that the drugid
should be left as NA
for controls and the rownames
of the drug
slot should be the same as the drugid
. I'm confused about it because there are so many controls in my data so I can't set the controls' rownames
as NA
in the drug slot.
Second, the batchid
thing. My data is already the batch-effect-corrected data and doesn't have any batchid
. So I just fill in the batchid
column with NA
, following the instruction of the pdf file I attached before. Is this right? or do I rather have to fill this with "perturbation"
and "control"
(<- it's not working, either :( )
Could you help me with this problem? Do I have to open another issue about it?
Yes I can help you, no worries.
I think instead of NA you should set all batch ids to 1. This should resolve the error, if not please comment here.
Also, the vignette you shared seems to be out of date. Please see the most recent on Bioconductor at: https://www.bioconductor.org/packages/release/bioc/vignettes/PharmacoGx/inst/doc/CreatingPharmacoSet.pdf.
It doesn't work, although I set all batch ids to 1. Do you have any other solutions?
Anyway, thank you for the new vignette. I read it and it is really helpful.
when I set the ubatch
object as Named num 1
(name == "batch1) and eliminate the code names(ubatch) <- paste("batch", ubatch, sep = "")
in the rankGeneDrugPerturbation
, the function which is included in the drugPerturbationSig
function, then the error below does not appear again.
Error in names(ubatch) <- paste("batch", ubatch, sep = "") : 'names' attribute [1] must be the same length as the vector [0]
However, another error below keeps popping up.
Error in mcfork() : unable to fork, possible reason: Cannot allocate memory
Is there any function to reduce the size of the data and run the funtion drugPerturbationSig
with it?
or is there a method to separate the Pharmacoset
according to the cell lines, ran the drugPerturbationSig
separately and integrate them into one perturbation signature (before calculating the connectivityscore
)?
or can I just run the function connectivityscore
with z-score, not the drug.perturbation
object? because there is a z-score file (level 5 data. a signature) available in GSE92742.
Hi @lovenicole,
From your sessionInfo() it looks like you are using an old version of PharmacoGx. As a start, please install the most recent release with BiocManager::install('PharmacoGx', version='3.12')
. When this is done try rerunning the drugPerturbationSig
function and see if the error persists.
If you want to subset the PSet before running drugPerturbationSig
, you can do so with the subsetTo
function, see ?subsetTo
. Alternatively, you can specify the cells
and/or drugs
argument to drugPerturbationSig
, which will do the subset before running. To merge the results, you will need to use the abind
function. The memory used by the drugPerturbationSig
function will also be proportional the the number of threads specified to nthread
parameter.
I would recommend subsetting on drug instead of cell line if your memory issues persist, as the resulting subset will be much smaller. You could do something like:
results <- list()
for (drug in drugNames(CMAP)) {
results[[drug]] <- drugPerturbationSig(CMAP, drugs=drug, nthread=2)
}
Best, Chris
Even though I updated the package, the same error keeps popping up.
Error in names(ubatch) <- paste("batch", ubatch, sep = "") : 'names' attribute [1] must be the same length as the vector [0]
I didn't finish the work yet, but thank you in advance! :) I really appreciate your kindness, although my English is not good.
Hi @lovenicole,
Happy to help. We are aware that the drugPerturbationSig function is slow and have some updates planned for it in the coming months.
In the mean time, I am glad it is now working for you.
Please don't hesitate to open another issue if you have questions or concerns.
Best, Chris
hi, I'm trying to create a pharmacoset with the latest updated LINCS data, since the current version of connectivity map pharmacoset is 2016. but I keep getting an error at the last step; using the
PharmacoSet
function.I followed the instructions described on the pdf file attached below. CreatingPharmacoSet.pdf
and I made all of the dataframes and expressionset. (here's my global environment)
but I got an error message without any details.
CMAP_2020 <- PharmacoSet("CMAP", molecularProfiles = phase_I_eset, cell = cell, drug = drug, curationDrug = curation_drug, curationCell = curation_cell, curationTissue = curation_tissue, datasetType = "perturbation", verify = T)
Can anyone help me?