saezlab / dorothea

R package to access DoRothEA's regulons
https://saezlab.github.io/dorothea/
GNU General Public License v3.0

Using ExpressionSet data objects from GEO #35

Closed allcatsaregrey closed 3 years ago

allcatsaregrey commented 3 years ago

I have generated an appropriate PKN using OmnipathR and imported an associated RNA-seq dataset (for example https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28392), but I encounter the following error:

Error in wts^2 : non-numeric argument to binary operator

I am unsure of what processing of the ExpressionSet object is necessary.

christianholland commented 3 years ago

Hi @allcatsaregrey,

with the information provided, it's hard to tell what causes the error.

One possible issue is that the PKN from OmniPath is not in the format that the run_viper() function expects. Is there a specific reason why you use the regulons from OmniPath and not the regulons shipped with the dorothea package?
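
For reference, here is a minimal sketch of the long format that the bundled dorothea_hs table uses (columns tf, confidence, target, mor). The rows are made up for illustration, and check_regulon_format() is a hypothetical helper, not a function from the dorothea package. It only checks that a custom network has the same columns before it is passed on.

```r
# Toy regulon table in the long format of dorothea_hs
# (columns tf, confidence, target, mor). Rows are illustrative only.
toy_regulons <- data.frame(
  tf = c("TP53", "TP53", "MYC"),
  confidence = c("A", "A", "B"),
  target = c("CDKN1A", "MDM2", "CCND1"),
  mor = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of dorothea): verify that a custom network
# has the expected columns and a numeric mode of regulation.
check_regulon_format <- function(network) {
  required <- c("tf", "confidence", "target", "mor")
  missing_cols <- setdiff(required, colnames(network))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!is.numeric(network$mor)) {
    stop("Column 'mor' must be numeric")
  }
  invisible(TRUE)
}

check_regulon_format(toy_regulons)
```

A network exported from another source would need to be reshaped into these columns first; a non-numeric mor column is one plausible way to end up with a "non-numeric argument" error, although that is an assumption and not confirmed here.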

allcatsaregrey commented 3 years ago

Hello, sorry for the vague question. I am trying to generate a PKN and TF activities from dorothea to run with CARNIVAL. I am grabbing a dataset from GEO in the ExpressionSet format using GEOquery. My code is as follows:

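library(GEOquery)  # getGEO()
library(dorothea)  # dorothea_hs, run_viper()
library(dplyr)     # filter(), %>%

# Download the series matrix; getGEO() returns a list of ExpressionSet objects
user_dat <- getGEO("GSE28392", GSEMatrix = TRUE)

# Select regulons with high confidence values of "A" and "B"
data(dorothea_hs, package = "dorothea")
regulons <- dorothea_hs %>% filter(confidence %in% c("A", "B"))

# Bulk RNAseq data can be processed this way
tf_activities <- run_viper(user_dat[[1]], regulons,
                           options = list(method = "scale", minsize = 4,
                                          eset.filter = FALSE, cores = 1,
                                          verbose = FALSE)) %>%
  # Coerce the result returned by run_viper() to a data frame
  as("data.frame")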

sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-redhat-linux-gnu (64-bit)
#> Running under: Fedora 33 (Workstation Edition)
#> 
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib64/libflexiblas.so.3.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
#>  [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
#>  [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.0.3  magrittr_2.0.1  tools_4.0.3     htmltools_0.5.0
#>  [5] yaml_2.2.1      stringi_1.5.3   rmarkdown_2.6   highr_0.8      
#>  [9] knitr_1.30      stringr_1.4.0   xfun_0.19       digest_0.6.27  
#> [13] rlang_0.4.9     evaluate_0.14

christianholland commented 3 years ago

Thanks for the example, this helps to trace back the error. The issue is that your ExpressionSet object (user_dat[[1]]) contains probe IDs as features/row names (e.g. 1405_i_at). The DoRothEA regulons list their targets as human gene symbols (HGNC), so you need to annotate those probe IDs with gene symbols before calling run_viper().
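
A minimal sketch of that annotation step, using base R and toy data. The probe IDs, expression values and the probe2symbol lookup are made up for illustration. For a real GEO ExpressionSet the matrix would typically come from Biobase's exprs() and the mapping from the platform annotation (for example a column of fData()), whose name depends on the platform. Averaging probes that map to the same symbol is one common choice here, not the only one.

```r
# Toy expression matrix with probe IDs as row names (values are made up).
expr <- matrix(
  c(5.1, 5.3,
    7.2, 7.0,
    6.8, 6.6,
    2.0, 2.2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("probe_1", "probe_2", "probe_3", "probe_4"),
                  c("sample_1", "sample_2"))
)

# Hypothetical probe-to-symbol lookup; probe_4 has no gene symbol.
probe2symbol <- c(probe_1 = "TP53", probe_2 = "MYC",
                  probe_3 = "MYC", probe_4 = NA)

# Look up the symbol of every row and drop probes without one.
symbols <- probe2symbol[rownames(expr)]
keep <- !is.na(symbols)

# Collapse probes that share a symbol by averaging them:
# sum per symbol, then divide each row by the number of probes.
sums <- rowsum(expr[keep, , drop = FALSE], group = symbols[keep])
counts <- as.vector(table(symbols[keep])[rownames(sums)])
gene_expr <- sums / counts

# gene_expr now has gene symbols as row names.
rownames(gene_expr)
```

The resulting matrix has gene symbols as row names, so its features can be matched against the target column of the regulons.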

Cheers, Christian