lgatto / pRoloc

A unifying bioinformatics framework for organelle proteomics
http://lgatto.github.io/pRoloc/

Error in estep #149

Closed LojzaZ closed 4 months ago

LojzaZ commented 4 months ago

Hi, I am trying to run phenoDisco, but I get the following error:

Error

Iteration 1
Error: BiocParallel errors
  6 remote errors, element index: 1, 18, 35, 52, 69, 86
  94 unevaluated and other errors
  first remote error:
Error in estep(data = structure(c(-3.41993620164489, -0.302834260612289, : could not find function "estep"

I understand that the function estep is missing, but I do not know which package it should come from. From the tutorial (https://lgatto.github.io/pRoloc/articles/v01-pRoloc-tutorial.html), I assumed that only pRoloc and MSnbase are needed to run this. I have both, so I am confused. I am also loading the stats package, because otherwise I get a warning that stats may not be available.

The simplified code:

library(MSnbase)
library(pRoloc)
library(stats)

# load data = works
data <- readMSnSet(exprsFile = outQuan, #path to the data tsv file
                   featureDataFile = outMeta, #path to the data tsv file
                   phenoDataFile = outFrac, #path to the data tsv file
                   sep = "\t")

# predict the localization = works
svmres <- svmClassification(data,
                            fcol = "markers",
                            sigma = 0.1,
                            cost = 16)

# get localizations with given score = works
pred_filtered <- getPredictions(svmres, fcol = "svm", t = 0.5)

# run semiSup method = error
pheno_pred <- phenoDisco(pred_filtered,
                         GS = 10,
                         times = 100,
                         fcol = "svm.pred")

Session info:

R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=Czech_Czechia.utf8  LC_CTYPE=Czech_Czechia.utf8    LC_MONETARY=Czech_Czechia.utf8
[4] LC_NUMERIC=C                   LC_TIME=Czech_Czechia.utf8    

time zone: Europe/Prague
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pRoloc_1.44.0        BiocParallel_1.38.0  MLInterfaces_1.84.0  cluster_2.1.6        annotate_1.82.0     
 [6] XML_3.99-0.16.1      AnnotationDbi_1.66.0 IRanges_2.38.0       MSnbase_2.30.1       ProtGenerics_1.36.0 
[11] S4Vectors_0.42.0     mzR_2.38.0           Rcpp_1.0.12          Biobase_2.64.0       BiocGenerics_0.50.0 

loaded via a namespace (and not attached):
  [1] splines_4.4.0               filelock_1.0.3              tibble_3.2.1                cellranger_1.1.0           
  [5] hardhat_1.4.0               preprocessCore_1.66.0       pROC_1.18.5                 rpart_4.1.23               
  [9] lifecycle_1.0.4             httr2_1.0.1                 doParallel_1.0.17           globals_0.16.3             
 [13] lattice_0.22-6              MASS_7.3-60.2               MultiAssayExperiment_1.30.2 dendextend_1.17.1          
 [17] magrittr_2.0.3              limma_3.60.2                plotly_4.10.4               MsCoreUtils_1.16.0         
 [21] DBI_1.2.3                   RColorBrewer_1.1-3          lubridate_1.9.3             abind_1.4-5                
 [25] zlibbioc_1.50.0             GenomicRanges_1.56.0        purrr_1.0.2                 mixtools_2.0.0             
 [29] AnnotationFilter_1.28.0     nnet_7.3-19                 rappdirs_0.3.3              ipred_0.9-14               
 [33] lava_1.8.0                  GenomeInfoDbData_1.2.12     listenv_0.9.1               parallelly_1.37.1          
 [37] ncdf4_1.22                  codetools_0.2-20            DelayedArray_0.30.1         xml2_1.3.6                 
 [41] tidyselect_1.2.1            UCSC.utils_1.0.0            viridis_0.6.5               matrixStats_1.3.0          
 [45] BiocFileCache_2.12.0        jsonlite_1.8.8              caret_6.0-94                e1071_1.7-14               
 [49] survival_3.5-8              iterators_1.0.14            foreach_1.5.2               segmented_2.1-0            
 [53] tools_4.4.0                 progress_1.2.3              snow_0.4-4                  glue_1.7.0                 
 [57] prodlim_2023.08.28          gridExtra_2.3               SparseArray_1.4.8           xfun_0.44                  
 [61] MatrixGenerics_1.16.0       GenomeInfoDb_1.40.1         dplyr_1.1.4                 withr_3.0.0                
 [65] BiocManager_1.30.23         fastmap_1.2.0               fansi_1.0.6                 digest_0.6.35              
 [69] timechange_0.3.0            R6_2.5.1                    colorspace_2.1-0            gtools_3.9.5               
 [73] lpSolve_5.6.20              biomaRt_2.60.0              RSQLite_2.3.7               utf8_1.2.4                 
 [77] tidyr_1.3.1                 generics_0.1.3              hexbin_1.28.3               data.table_1.15.4          
 [81] recipes_1.0.10              FNN_1.1.4                   class_7.3-22                prettyunits_1.2.0          
 [85] PSMatch_1.8.0               httr_1.4.7                  htmlwidgets_1.6.4           S4Arrays_1.4.1             
 [89] ModelMetrics_1.2.2.2        pkgconfig_2.0.3             gtable_0.3.5                timeDate_4032.109          
 [93] blob_1.2.4                  impute_1.78.0               XVector_0.44.0              htmltools_0.5.8.1          
 [97] MALDIquant_1.22.2           clue_0.3-65                 scales_1.3.0                png_0.1-8                  
[101] gower_1.0.1                 knitr_1.47                  rstudioapi_0.16.0           reshape2_1.4.4             
[105] coda_0.19-4.1               nlme_3.1-164                curl_5.2.1                  proxy_0.4-27               
[109] cachem_1.1.0                stringr_1.5.1               parallel_4.4.0              mzID_1.42.0                
[113] vsn_3.72.0                  pillar_1.9.0                grid_4.4.0                  vctrs_0.6.5                
[117] pcaMethods_1.96.0           randomForest_4.7-1.1        dbplyr_2.5.0                xtable_1.8-4               
[121] mvtnorm_1.2-5               cli_3.6.2                   compiler_4.4.0              rlang_1.1.3                
[125] crayon_1.5.2                future.apply_1.11.2         LaplacesDemon_16.1.6        mclust_6.1.1               
[129] QFeatures_1.14.1            affy_1.82.0                 plyr_1.8.9                  stringi_1.8.4              
[133] viridisLite_0.4.2           munsell_0.5.1               Biostrings_2.72.1           lazyeval_0.2.2             
[137] Matrix_1.7-0                hms_1.1.3                   bit64_4.0.5                 future_1.33.2              
[141] ggplot2_3.5.1               KEGGREST_1.44.0             statmod_1.5.0               SummarizedExperiment_1.34.0
[145] kernlab_0.9-32              igraph_2.0.3                memoise_2.0.1               affyio_1.74.0              
[149] sampling_2.10               bit_4.0.5                   readxl_1.4.3               

lgatto commented 4 months ago

Hi @LojzaZ - thank you for reporting this issue. The estep function comes from the mclust package, which is already installed (it's listed in your session information). I can reproduce the error by running the phenoDisco example. I will investigate.

Also tagging @lmsimp.
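
For readers who hit a similar "could not find function" error, here is a minimal sketch of how one might check where a function lives. It uses only base R utilities and assumes mclust may or may not be installed; the variable name has_estep is illustrative.

```r
## Is mclust installed, and does it export a function called estep?
has_estep <- FALSE
if (requireNamespace("mclust", quietly = TRUE)) {
    has_estep <- "estep" %in% getNamespaceExports("mclust")
}
print(has_estep)

## Where (attached packages or loaded namespaces) is "estep" defined?
## An empty result means R cannot see it anywhere in this session.
print(getAnywhere("estep")$where)

## Is it visible on the search path, i.e. callable as plain estep()?
print(exists("estep", mode = "function"))
```

A function can be installed and even loaded via a namespace, as in the session information above, without being attached to the search path, which is when a bare call to it fails.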

lgatto commented 4 months ago

I found the issue and will fix the bug. In the meantime, simply loading and attaching mclust will fix the error:

library(pRoloc)
library(mclust)

## your code

phenoDisco(...)

LojzaZ commented 4 months ago

Great, thank you very much! Unfortunately, even if I attach the mclust package, I still get the error. Let's see if it works once you fix the bug.

lgatto commented 4 months ago

Oh, that's annoying, as my fix would be the programmatic equivalent of loading mclust. If you haven't done so, could you restart R, load the packages, and try again, please?

LojzaZ commented 4 months ago

Yes, I did, and I still get the same error. I can try installing all the packages on metacentrum, since I am planning to run it there anyway (for now I am just testing whether it runs on my laptop), and see if I get the error there.
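
One possible explanation, which nothing in this thread confirms, is that on Windows BiocParallel defaults to separate worker processes (a SnowParam back-end) that do not inherit packages attached in the main session, so library(mclust) would not reach them. A minimal sketch for testing that hypothesis, assuming BiocParallel is installed, is to register a serial back-end so that everything is evaluated in the current session:

```r
if (requireNamespace("BiocParallel", quietly = TRUE)) {
    library(BiocParallel)
    ## Evaluate everything in the current R session rather than in
    ## separate worker processes (a diagnostic step, not a fix).
    register(SerialParam())
    ## The default back-end is now the serial one.
    print(class(bpparam()))
}

## Then, in the same session:
## library(pRoloc)
## library(mclust)
## phenoDisco(...)   # same call as before
```

If the error disappears with the serial back-end, that would point to the workers rather than the main session. Running serially is slower, so this is only meant as a way to narrow down the cause.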

lgatto commented 4 months ago

Not sure what metacentrum is.

If you want to re-install packages, you can install the (hopefully) fixed version with

BiocManager::install("lgatto/pRoloc")

It's weird, as phenoDisco() runs on my computer after restarting, loading pRoloc and mclust, and running the example code in ?phenoDisco. Could you try that?
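
After reinstalling, it can help to confirm which versions R actually picks up, since the session information above shows the release version pRoloc_1.44.0. A minimal sketch using base R only; the package list and the names pkgs, installed and versions are illustrative:

```r
## Packages relevant to this issue
pkgs <- c("pRoloc", "mclust", "BiocParallel")

## Which of them can be loaded in this R installation?
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

## Versions of those that are available
versions <- vapply(pkgs[installed],
                   function(p) as.character(packageVersion(p)),
                   character(1))
print(versions)
```

Running this in a fresh R session, before and after the reinstall, shows whether the pRoloc version has actually changed.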

lgatto commented 4 months ago

Here's the exact code (minus the package startup messages) that worked, with the original pRoloc version:

> library(pRoloc)
> library(pRolocdata)
> data(tan2009r1)
> pdres <- phenoDisco(tan2009r1, fcol = "PLSDA")
Iteration 1
Stop worker failed with the error: reached CPU time limit
Error: BiocParallel errors
  2 remote errors, element index: 1, 11
  98 unevaluated and other errors
  first remote error:
Error in estep(data = structure(c(1.9527019824844, 2.02054612420664, 1.79267821932411, : could not find function "estep"
> library(mclust)
                   __           __ 
   ____ ___  _____/ /_  _______/ /_
  / __ `__ \/ ___/ / / / / ___/ __/
 / / / / / / /__/ / /_/ (__  ) /_  
/_/ /_/ /_/\___/_/\__,_/____/\__/   version 6.1.1
Type 'citation("mclust")' for citing this R package in publications.
> pdres <- phenoDisco(tan2009r1, fcol = "PLSDA")
Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Iteration 6
Iteration 7
Iteration 8
Iteration 9
Iteration 10
Iteration 11

lgatto commented 4 months ago

I have confirmed that with the latest version of the package, available with BiocManager::install("lgatto/pRoloc") (there have been a couple of updates, the latest being around noon), the code in ?phenoDisco works.

Thanks again for reporting the problem. I'm closing the issue now, but feel free to re-open it if you still run into the same problem with estep, estepEEE, ... not being available.

LojzaZ commented 4 months ago

Thanks, I'll try to reinstall and will see.