saezlab / cosmosR

COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets.
https://saezlab.github.io/cosmosR/
GNU General Public License v3.0

Saving results... Error in file(file, "rt") : cannot open the connection #17

Closed wmoldham closed 2 years ago

wmoldham commented 2 years ago

When running run_COSMOS_metabolism_to_signaling, after a lengthy calculation it errors out with the following message:

Saving results...
Error in file(file, "rt") : cannot open the connection
In addition: Warning messages:
1: In preprocessPriorKnowledgeNetwork(priorKnowledgeNetwork) :
  self loop(s) detected and removed from prior knowledge network.
2: In file(file, "rt") : 
 Error in file(file, "rt") : cannot open the connection

The result of the traceback is:

13.  file(file, "rt")
12.  read.table(file = file, header = header, sep = sep, quote = quote, dec = dec, fill = fill, comment.char = comment.char, ...)
11.  read.delim(file = solutionFileName)
10.  solversFunctions$solve(carnivalOptions)
9.  sendTaskToSolver(preparedForRun$variables, dataPreprocessed, carnivalOptions)
8.  solveCarnival(dataPreprocessed, carnivalOptions)
7.  CARNIVAL::runVanillaCarnival(perturbations = input_data, measurements = measured_data, priorKnowledgeNetwork = network, carnivalOptions = options)
6.  runCARNIVAL_wrapper(network = data$meta_network, input_data = disc_metabolic_data, measured_data = data$signaling_data, options = CARNIVAL_options)
5.  cosmosR::run_COSMOS_metabolism_to_signaling(reverse, carnival_options)
4.  as.data.frame(cosmos_res$weightedSIF)
3.  cosmosR::format_COSMOS_res(.)
2.  cosmosR::run_COSMOS_metabolism_to_signaling(reverse, carnival_options) %>% cosmosR::format_COSMOS_res() at 1_cosmos.R#125
1.  run_reverse(one, two)

I can successfully generate the network from the vignette using the functions that encapsulate this workflow. I have tried with all cores and with a few cores, with the same result, but not yet with a single core. I have tried setting workdir and outputFolder to different locations. I believe I am running the current GitHub versions of both CARNIVAL and cosmosR. Thank you for any advice that you may have in sorting out this issue, and thank you for making this resource available to the research community!

Session Info

```
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] McGarrity.2022.hypoxia.omics_0.0.0.9000 targets_0.10.0
[3] cosmosR_1.99.1 testthat_3.1.2
[5] devtools_2.4.3 usethis_2.1.5

loaded via a namespace (and not attached):
  [1] bitops_1.0-7 fs_1.5.2 bit64_4.0.5
  [4] RColorBrewer_1.1-2 progress_1.2.2 httr_1.4.2
  [7] rprojroot_2.0.2 GenomeInfoDb_1.30.1 backports_1.4.1
 [10] tools_4.1.2 utf8_1.2.2 R6_2.5.1
 [13] DBI_1.1.2 BiocGenerics_0.40.0 colorspace_2.0-3
 [16] withr_2.4.2 tidyselect_1.1.2 prettyunits_1.1.1
 [19] processx_3.5.2 bit_4.0.4 compiler_4.1.2
 [22] cli_3.1.1 Biobase_2.54.0 CARNIVAL_2.5.1
 [25] desc_1.4.0 stringfish_0.15.5 bookdown_0.24
 [28] scales_1.1.1 readr_2.1.2 callr_3.7.0
 [31] stringr_1.4.0 digest_0.6.29 rmarkdown_2.13
 [34] XVector_0.34.0 pkgconfig_2.0.3 htmltools_0.5.2
 [37] sessioninfo_1.2.2 bcellViper_1.30.0 fastmap_1.1.0
 [40] htmlwidgets_1.5.4 rlang_1.0.1 rstudioapi_0.13
 [43] RSQLite_2.2.10 visNetwork_2.1.0 RApiSerialize_0.1.0
 [46] generics_0.1.2 jsonlite_1.8.0 GOSemSim_2.20.0
 [49] dplyr_1.0.8 RCurl_1.98-1.6 magrittr_2.0.2
 [52] GO.db_3.14.0 GenomeInfoDbData_1.2.7 Rcpp_1.0.8.2
 [55] munsell_0.5.0 S4Vectors_0.32.3 fansi_1.0.2
 [58] lifecycle_1.0.1 yaml_2.3.5 stringi_1.7.6
 [61] zlibbioc_1.40.0 brio_1.1.3 pkgbuild_1.3.1
 [64] grid_4.1.2 blob_1.2.2 crayon_1.4.2
 [67] Biostrings_2.62.0 hms_1.1.1 KEGGREST_1.34.0
 [70] knitr_1.37 dorothea_1.6.0 ps_1.6.0
 [73] pillar_1.7.0 igraph_1.2.11 rjson_0.2.21
 [76] base64url_1.4 codetools_0.2-18 lpSolve_5.6.15
 [79] stats4_4.1.2 pkgload_1.2.4 glue_1.6.1
 [82] evaluate_0.15 RcppParallel_5.1.5 data.table_1.14.2
 [85] remotes_2.4.2 renv_0.15.4 BiocManager_1.30.16
 [88] png_0.1-7 vctrs_0.3.8 tzdb_0.2.0
 [91] gtable_0.3.0 purrr_0.3.4 tidyr_1.2.0
 [94] qs_0.25.3 assertthat_0.2.1 cachem_1.0.6
 [97] ggplot2_3.3.5 conflicted_1.1.0 xfun_0.30
[100] tibble_3.1.6 AnnotationDbi_1.56.2 memoise_2.0.1
[103] IRanges_2.28.0 ellipsis_0.3.2
```

gabora commented 2 years ago

Hi @wmoldham, thanks for the report. Could you please copy here the CARNIVAL options (carnival_options) that you used to run the above code? Based on the above info, I suspect you are using CPLEX. CPLEX writes some diagnostics to your screen. I am also wondering what the last line was; it should look similar to this: CPLEX> Solution pool written to file './test_model1/cplex1//result_t09_16_48d24_03_2022n99.txt'.

thanks

wmoldham commented 2 years ago

Sorry! Here they are:

path <- "analysis/carnival"
carnival_options <- cosmosR::default_CARNIVAL_options(solver = "cplex")
carnival_options$solverPath <- "/Applications/CPLEX_Studio201/cplex/bin/x86-64_osx/cplex"
carnival_options$outputFolder <- path
carnival_options$workdir <- path
carnival_options$mipGAP <- 0.05
carnival_options

Of course, the couple of different attempts I kicked off overnight all ran through, so I can't copy that last CPLEX line for you. The cplexCommand file for a run that didn't work included this output line: write analysis/carnival//result_t11_12_48d23_03_2022n69.txt sol all. I will follow up with this information if another attempt fails.

gabora commented 2 years ago

Ah! OK. So at least some are running fine! Are you running many instances at the same time (many COSMOS runs in parallel, or very small networks one after the other)?

It has happened to me with very small networks (used in tests) that two CARNIVAL runs got the same file name. This is very rare: the file names are generated from the current time (to the second) plus a random 2-digit number, but a collision is still possible. If this is the case, I will check again how the file names are generated and whether we can do better.
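To make the collision scenario concrete, here is a minimal sketch in R. It is not CARNIVAL's actual code: both helper names are hypothetical, and the naming pattern is only inferred from the file names quoted in this thread (for example result_t09_16_48d24_03_2022n99.txt). The second helper shows one possible alternative, base R's tempfile(), which returns a name that is not currently in use.

```r
# Hypothetical reconstruction of the naming scheme seen in this thread:
# a time stamp with one-second resolution plus a small random number.
# The suffix range 0:99 is an assumption.
timestamp_name <- function(prefix, dir, now = Sys.time()) {
  stamp <- format(now, "t%H_%M_%Sd%d_%m_%Y")
  suffix <- sample(0:99, 1)
  file.path(dir, paste0(prefix, "_", stamp, "n", suffix, ".txt"))
}

# Alternative sketch: let R choose a name that is not currently in use.
unique_name <- function(prefix, dir) {
  tempfile(pattern = paste0(prefix, "_"), tmpdir = dir, fileext = ".txt")
}

out_dir <- tempdir()
fixed_time <- as.POSIXct("2022-03-24 09:16:48", tz = "UTC")
example_name <- timestamp_name("result", out_dir, now = fixed_time)

a <- unique_name("result", out_dir)
b <- unique_name("result", out_dir)
```

Under these assumptions, two runs that start within the same second share the time stamp, so only the random suffix separates them.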

wmoldham commented 2 years ago

I have been running a single analysis at a time, but typically using multiple threads. The networks are pretty large and usually (and infuriatingly) the save issue happens after several hours. The successful run was also with a different dataset. I am restarting some more analyses and will update tonight when I see how things work out. Very grateful for your response and assistance!

gabora commented 2 years ago

@ivanovaos helped me out now. With large networks we usually run the optimization on a cluster because we would run out of memory (and swap) on our local computers. Actually, it happened to me that my job on the cluster was killed, because CPLEX used more than 120 GB (!) of memory. We suspect that something similar happens to you, too.

If the optimization goes well, you should see the following output: [image: CPLEX output]

I highlighted the line in which CPLEX reports where the results are written out.

Your original error is telling us that this file is not located there, which can only happen if CPLEX was somehow interrupted and the file was not written. (Sorry, we don't detect that yet; I have added it to my to-do list.)

So next time it fails, please try to locate the result file on your computer. I think it should not be there.
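One way to look for the result file from R is sketched below. The helper name is hypothetical, the pattern is inferred from the file names quoted in this thread, and the temporary folder only stands in for whatever carnival_options$outputFolder points to.

```r
# Hypothetical helper: list CPLEX result files in a CARNIVAL output folder.
find_result_files <- function(output_folder) {
  list.files(output_folder, pattern = "^result_.*\\.txt$", full.names = TRUE)
}

# Demo in a temporary folder that stands in for the real output folder.
demo_dir <- file.path(tempdir(), "carnival_locate_demo")
dir.create(demo_dir, showWarnings = FALSE)

found_before <- find_result_files(demo_dir)  # nothing has been written yet

# Simulate a solver run that did write its result file.
file.create(file.path(demo_dir, "result_t11_12_48d23_03_2022n69.txt"))
found_after <- find_result_files(demo_dir)
```

An empty result after a failed run would be consistent with the solver having been interrupted before it wrote its solution.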

So how to solve this issue?

In the preprocessing step there is a parameter, maximum_network_depth, which you could reduce from 15. Any node that is not reachable from the layers within maximum_network_depth steps will be removed. Try a value between 5 and 8.

I would only increase maximum_network_depth if the nodes cannot be connected from inputs to outputs.
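To illustrate what a depth limit does to a network, here is a toy sketch in base R. It is not cosmosR's implementation: the helper name and the example network are made up, and it only follows edges downstream from a set of start nodes.

```r
# Hypothetical helper: nodes reachable from `starts` in at most `max_depth`
# directed steps (breadth-first search over an edge list).
nodes_within_depth <- function(edges, starts, max_depth) {
  reached <- unique(starts)
  frontier <- reached
  for (step in seq_len(max_depth)) {
    next_nodes <- unique(edges$target[edges$source %in% frontier])
    frontier <- setdiff(next_nodes, reached)
    if (length(frontier) == 0) break
    reached <- c(reached, frontier)
  }
  reached
}

# Toy chain A -> B -> C -> D -> E.
edges <- data.frame(
  source = c("A", "B", "C", "D"),
  target = c("B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

keep <- nodes_within_depth(edges, starts = "A", max_depth = 2)

# Drop every edge that touches a node beyond the depth limit.
pruned <- edges[edges$source %in% keep & edges$target %in% keep, ]
```

With a depth of 2, only A, B and C survive, so the pruned network has two edges instead of four. A smaller depth means fewer nodes and edges reach the solver, which is why lowering the limit can reduce memory use.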

Please let us know if this helped

wmoldham commented 2 years ago

Ha! Yes, I think that must be what is happening. I can confirm that the result file is never written into the directory. Reviewing the log files from the past few uncompleted runs suggests I hit a nodefile size of ~180 GB before it crashed (~120 GB compressed). I will do as you suggest and reduce the network depth to something that can be managed on my hardware, and report back. This also explains why I'm able to complete the preprocessing steps and the toy model without difficulty.

gabora commented 2 years ago

OK, thanks for the quick report! I hope this solves the issue. I will add a meaningful error message for the case where the file is not found.
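Such a check could look roughly like the sketch below. This is an assumption about the shape of the fix, not the code that was added to CARNIVAL; the function name is hypothetical, and read.delim() is used only because it appears in the traceback above.

```r
# Hypothetical wrapper around read.delim() that fails with an explicit message
# instead of the generic "cannot open the connection".
read_solution <- function(solution_file) {
  if (!file.exists(solution_file)) {
    stop(
      "Solver result file not found: '", solution_file, "'. ",
      "The solver may have been interrupted (for example, by running out of ",
      "memory) before writing its solution.",
      call. = FALSE
    )
  }
  read.delim(file = solution_file)
}

# Demo with a path that does not exist.
missing_file <- file.path(tempdir(), "result_missing.txt")
msg <- tryCatch(read_solution(missing_file), error = function(e) conditionMessage(e))
```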

jmj21-ic commented 2 years ago

Hi there,

I am experiencing this same issue while using CARNIVAL in the context of PHONEMeS. Do you have any suggestions for a similar fix I might be able to use please?

CPLEX> Saving results...
Error in file(file, "rt") : cannot open the connection
In addition: Warning messages:
1: In dir.create(carnivalOptions$workdir, recursive = TRUE) :
  cannot create dir '', reason 'No such file or directory'
2: In dir.create(carnivalOptions$outputFolder, recursive = TRUE) :
  cannot create dir '', reason 'No such file or directory'
3: In file(file, "rt") :
  cannot open file '/rds/general/user/jmj21/home/Project2//result_t17_17_10d22_04_2022n59.txt': No such file or directory

Traceback:

9: file(file, "rt")
8: read.table(file = file, header = header, sep = sep, quote = quote, 
       dec = dec, fill = fill, comment.char = comment.char, ...)
7: read.delim(file = solutionFileName)
6: solversFunctions$solve(carnivalOptions)
5: sendTaskToSolver(preparedForRun$variables, dataPreprocessed, 
       carnivalOptions)
4: solveCarnival(dataPreprocessed, carnivalOptions)
3: runVanillaCarnival(perturbations = inputObj, measurements = measObj, 
       priorKnowledgeNetwork = netObj, weights = weightObj, carnivalOptions = opts)
2: CARNIVAL::runCARNIVAL(inputObj = inputObj, measObj = measObj, 
       netObj = netObj, solverPath = solverPath, solver = solver, 
       timelimit = timelimit, mipGAP = mipGAP, poolrelGAP = poolrelGAP, 
       dir_name = dir_name)
1: PHONEMeS::run_phonemes(inputObj = deregulated_kinases, measObj = deregulated_pps, 
       rmNodes = nc_kinases, netObj = phonemesPKN, solverPath = "/apps/cplex/20.1/cplex/bin/x86-64_linux/cplex", 
       solver = "cplex")

Many thanks,

Jess

gabora commented 2 years ago

Hi Jess,

Could you please confirm that you are using the latest CARNIVAL (v. 2.5.1)? It can be installed either from GitHub (https://github.com/saezlab/CARNIVAL) or from the devel version of Bioconductor (https://bioconductor.org/packages/devel/bioc/html/CARNIVAL.html).

Thanks, Attila

jmj21-ic commented 2 years ago

Hi Attila,

I am using version 2.5.1, which I installed from GitHub.

Many thanks, Jess

jmj21-ic commented 2 years ago

Hi,

Previously I had installed PHONEMeS and CARNIVAL independently. I just removed my installations of CARNIVAL and PHONEMeS and re-installed PHONEMeS only, letting CARNIVAL re-install as part of that process. This fixed the problem with cplex!

Many thanks,

Jess

gabora commented 2 years ago

Hi Jess,

The problem was that the dir_name argument in run_phonemes() was set to an empty string ("") by default, AND this passed the input checks in CARNIVAL. :/ As a result, CPLEX tried to write to the root folder, which is not accessible.

We fixed this yesterday on the PHONEMeS side in PR saezlab/PHONEMeS#9, and I added the corresponding check to CARNIVAL.
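The sketch below shows why an empty directory string ends up at the root folder, and what an input check for it might look like. The validator is hypothetical and is not the check that was added to CARNIVAL.

```r
# An empty directory name turns a relative file name into one at the root:
root_path <- file.path("", "cplexCommand.txt")  # "/cplexCommand.txt"

# Hypothetical validator for a directory option such as outputFolder or workdir.
validate_dir <- function(path, name = "outputFolder") {
  if (!is.character(path) || length(path) != 1 || is.na(path) ||
      !nzchar(trimws(path))) {
    stop("'", name, "' must be a non-empty directory path.", call. = FALSE)
  }
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  if (!dir.exists(path)) {
    stop("Could not create directory '", path, "'.", call. = FALSE)
  }
  normalizePath(path)
}

# A valid folder is created and returned; an empty string is rejected.
ok_dir <- validate_dir(file.path(tempdir(), "carnival_dir_demo"))
empty_result <- tryCatch(validate_dir(""), error = function(e) conditionMessage(e))
```

Rejecting the empty string up front avoids the later "Permission denied" or "No such file or directory" errors, which point at the symptom rather than at the option that caused it.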

BTW, from now on, PHONEMeS will install the most recent CARNIVAL.

Thanks for reporting this issue. Best, Attila

jmj21-ic commented 2 years ago

Hi Attila,

Sorry to bother you again! I am now performing a similar analysis with COSMOS and I have run into the same issue:


Saving LP file
Done: Saving LP file: /rds/general/user/jmj21/home/Project2/COSMOS//lpFile_t12_05_18d06_05_2022n56.lp
12:05:18 06.05.2022 Solving LP problem
Writing cplex command file
Error in file(file, ifelse(append, "a", "w")) :
  cannot open the connection
Calls: preprocess_COSMOS_signaling_to_metabolism ... <Anonymous> -> writeCplexCommandFile -> write -> cat -> file
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file '/cplexCommand_t12_05_18d06_05_2022n56.txt': Permission denied

I am currently using the GitHub version of COSMOS. Is it possible the same adjustment needs to be made here?

Many thanks,

Jess

gabora commented 2 years ago

Hi Jess,

This looks like the same issue, but I fixed it some time ago.

Did you update BiocManager after the new Bioconductor release (3.15)? https://bioconductor.org/news/bioc_3_15_release/ The code BiocManager::version() should return 3.15. If it gives 3.14, then I suspect that COSMOS re-installed the older version of CARNIVAL from the previous release. CARNIVAL should be version 2.6.0.

I think I should fix this bug in previous releases too, but I haven't done it yet.

Let me know if this was the issue.

PS: if you don't want to deal with updating your R version for the new Bioconductor release, then reinstall CARNIVAL from our GitHub.
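A quick way to check the installed version from R is sketched below. The helper name is hypothetical; the 2.6.0 threshold comes from the comment above.

```r
# Hypothetical helper: installed version of a package as a string,
# or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

carnival_version <- installed_version("CARNIVAL")

if (is.na(carnival_version)) {
  message("CARNIVAL is not installed")
} else if (utils::compareVersion(carnival_version, "2.6.0") < 0) {
  message("CARNIVAL ", carnival_version, " is older than 2.6.0")
} else {
  message("CARNIVAL ", carnival_version, " is 2.6.0 or newer")
}

# To check the Bioconductor release as well (needs the BiocManager package):
# BiocManager::version()
# To reinstall from GitHub (needs the remotes package):
# remotes::install_github("saezlab/CARNIVAL")
```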

jmj21-ic commented 2 years ago

Hi Attila,

Updating CARNIVAL has solved my issue - thank you!

Jess