Error in run_gene_precomputation_at_scale_round_2(pod_id = pod_id, gene_precomp_dir = dirs[["gene_precomp_dir"]], : task 2 failed - "cannot open the connection"

ggolczer commented 1 year ago

While running run_sceptre_high_moi function the function fails everytime when run_gene_precomputation_at_scale_round_2 giving the following error:

Error in run_gene_precomputation_at_scale_round_2(pod_id = pod_id, gene_precomp_dir = dirs[["gene_precomp_dir"]], : task 2 failed - "cannot open the connection" In addition: Warning messages: 1: In run_sceptre_high_moi_40cores(gene_matrix = matrix, B = 1000, : Removing the following genes with low expressions (UMI count <250): 2: In mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed, : scheduled cores 2, 4, 6, 9, 12, 13, 14, 15, 16, 17, 20, 22, 24, 27, 30, 32, 34, 37, 38, 39, 40 did not deliver results, all values of the jobs will be affected

R version 4.2.2 (2022-10-31) Platform: x86_64-conda-linux-gnu (64-bit) Running under: Ubuntu 16.04.6 LTS

Matrix products: default BLAS/LAPACK: /home/ubuntu/miniconda2/envs/new_sceptre/lib/libopenblasp-r0.3.21.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] magrittr_2.0.3 dplyr_1.1.0 fstcore_0.9.12 Matrix_1.5-3 sceptre_0.1.0

loaded via a namespace (and not attached): [1] Rcpp_1.0.10 lattice_0.20-45 codetools_0.2-19 fansi_1.0.4
[5] crayon_1.5.2 withr_2.5.0 utf8_1.2.3 foreach_1.5.2
[9] grid_4.2.2 R6_2.5.1 lifecycle_1.0.3 pillar_1.8.1
[13] rlang_1.0.6 cli_3.6.0 doParallel_1.0.17 vctrs_0.5.2
[17] generics_0.1.3 fst_0.9.8 iterators_1.0.14 glue_1.6.2
[21] parallel_4.2.2 compiler_4.2.2 pkgconfig_2.0.3 tidyselect_1.2.0 [25] tibble_3.1.8

It doesn't matter how many cores do I use. I had used an AWS instance of 371 Gb of RAM and 96 CPU cores. I had run SCEPTRE in its first versions and it works fine in this type of AWS instance in the past. Wanted to try the new version to streamline the analysis and can't make it past the 2nd round of gene precomputations.

timothy-barry commented 1 year ago

Hi,

Thanks for getting in touch. Before getting into solutions, I'll note that we will be releasing a substantial update to sceptre within the next few months. The update will circumvent reading/writing data to disk and will therefore very likely resolve your bug. So if you are able to wait a few months for an update, that might be the best solution.

That said, I do not this that this is an issue related to cores or memory. It is instead related to reading/writing data to disk. What is your regularization_amount parameter set to?

ggolczer commented 1 year ago

Hi Tim, thanks for taking the time. I actually have been using SCEPTRE since 2021 since it became available in preprint but notice the new version 0.1.0 a few days ago so I was eager to use it since it simplified the whole process. That said, I apologize for not putting the full command on my initial message. See it below:

result <- run_sceptre_high_moi(gene_matrix = matrix,B = 1000,combined_perturbation_matrix = gRNA_indicators_t_sparse,covariate_matrix = cov_matrix,gene_gRNA_group_pairs = group_pair_table[,c('gene_id','gRNA_group')],side = 'left', storage_dir = '/data/new_sceptre/new_sceptre_40' , full_output= TRUE, parallel=TRUE)

So regularization_amount is set to default = 0 as specified in the documentation.

That said, also want to clarify that this is a large screen , 200k cells and 2200 gRNA groups, I am also working on implementing the Nextflow pipeline (as recommended by you in the Readme) in our AWS Cloud 9 / AWS batch . But I will be commenting on the proper repository about that process as well.

Thanks again

timothy-barry commented 1 year ago

Hi Gabriel,

This is all good to hear. It sounds like you are working with some very exciting data. Could you try setting parallel to FALSE and seeing what happens?

I do think that the Nextflow pipeline would be well-suited to your data. Please feel free to open an issue on that repo, and we can discuss.

timothy-barry commented 1 year ago

Exchange continued over email.

Katsevich-Lab / sceptre

Error in run_gene_precomputation_at_scale_round_2(pod_id = pod_id, gene_precomp_dir = dirs[["gene_precomp_dir"]], : task 2 failed - "cannot open the connection" #23