FertigLab / CoGAPS

Bayesian MCMC matrix factorization algorithm
https://www.bioconductor.org/packages/release/bioc/html/CoGAPS.html
BSD 3-Clause "New" or "Revised" License

Error when performing getPatternHallmarks: Error in 'collect()' #88

Closed LiuCanidk closed 5 months ago

LiuCanidk commented 5 months ago

Original code:

params <- new("CogapsParams")
params
getParam(params, "nPatterns")
params <- setParam(params, "nPatterns", 5)
getParam(params, "nPatterns")
cogapsresult <- CoGAPS(log(tpm.clean+1), params, outputFrequency = 10000)
cogapsresult
pm <- patternMarkers(cogapsresult, threshold="cut")
pm$PatternMarkers
hallmarks=getPatternHallmarks(cogapsresult)

Error:

Error in collect():
! Failed to collect lazy table.
Caused by error in db_collect():
! Arguments in ... must be used.
✖ Problematic argument:
• ..1 = Inf
ℹ Did you misspell an argument name?
Run rlang::last_trace() to see where the error occurred.

I don't know why this error occurred. Is it due to a version issue, or something else such as a package conflict? My sessionInfo is as follows:

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.utf8 LC_CTYPE=Chinese (Simplified)_China.utf8
[3] LC_MONETARY=Chinese (Simplified)_China.utf8 LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.utf8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] ggplotify_0.1.2 ComplexHeatmap_2.16.0 CoGAPS_3.23.2 ggplot2_3.5.0
[5] Seurat_5.0.3 SeuratObject_5.0.1 sp_2.1-3 phateR_1.0.7
[9] Matrix_1.6-5

loaded via a namespace (and not attached):
[1] fs_1.6.3 matrixStats_1.2.0 spatstat.sparse_3.0-3
[4] bitops_1.0-7 httr_1.4.7 RColorBrewer_1.1-3
[7] doParallel_1.0.17 tools_4.3.1 sctransform_0.4.1
[10] utf8_1.2.4 R6_2.5.1 lazyeval_0.2.2
[13] uwot_0.1.16 rhdf5filters_1.12.1 GetoptLong_1.0.5
[16] withr_3.0.0 prettyunits_1.2.0 gridExtra_2.3
[19] progressr_0.14.0 cli_3.6.2 Biobase_2.60.0
[22] textshaping_0.3.7 Cairo_1.6-2 spatstat.explore_3.2-7
[25] fastDummies_1.7.3 labeling_0.4.3 spatstat.data_3.0-4
[28] ggridges_0.5.6 pbapply_1.7-2 systemfonts_1.0.6
[31] yulab.utils_0.1.4 parallelly_1.37.1 rstudioapi_0.16.0
[34] RSQLite_2.3.6 generics_0.1.3 gridGraphics_0.5-1
[37] shape_1.4.6.1 gtools_3.9.5 ica_1.0-3
[40] spatstat.random_3.2-3 dplyr_1.1.4 fansi_1.0.6
[43] S4Vectors_0.38.2 abind_1.4-5 lifecycle_1.0.4
[46] SummarizedExperiment_1.30.2 gplots_3.1.3.1 rhdf5_2.44.0
[49] BiocFileCache_2.8.0 Rtsne_0.17 blob_1.2.4
[52] promises_1.2.1 crayon_1.5.2 miniUI_0.1.1.1
[55] lattice_0.22-6 msigdbr_7.5.1 cowplot_1.1.3
[58] KEGGREST_1.40.1 pillar_1.9.0 fgsea_1.26.0
[61] GenomicRanges_1.52.1 rjson_0.2.21 future.apply_1.11.2
[64] codetools_0.2-20 fastmatch_1.1-4 leiden_0.4.3.1
[67] glue_1.7.0 data.table_1.15.2 remotes_2.5.0
[70] vctrs_0.6.5 png_0.1-8 spam_2.10-0
[73] gtable_0.3.4 cachem_1.0.8 S4Arrays_1.0.6
[76] mime_0.12 survival_3.5-8 SingleCellExperiment_1.22.0
[79] iterators_1.0.14 fitdistrplus_1.1-11 ROCR_1.0-11
[82] nlme_3.1-164 bit64_4.0.5 progress_1.2.3
[85] filelock_1.0.3 RcppAnnoy_0.0.22 GenomeInfoDb_1.36.4
[88] irlba_2.3.5.1 KernSmooth_2.23-22 colorspace_2.1-0
[91] BiocGenerics_0.46.0 DBI_1.2.2 tidyselect_1.2.1
[94] processx_3.8.4 bit_4.0.5 compiler_4.3.1
[97] curl_5.2.1 xml2_1.3.6 desc_1.4.3
[100] DelayedArray_0.26.7 plotly_4.10.4 scales_1.3.0
[103] caTools_1.18.2 lmtest_0.9-40 callr_3.7.6
[106] rappdirs_0.3.3 stringr_1.5.1 digest_0.6.35
[109] goftest_1.2-3 spatstat.utils_3.0-4 XVector_0.40.0
[112] htmltools_0.5.8 pkgconfig_2.0.3 MatrixGenerics_1.12.3
[115] dbplyr_2.5.0 fastmap_1.1.1 rlang_1.1.3
[118] GlobalOptions_0.1.2 htmlwidgets_1.6.4 shiny_1.8.1
[121] farver_2.1.1 zoo_1.8-12 jsonlite_1.8.8
[124] BiocParallel_1.34.2 RCurl_1.98-1.14 magrittr_2.0.3
[127] GenomeInfoDbData_1.2.10 dotCall64_1.1-1 patchwork_1.2.0
[130] Rhdf5lib_1.22.1 munsell_0.5.1 Rcpp_1.0.12
[133] babelgene_22.9 reticulate_1.35.0 stringi_1.8.3
[136] zlibbioc_1.46.0 MASS_7.3-60.0.1 plyr_1.8.9
[139] pkgbuild_1.4.4 parallel_4.3.1 listenv_0.9.1
[142] ggrepel_0.9.5 forcats_1.0.0 deldir_2.0-4
[145] Biostrings_2.68.1 splines_4.3.1 tensor_1.5
[148] hms_1.1.3 circlize_0.4.16 ps_1.7.6
[151] igraph_2.0.3 spatstat.geom_3.2-9 RcppHNSW_0.6.0
[154] reshape2_1.4.4 biomaRt_2.56.1 stats4_4.3.1
[157] pkgload_1.3.4 XML_3.99-0.16.1 BiocManager_1.30.22
[160] foreach_1.5.2 httpuv_1.6.14 RANN_2.6.1
[163] tidyr_1.3.1 purrr_1.0.2 polyclip_1.10-6
[166] future_1.33.2 clue_0.3-65 scattermore_1.2
[169] xtable_1.8-4 RSpectra_0.16-1 later_1.3.2
[172] viridisLite_0.4.2 ragg_1.3.0 tibble_3.2.1
[175] memoise_2.0.1 AnnotationDbi_1.62.2 IRanges_2.34.1
[178] cluster_2.1.6 globals_0.16.3
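
A note on narrowing this down: the error names db_collect(), which is a dbplyr generic, and the sessionInfo above lists dbplyr_2.5.0, BiocFileCache_2.8.0 and biomaRt_2.56.1 among the loaded namespaces. Whether a mismatch between these packages is the actual cause is an assumption and is not confirmed in this thread. A minimal sketch for reporting their installed versions, so they can be compared with a working environment (report_versions is a hypothetical helper name):

```r
# Packages suspected to be on the call path. This list is an assumption
# based on the sessionInfo above, not a confirmed diagnosis.
suspects <- c("biomaRt", "BiocFileCache", "dbplyr")

# Return the installed version of each package, or NA if it is not installed.
report_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

print(report_versions(suspects))
```

Running the same call inside an environment where getPatternHallmarks works and diffing the two outputs would show which packages differ.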

dimalvovs commented 5 months ago

Thanks for posting this. I was able to generate an error on the test data using your code, although the error is different. It definitely looks related to #87.

data(GIST)
tpm.clean <- GIST.matrix
params <- new("CogapsParams")
params
getParam(params, "nPatterns")
params <- setParam(params, "nPatterns", 5)
getParam(params, "nPatterns")
cogapsresult <- CoGAPS(log(tpm.clean+1), params, outputFrequency = 10000)
cogapsresult
pm <- patternMarkers(cogapsresult, threshold="cut")
pm$PatternMarkers
hallmarks<-getPatternHallmarks(cogapsresult)

yields

Error: Your query has been redirected to https://status.ensembl.org indicating this Ensembl service is currently unavailable.
Look at ?useEnsembl for details on how to try a mirror site.

followed by

Ensembl site unresponsive, trying useast mirror
Ensembl site unresponsive, trying asia mirror
Error in .chooseEnsemblMirror(mirror = mirror, httr_config = httr_config) : 
  Unable to query any Ensembl site

Was querying Ensembl successful for you, @LiuCanidk?
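
To check whether any Ensembl mirror is reachable from a given machine, something like the following sketch could be used. try_mirrors is a hypothetical helper, not part of CoGAPS or biomaRt; its connect argument is any function that takes a mirror name and either returns a connection or throws an error. The commented usage assumes biomaRt::useEnsembl with the human genes dataset, which may not be what getPatternHallmarks queries internally.

```r
# Try each mirror in turn and return the first one that connects.
# 'connect' is supplied by the caller, so the helper itself needs no network.
try_mirrors <- function(mirrors, connect) {
  for (m in mirrors) {
    # Treat any error from the connection attempt as "mirror unavailable".
    mart <- tryCatch(connect(m), error = function(e) NULL)
    if (!is.null(mart)) {
      return(list(mirror = m, mart = mart))
    }
  }
  stop("Unable to connect to any of the supplied mirrors")
}

# Example usage (assumption: requires biomaRt and network access):
# res <- try_mirrors(
#   c("www", "useast", "asia"),
#   function(m) biomaRt::useEnsembl(biomart = "genes",
#                                   dataset = "hsapiens_gene_ensembl",
#                                   mirror = m)
# )
# res$mirror
```

This only tests connectivity; it does not change which mirror getPatternHallmarks itself uses.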

LiuCanidk commented 5 months ago

@dimalvovs Thanks for your quick reply. I have no access to http://status.ensembl.org, but I can visit the other mirrors.

Besides, I also used the GIST test data, and it yielded exactly the same collect() error (failed to collect lazy table). This time I ran only CoGAPS and the getPatternHallmarks function:

cogapsresult=CoGAPS(log1p(GIST.matrix), params, outputFrequency = 10000)
hallmarks=getPatternHallmarks(cogapsresult)

So it may not be associated with the Ensembl query, but rather with my environment (same code, same data, but different errors)? I'm not sure whether I am missing some components of the package, but all other functions, including the key function CoGAPS, worked fine. Could there be a package conflict?

dimalvovs commented 5 months ago

@LiuCanidk I still have trouble getting past the Ensembl wall, but to rule out possible local conflicts you may try running it in the container:

docker run -it --entrypoint /bin/bash ghcr.io/fertiglab/cogaps:3.21.5

dimalvovs commented 5 months ago

I've just been able to get the hallmarks from GIST.matrix in Docker, but not yet outside of it:

> library(CoGAPS)
> data(GIST)
> tpm.clean <- GIST.matrix
> params <- new("CogapsParams")
> params <- setParam(params, "nPatterns", 5)
> cogapsresult <- CoGAPS(log(tpm.clean+1), params, outputFrequency = 10000)

This is CoGAPS version 3.21.5 
Running Standard CoGAPS on log(tpm.clean + 1) (1363 genes and 9 samples) with parameters:

-- Standard Parameters --
nPatterns            5 
nIterations          50000 
seed                 549 
sparseOptimization   FALSE 

-- Sparsity Parameters --
alpha          0.01 
maxGibbsMass   100 

Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
-- Equilibration Phase --
10000 of 50000, Atoms: 2557(A), 74(P), ChiSq: 5937, Time: 00:00:10 / 00:02:08
20000 of 50000, Atoms: 2835(A), 76(P), ChiSq: 4049, Time: 00:00:21 / 00:02:03
30000 of 50000, Atoms: 3044(A), 85(P), ChiSq: 3675, Time: 00:00:33 / 00:02:04
40000 of 50000, Atoms: 2905(A), 90(P), ChiSq: 3708, Time: 00:00:45 / 00:02:03
50000 of 50000, Atoms: 2978(A), 90(P), ChiSq: 3663, Time: 00:00:57 / 00:02:02
-- Sampling Phase --
10000 of 50000, Atoms: 2913(A), 84(P), ChiSq: 3667, Time: 00:01:10 / 00:02:02
20000 of 50000, Atoms: 2929(A), 93(P), ChiSq: 3705, Time: 00:01:23 / 00:02:02
30000 of 50000, Atoms: 2946(A), 85(P), ChiSq: 3651, Time: 00:01:36 / 00:02:02
40000 of 50000, Atoms: 3021(A), 83(P), ChiSq: 3697, Time: 00:01:49 / 00:02:02
50000 of 50000, Atoms: 2938(A), 88(P), ChiSq: 3562, Time: 00:02:02 / 00:02:02
> pm <- patternMarkers(cogapsresult, threshold="cut")
Warning message:
In sweep(As, 2, pscale, FUN = "*") :
  STATS is longer than the extent of 'dim(x)[MARGIN]'
> hallmarks<-getPatternHallmarks(cogapsresult)
Warning message:
In sweep(As, 2, pscale, FUN = "*") :
  STATS is longer than the extent of 'dim(x)[MARGIN]'
> hallmarks
[[1]]
                                       pathway pval padj overlap size
 1:                      HALLMARK_ADIPOGENESIS    1    1       0  200
 2:               HALLMARK_ALLOGRAFT_REJECTION    1    1       0  200
 3:                 HALLMARK_ANDROGEN_RESPONSE    1    1       0  100
 4:                      HALLMARK_ANGIOGENESIS    1    1       0   36
 5:                   HALLMARK_APICAL_JUNCTION    1    1       0  199
 6:                    HALLMARK_APICAL_SURFACE    1    1       0   44
 7:                         HALLMARK_APOPTOSIS    1    1       0  161
 8:              HALLMARK_BILE_ACID_METABOLISM    1    1       0  112
 9:           HALLMARK_CHOLESTEROL_HOMEOSTASIS    1    1       0   74
10:                       HALLMARK_COAGULATION    1    1       0  138
11:                        HALLMARK_COMPLEMENT    1    1       0  200

LiuCanidk commented 5 months ago

@dimalvovs Hi, I just reran the code you provided and the same error occurred, but I found some differences in my warnings at the patternMarkers step:

Warning messages:
1: Not a validObject(): no slot of name "checkpointInterval" for this object of class "CogapsParams"
2: In sweep(As, 2, pscale, FUN = "*") :
  STATS is longer than the extent of 'dim(x)[MARGIN]'

Warning 2 is the same as yours, but warning 1 is new: no checkpointInterval in CogapsParams. I checked for this, and it was zero. I guess there must be some local discrepancy between our environments.

As for your advice, I'm not very familiar with using containers. I plan to run the container on the server using Singularity, but got an invalid container ref for "ghcr.io/fertiglab/cogaps:3.21.5". After searching for containers in the CoGAPS guide, I found only a PyCoGAPS container. Is there any R cogaps container?

LiuCanidk commented 5 months ago

@dimalvovs I just ran the code on a distributed server (Linux) with CoGAPS 3.22.0. It worked well on the login node but failed on the compute node, where I encountered the same Ensembl query error that you mentioned. I guess there is an issue with the internet proxy.

How did you fix the failed Ensembl query?
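
If a proxy is the suspect, one quick comparison between the login node and the compute node is to print the proxy-related environment variables that R sees on each. This is only a sketch: proxy_settings is a hypothetical helper, and the variable names are the conventional ones, which a given cluster may or may not use.

```r
# Collect the conventional proxy environment variables as seen by R.
# Unset variables are reported as NA.
proxy_settings <- function() {
  vars <- c("http_proxy", "https_proxy", "no_proxy",
            "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")
  Sys.getenv(vars, unset = NA)
}

print(proxy_settings())
```

If the two nodes report different values, that difference is a reasonable first thing to raise with the cluster administrators.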

dimalvovs commented 5 months ago

@LiuCanidk It just worked on attempt number X, and I know that the issue is being handled in #89.

dimalvovs commented 5 months ago

@LiuCanidk I missed your question, sorry

Is there any R cogaps container?

yes, here

dimalvovs commented 5 months ago

Closing this issue, as getPatternHallmarks is deprecated (#92).