immunogenomics / harmony

Fast, sensitive and accurate integration of single-cell data with Harmony
https://portals.broadinstitute.org/harmony/

Warning message for Quick-TRANSfer #25

Open QqQss opened 5 years ago

QqQss commented 5 years ago

When I run 'RunHarmony' on a Seurat object, it returns a warning message:

Quick-TRANSfer stage steps exceeded maximum (= 6687600)

What is going on here? Will this affect the final result?

QqQss commented 5 years ago

I found a possible solution: https://stackoverflow.com/questions/21382681/kmeans-quick-transfer-stage-steps-exceeded-maximum

This warning may be caused by the kmeans algorithm. Could you try changing the default algorithm?
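For anyone landing here, this is a minimal sketch of what "changing the algorithm" looks like with base R's stats::kmeans, which is where the linked answer locates the warning. It runs on a synthetic matrix standing in for a PCA embedding (an assumption for illustration), and it says nothing about which of these options Harmony itself exposes.

```r
# Synthetic stand-in for a cells-by-PCs embedding; not real single-cell data.
set.seed(42)
emb <- matrix(rnorm(2000 * 20), nrow = 2000, ncol = 20)

# The default algorithm (Hartigan-Wong) is the one with a Quick-TRANSfer stage.
# Raising iter.max and nstart is one way to give it more room.
km_hw <- kmeans(emb, centers = 10, iter.max = 100, nstart = 5)

# MacQueen (or Lloyd) has no Quick-TRANSfer stage, so this specific
# warning cannot be raised by it.
km_mq <- kmeans(emb, centers = 10, iter.max = 100, nstart = 5,
                algorithm = "MacQueen")

# ifault is 0 when no algorithm problem was flagged for the returned fit.
print(km_hw$ifault)
```

Both calls return an ordinary kmeans object, so the cluster assignments and centers can be compared directly if you want to see how much the choice of algorithm matters on your own embedding.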

ilyakorsunsky commented 5 years ago

Thanks for finding this @QqQss! I will modify the code to allow you to try more iterations.

ersgupta commented 5 years ago

Curious, is this issue resolved? Also, do these warnings have any effect on the final result?

kaizen89 commented 4 years ago

@ilyakorsunsky I am still getting this warning, is this normal? Is the output affected?

Harmony converged after 3 iterations
Warning messages:
1: Quick-TRANSfer stage steps exceeded maximum (= 7022450) 

Thanks

ilyakorsunsky commented 4 years ago

Dear @QqQss, @ersgupta, and @kaizen89,

I modified the RunHarmony and HarmonyMatrix functions to allow you to specify the number of random restarts and the number of iterations for the kmeans initialization. If you get the "Quick-TRANSfer stage steps exceeded maximum" warning, you can try increasing the maximum number of iterations and the number of restarts:

obj <- RunHarmony(obj, 'dataset', kmeans_init_nstart=20, kmeans_init_iter_max=100)

Hope that helps!

Best, Ilya

madisonlayfield commented 4 years ago

I am having this same error. I am using the HarmonyMatrix function and I tried to add your suggestion to my line of code, but I am getting an "unused argument" error for (kmeans_init_nstart=20, kmeans_init_iter_max=100).

Thanks.

amjass12 commented 4 years ago

One question I have: Harmony continues to run and also converges even when this warning is present. Can this warning therefore be ignored? Thanks

yingyonghui commented 4 years ago

Hi @ilyakorsunsky, thanks for your excellent tool. I just came across the same warning message and checked the parameters with "?RunHarmony". I'm wondering whether you mean adjusting the argument "kmeans_init_iter_max", "max.iter.cluster", or even "max.iter.harmony"?

Best, Yingyong

mcap91 commented 2 years ago

I just did some troubleshooting following all of these posts, and it turns out my error was coming from samples with Idents() set to NA. I was using this NA-containing variable to integrate with Harmony. Once I set the NA idents to something (I used "unknown" because the cells were not labeled), this warning message disappeared.
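A minimal sketch of that check, using a plain data frame as a stand-in for the object's metadata. The column name "dataset" and the values are made up for illustration; on a real Seurat object the same idea would apply to whichever metadata column is passed as the grouping variable.

```r
# Hypothetical metadata table; "dataset" stands in for the grouping variable.
meta <- data.frame(
  dataset = c("a", "b", NA, "a", NA),
  stringsAsFactors = FALSE
)

# Count missing labels before integrating.
n_missing <- sum(is.na(meta$dataset))
print(n_missing)

# Replace missing labels with an explicit placeholder level.
meta$dataset[is.na(meta$dataset)] <- "unknown"
print(table(meta$dataset))
```

If the column is a factor rather than a character vector, the placeholder has to be added to its levels first (or the column converted with as.character()) before the assignment will work.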

matasV99 commented 2 years ago

Hi,

After modifying the code like this: "obj <- RunHarmony(obj, 'dataset', kmeans_init_nstart=20, kmeans_init_iter_max=100)"

I get a very weird convergence plot. Any idea what this means?

[image: Untitled (13)]

Thank you, Matas

NadineBestard commented 1 year ago

@matasV99

This kind of output is discussed in https://github.com/immunogenomics/harmony/issues/48 and in a few other closed issues. It seems that there is no solution for it yet.

NadineBestard commented 1 year ago

When I tried to add the kmeans_init_nstart argument I got the same problem as @madisonlayfield (https://github.com/immunogenomics/harmony/issues/25#issuecomment-612697181). Creating a Seurat object from my matrix, metadata and PCA so that I could use RunHarmony() instead worked. It looks like @ilyakorsunsky forgot to update HarmonyMatrix().

PS: I was working on the devel version of the package; updating to it was the first thing I tried. I am not sure whether that is also a requirement for RunHarmony() to work, as the documentation has not been updated.

cenk-celik commented 1 year ago

Running gc(full = TRUE) before RunHarmony() worked for me.

umutcakir commented 1 year ago

I received the same warning message. I tried setting kmeans_init_iter_max and kmeans_init_nstart, and I ran gc(), but I still receive the same warning. Does this warning affect the result?

matasV99 commented 11 months ago

Hi @umutcakir @NadineBestard and others,

I revisited this problem after some time off. I am not a developer of this, so this is just my limited understanding of the problem.

Tl;dr: I did two things: a) increased the number of highly variable features I calculated my PCA latent space on and b) increased the number of random starts for the kmeans algorithm.

Longer answer: the warning "1: Quick-TRANSfer stage steps exceeded maximum (= 9575900)" means that "In rare cases, when some of the points (rows of ‘x’) are extremely close, the algorithm may not converge in the “Quick-Transfer” stage" (source: https://stackoverflow.com/questions/21382681/kmeans-quick-transfer-stage-steps-exceeded-maximum). My interpretation is that the points in the latent PCA space are "too close to each other". In my case, I was running PCA in Seurat with highly variable features that did not separate my data well enough (I only used 3000 and I had 150k cells). Once I ran PCA and Harmony with all features as input, Harmony started converging. I still wanted to run Harmony with highly variable features, as that has been shown to improve batch correction and bio-signal preservation (source: https://www.nature.com/articles/s41592-021-01336-8). Here are the functions I ran:

merged <- RunPCA(object = merged, features = top4000_features, verbose = FALSE)

merged <- RunHarmony(merged, assay.use = "SCT", group.by.vars = "model", kmeans_init_nstart = 100, kmeans_init_iter_max = 5000, plot_convergence = TRUE, verbose = TRUE, epsilon.cluster = -Inf, epsilon.harmony = -Inf, max.iter.harmony = 50)
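One rough way to probe the "points too close to each other" interpretation is to count exact and near-duplicate rows in the embedding before clustering. The sketch below does this on a synthetic matrix with duplicates planted on purpose. The rounding tolerance of 3 decimals is an arbitrary assumption, and this is only a diagnostic idea, not something Harmony does internally as far as this thread shows.

```r
# Synthetic stand-in for a cells-by-PCs embedding.
set.seed(1)
emb <- matrix(rnorm(500 * 10), nrow = 500, ncol = 10)

# Plant 50 exact duplicate rows purely for illustration.
emb[101:150, ] <- emb[1:50, ]

# Rows that exactly repeat an earlier row.
n_exact <- sum(duplicated(emb))

# Rows that repeat an earlier row after rounding (crude near-duplicate check).
n_near <- sum(duplicated(round(emb, 3)))

print(c(exact = n_exact, near = n_near))
```

On a real object the matrix would be the PCA cell embeddings; a large count relative to the number of cells would be consistent with the explanation quoted above, though it does not prove it.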

pati-ni commented 11 months ago

https://github.com/immunogenomics/harmony/issues/25#issuecomment-1261843791

If I can add my 2 cents here, the warnings are most likely unrelated to the convergence plot. The warnings are emitted by the kmeans call in R user space.

For the convergence, I would suggest using the latest version of Harmony, which mitigates these types of situations.

As for the warnings: this is the initialization procedure for the cluster centroids. It is known that the initial kmeans centroid positions might affect convergence, but it is hard to predict the outcome as this will be very data dependent. Having said that, this is only the initial position of the centroids, and Harmony moves them substantially during the main Harmony routine.
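Since the warning comes from an ordinary R-level call, it can be recorded without interrupting the run. The helper below is a hypothetical convenience wrapper (collect_warnings is not part of Harmony or base R) built on base R's withCallingHandlers, shown here around stats::kmeans on synthetic data.

```r
# Hypothetical helper: evaluate an expression, keep its value, and
# collect any warning messages instead of printing them at the end.
collect_warnings <- function(expr) {
  msgs <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")  # continue after recording
    }
  )
  list(value = value, warnings = msgs)
}

# Synthetic data standing in for an embedding.
set.seed(7)
emb <- matrix(rnorm(300 * 5), nrow = 300, ncol = 5)

res <- collect_warnings(kmeans(emb, centers = 3, iter.max = 50, nstart = 3))
print(res$warnings)
```

The same wrapper could be put around a RunHarmony() call to log whether the Quick-TRANSfer message appeared for a given parameter setting, which makes it easier to compare runs.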

We are considering more robust and faster initialization methods for this step which would remove the kmeans and the warnings but the solution is under active development.

Hopefully this will address some of the concerns regarding the warnings.

me-orlov commented 11 months ago

Hello everyone and thank you for this discussion!

Just wanted to chime in and say that I am also having the issue with weird convergence plots. I have downloaded the latest developer version of Harmony, but the convergence issues sadly persist in my case. Is there any consensus on whether it is acceptable to use the data associated with convergence plots like the one above? Can anything else be done to mitigate?

Thank you in advance!

pati-ni commented 11 months ago

Hi @me-orlov,

What do you mean by developer version? The devel branch is defunct upstream. Can you confirm the actual loaded version in your sessionInfo()?
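A short way to report just the installed version, as an alternative to pasting the full sessionInfo() output. The helper name loaded_version is made up for this sketch; it only relies on base R's requireNamespace and utils::packageVersion.

```r
# Hypothetical helper: installed version of a package as a string,
# or NA if the package is not installed.
loaded_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

print(loaded_version("harmony"))
```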

pati-ni commented 10 months ago

@me-orlov

Also, please post the convergence plot

me-orlov commented 9 months ago

Hello pati-ni! My apologies for the late reply; I only saw this now. However, as the issue is still relevant, I would greatly appreciate it if you could help me out. The version of Harmony that I am running, according to sessionInfo(), is harmony_1.2.0.

My plot looks like this. I am applying Harmony to spatial data.

[image: Screenshot_2024-01-16_17-21-03]

pati-ni commented 9 months ago

@me-orlov we are still in the process of solving this type of behavior. One thing you could try is to set lambda to NULL so it does some parameter tuning during runtime.

yingyonghui commented 8 months ago

Hi all, is there any update on this issue? Any suggestions would be appreciated!

Thanks!

pati-ni commented 8 months ago

Hi @yingyonghui

Please try out setting lambda=NULL. We have more features coming up to deal with this behavior properly.

YiweiNiu commented 8 months ago

Hey, I tried setting lambda=NULL but this warning persisted. The session info:

> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] harmony_1.2.0                           Rcpp_1.0.11                             SeuratWrappers_0.3.2                   
 [4] Homo.sapiens_1.3.1                      TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 org.Hs.eg.db_3.18.0                    
 [7] GO.db_3.18.0                            OrganismDbi_1.44.0                      GenomicFeatures_1.54.1                 
[10] GenomicRanges_1.54.1                    GenomeInfoDb_1.38.1                     AnnotationDbi_1.64.1                   
[13] IRanges_2.36.0                          S4Vectors_0.40.2                        Biobase_2.62.0                         
[16] BiocGenerics_0.48.1                     RColorBrewer_1.1-3                      viridis_0.6.4                          
[19] viridisLite_0.4.2                       Seurat_5.0.1                            SeuratObject_5.0.1                     
[22] sp_2.1-2                                ggsci_3.0.0                             cowplot_1.1.1                          
[25] patchwork_1.1.3                         scattermore_1.2                         lubridate_1.9.3                        
[28] forcats_1.0.0                           stringr_1.5.1                           dplyr_1.1.3                            
[31] purrr_1.0.2                             readr_2.1.4                             tidyr_1.3.0                            
[34] tibble_3.2.1                            ggplot2_3.4.4                           tidyverse_2.0.0                        
[37] workflowr_1.7.1                        

loaded via a namespace (and not attached):
  [1] fs_1.6.3                    matrixStats_1.1.0           spatstat.sparse_3.0-3       bitops_1.0-7               
  [5] httr_1.4.7                  tools_4.3.2                 sctransform_0.4.1           utf8_1.2.4                 
  [9] R6_2.5.1                    lazyeval_0.2.2              uwot_0.1.16                 withr_2.5.2                
 [13] prettyunits_1.2.0           gridExtra_2.3               progressr_0.14.0            textshaping_0.3.7          
 [17] cli_3.6.1                   spatstat.explore_3.2-5      fastDummies_1.7.3           labeling_0.4.3             
 [21] spatstat.data_3.0-3         ggridges_0.5.4              pbapply_1.7-2               systemfonts_1.0.5          
 [25] Rsamtools_2.18.0            R.utils_2.12.3              parallelly_1.36.0           rstudioapi_0.15.0          
 [29] RSQLite_2.3.4               generics_0.1.3              BiocIO_1.12.0               ica_1.0-3                  
 [33] spatstat.random_3.2-2       Matrix_1.6-4                fansi_1.0.5                 abind_1.4-5                
 [37] R.methodsS3_1.8.2           lifecycle_1.0.3             whisker_0.4.1               yaml_2.3.7                 
 [41] SummarizedExperiment_1.32.0 SparseArray_1.2.2           BiocFileCache_2.10.1        Rtsne_0.17                 
 [45] grid_4.3.2                  blob_1.2.4                  promises_1.2.1              crayon_1.5.2               
 [49] miniUI_0.1.1.1              lattice_0.21-9              KEGGREST_1.42.0             pillar_1.9.0               
 [53] knitr_1.45                  rjson_0.2.21                future.apply_1.11.0         codetools_0.2-19           
 [57] leiden_0.4.3.1              glue_1.6.2                  getPass_0.2-2               remotes_2.4.2.1            
 [61] data.table_1.14.8           vctrs_0.6.4                 png_0.1-8                   spam_2.10-0                
 [65] gtable_0.3.4                cachem_1.0.8                xfun_0.41                   S4Arrays_1.2.0             
 [69] mime_0.12                   survival_3.5-7              ellipsis_0.3.2              fitdistrplus_1.1-11        
 [73] ROCR_1.0-11                 nlme_3.1-163                bit64_4.0.5                 progress_1.2.3             
 [77] filelock_1.0.3              RcppAnnoy_0.0.21            rprojroot_2.0.4             irlba_2.3.5.1              
 [81] KernSmooth_2.23-22          colorspace_2.1-0            DBI_1.1.3                   tidyselect_1.2.0           
 [85] processx_3.8.2              bit_4.0.5                   compiler_4.3.2              curl_5.1.0                 
 [89] git2r_0.33.0                graph_1.80.0                xml2_1.3.6                  DelayedArray_0.28.0        
 [93] plotly_4.10.3               rtracklayer_1.62.0          scales_1.3.0                lmtest_0.9-40              
 [97] RBGL_1.78.0                 callr_3.7.3                 rappdirs_0.3.3              digest_0.6.33              
[101] goftest_1.2-3               spatstat.utils_3.0-4        rmarkdown_2.25              RhpcBLASctl_0.23-42        
[105] XVector_0.42.0              htmltools_0.5.6.1           pkgconfig_2.0.3             MatrixGenerics_1.14.0      
[109] dbplyr_2.4.0                fastmap_1.1.1               rlang_1.1.1                 htmlwidgets_1.6.2          
[113] shiny_1.8.0                 farver_2.1.1                zoo_1.8-12                  jsonlite_1.8.8             
[117] BiocParallel_1.36.0         R.oo_1.25.0                 RCurl_1.98-1.13             magrittr_2.0.3             
[121] GenomeInfoDbData_1.2.11     dotCall64_1.1-1             munsell_0.5.0               reticulate_1.34.0          
[125] stringi_1.8.2               zlibbioc_1.48.0             MASS_7.3-60                 plyr_1.8.9                 
[129] parallel_4.3.2              listenv_0.9.0               ggrepel_0.9.4               deldir_2.0-2               
[133] Biostrings_2.70.1           splines_4.3.2               tensor_1.5                  hms_1.1.3                  
[137] ps_1.7.5                    igraph_1.5.1                spatstat.geom_3.2-7         RcppHNSW_0.5.0             
[141] reshape2_1.4.4              biomaRt_2.58.0              XML_3.99-0.16               evaluate_0.23              
[145] renv_1.0.3                  BiocManager_1.30.22         tzdb_0.4.0                  httpuv_1.6.13              
[149] RANN_2.6.1                  polyclip_1.10-6             future_1.33.0               rsvd_1.0.5                 
[153] xtable_1.8-4                restfulr_0.0.15             RSpectra_0.16-1             later_1.3.2                
[157] ragg_1.2.6                  memoise_2.0.1               GenomicAlignments_1.38.0    cluster_2.1.4              
[161] timechange_0.2.0            globals_0.16.2              here_1.0.1

leonfodoulian commented 6 months ago

Hi @pati-ni

Could you please provide more information regarding when the new features will be implemented and released?

Best, Leon

AmelZulji commented 1 week ago

Hi @pati-ni,

Sorry to bother you with this. None of the suggestions above worked for me, and I can't obtain stable results no matter what I try. I need to make a decision, and it would be great if you could provide an estimate of when the new features will be implemented.

Thank you, and have a nice day.