mpadge / spatialcluster

spatially-constrained clustering in R
https://mpadge.github.io/spatialcluster/

C++ error #22

Closed Nowosad closed 3 years ago

Nowosad commented 4 years ago

Hi @mpadge, I've been trying to use spatialcluster on a medium-size dataset (a few thousand points). In my case, it makes sense to have a large number of clusters (hundreds). However, I tested the code on two computers, and R crashes when I set ncl to more than ~20. See an artificial example below:

n <- 1000
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n ^ 2), ncol = n)

library (spatialcluster)
# works
scl <- scl_redcap (xy, dmat, ncl = 8, linkage = "single")
plot (scl)

# R crashes (both inside and outside of RStudio)
scl2 <- scl_redcap (xy, dmat, ncl = 50, linkage = "single")
plot (scl2)

mpadge commented 4 years ago

Hey Jakub, thanks for the ping. This repo looks like it might've been abandoned, but far from it. It was part of a now-completed project, but I've been meaning to get it finished and on CRAN for a good part of this year. Your issue provides good motivation to get going once again, so I shall dig in asap and get back to you. Thanks!

mpadge commented 3 years ago

Yo @Nowosad, I'm sure I'm not the only one getting around to responding to issues over a year later in these extraordinary times. With due and understandable apologies, I'm finally back on to getting this package happening. I was initially able to repeat your error, then had to immediately make some minor changes because of the breaking updates to tibble ... and now I can no longer recreate the error. Getting 50 clusters takes a long time, but it worked for me at least 10 times without failing. Can you maybe just check, please? And ensure that you call set.seed() first, to aid reproducibility? Thanks!

Nowosad commented 3 years ago

Hi Mark, no problem. I hope this year was not very tough on you.

Regarding the issue - the crash still happens for the second example:

# remotes::install_github("mpadge/spatialcluster")
library(spatialcluster)

set.seed(2020-12-14)
n <- 1000
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n ^ 2), ncol = n)

# works
scl <- scl_redcap (xy, dmat, ncl = 8, linkage = "single")
plot (scl)

# R crashes (both inside and outside of RStudio)
scl2 <- scl_redcap (xy, dmat, ncl = 50, linkage = "single")
plot (scl2)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.0.3 (2020-10-10)
#>  os       Fedora 32 (Thirty Two)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Warsaw
#>  date     2020-12-14
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.2)
#>  callr         3.5.1      2020-10-13 [1] CRAN (R 4.0.2)
#>  cli           2.2.0      2020-11-20 [1] CRAN (R 4.0.3)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.2)
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.2)
#>  devtools      2.3.2      2020-09-18 [1] CRAN (R 4.0.3)
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.0.3)
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 4.0.2)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.2)
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.2)
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.0.2)
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.2)
#>  htmltools     0.5.0      2020-06-16 [1] CRAN (R 4.0.2)
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.3)
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 4.0.2)
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.0.3)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.2)
#>  pkgbuild      1.1.0      2020-07-13 [1] CRAN (R 4.0.2)
#>  pkgload       1.1.0      2020-05-29 [1] CRAN (R 4.0.2)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.2)
#>  processx      3.4.5      2020-11-30 [1] CRAN (R 4.0.3)
#>  ps            1.5.0      2020-12-05 [1] CRAN (R 4.0.3)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.0.2)
#>  R6            2.5.0      2020-10-28 [1] CRAN (R 4.0.3)
#>  remotes       2.2.0      2020-07-21 [1] CRAN (R 4.0.2)
#>  rlang         0.4.9      2020-11-26 [1] CRAN (R 4.0.3)
#>  rmarkdown     2.5        2020-10-21 [1] CRAN (R 4.0.2)
#>  rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.0.3)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.2)
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.2)
#>  testthat      3.0.0      2020-10-31 [1] CRAN (R 4.0.3)
#>  usethis       1.9.0.9000 2020-11-04 [1] Github (r-lib/usethis@330527b)
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)
#>  xfun          0.19       2020-10-30 [1] CRAN (R 4.0.3)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.2)
#>
#> [1] /home/jn/R/x86_64-redhat-linux-gnu-library/4.0
#> [2] /usr/lib64/R/library
#> [3] /usr/share/R/library
```

mpadge commented 3 years ago

@Nowosad So I can't reproduce that, and the previous commit added your exact code in tests/ for the GitHub runners, and they all passed too, on both Linux and Windows. Given that, and presuming that you've got better things to do than dig any further for now, how about this for a suggestion: I have finally re-started this package, and am striving to get it on to CRAN as soon as possible, definitely sometime in Jan 21. Once there, hopefully others might find analogous bugs that are easier to diagnose and fix. Presuming that, how about we close this for now, and revisit later if needed? Feel free to close if you agree. Thanks for the input!

Nowosad commented 3 years ago

Hi Mark - yes, sure, we can close this issue for now.