mhahsler / dbscan

Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package
GNU General Public License v3.0

R session aborted in pointdensity() #54

Closed soelderer closed 10 months ago

soelderer commented 1 year ago

Hey!

I'm using tidySEM to visualise structural equation models. The package calls dbscan::pointdensity() at some point.

In one specific case, the call to pointdensity() crashes R with an "R session aborted" message.

I can consistently reproduce this on my machine with the following MWE:

library(dbscan)

tmp <- structure(list(x = c(5, 6, 7, 9, 10, 11.25, 5.05, 6.3, 7, 9, 
9.7, 10.95, 3, 3, 3.3, 3.3, 3, 3, 13, 13, 12.7, 12.7, 13, 13, 
8, 8, 5.95, 10.05, 8, 6, 9.75, 5.75, 10, 8, NaN), y = c(16.05, 
16.05, 16.05, 16.05, 16.05, 15.8, 4, 3.75, 3.95, 3.95, 3.75, 
4, 13.05, 12.05, 11.25, 8.75, 7.95, 6.95, 13.05, 12.05, 11.25, 
8.75, 7.95, 6.95, 11.95, 8.05, 10, 10, 10, 12, 11.75, 7.75, 8, 
10, NaN)), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 30L, 31L, 32L, 33L, 58L, 59L, 60L, 61L, 62L, 63L, 35L
), class = "data.frame")

pointdensity(x = tmp, eps = 5)

I hope this is reproducible.

Best, Paul

mhahsler commented 1 year ago

I am not able to reproduce the issue. Please post the output of sessionInfo() right before pointdensity() is called.

soelderer commented 1 year ago

Hey, this is my sessionInfo():

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS:   /usr/lib/libblas.so.3.11.0 
LAPACK: /usr/lib/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8    LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Vienna
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dbscan_1.1-11

loaded via a namespace (and not attached):
[1] compiler_4.3.0 tools_4.3.0    Rcpp_1.0.10  

Edit: I used debug() in RStudio, and the crash apparently happens inside dbscan_density_int(), at the call .Call(`_dbscan_dbscan_density_int`, data, eps, type, bucketSize, splitRule, approx).

I also checked my systemd logs, and apparently the R session dumped core inside dbscan.so:

systemd-coredump[141140]: [🡕] Process 140242 (rsession) of user 1000 dumped core.

                                                           Stack trace of thread 140242:
                                                           #0  0x00007fd43ee70513 n/a (/home/soelderer/R/x86_64-pc-linux-gnu-library/4.3/dbscan/libs/dbscan.so + 0x50513)
                                                           #1  0x00005603cdbab250 n/a (n/a + 0x0)
                                                           ELF object binary architecture: AMD x86-64

Edit2: Since I dual boot, I also tried it on Windows and ran into the same issue. I freshly installed R, RStudio and dbscan for this test (none of them were installed before).

On Windows, however, the crash didn't happen every time. Sometimes the call apparently worked and I got the following output:

> pointdensity(x = tmp, eps = 5)
[1]  8  9  9  9 10  8  9 10 10 10 10  7 10 10 13 13  9 10 10 10 14 13  9 10 13 11 17 17 15 17 14 12 17 15 10

sessionInfo():

> sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=German_Austria.utf8  LC_CTYPE=German_Austria.utf8    LC_MONETARY=German_Austria.utf8 LC_NUMERIC=C                   
[5] LC_TIME=German_Austria.utf8    

time zone: Europe/Vienna
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dbscan_1.1-11

loaded via a namespace (and not attached):
[1] compiler_4.3.0 tools_4.3.0    Rcpp_1.0.10

mhahsler commented 1 year ago

I can reproduce the issue on Windows. It is caused by the missing values (the NaN entries in the last row of the data).
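
On an affected version, one possible workaround is to drop incomplete rows before the data reaches pointdensity(). The following is only a minimal sketch under that assumption: density_complete_rows() is a hypothetical helper, not part of dbscan, and the small data frame is made up for illustration. It relies on complete.cases(), which flags rows containing NA or NaN (but not Inf).

```r
# Hypothetical helper (not part of dbscan): compute a density only for
# complete rows and return NA for rows that contain NA or NaN.
# `density_fun` is any function that accepts (x, eps = ...), for example
# dbscan::pointdensity.
density_complete_rows <- function(x, eps, density_fun) {
  x <- as.matrix(x)
  ok <- stats::complete.cases(x)   # FALSE for rows containing NA or NaN
  out <- rep(NA_real_, nrow(x))    # incomplete rows keep NA
  out[ok] <- density_fun(x[ok, , drop = FALSE], eps = eps)
  out
}

# Illustrative data with one incomplete row, similar in shape to the MWE.
pts <- data.frame(x = c(5, 6, 7, NaN), y = c(16.05, 16.05, 16.05, NaN))

# Only call into dbscan if it is installed.
if (requireNamespace("dbscan", quietly = TRUE)) {
  print(density_complete_rows(pts, eps = 5, density_fun = dbscan::pointdensity))
}
```

Returning NA for the dropped rows keeps the result aligned with the rows of the input, which matters if a caller indexes the densities by row.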

mhahsler commented 10 months ago

This was fixed in dbscan 1.1-12.
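
To check whether an installed copy predates that release, the version can be compared directly. This is a small sketch: needs_update() is a hypothetical helper name, and the only fact it relies on is the version number 1.1-12 stated above.

```r
# Hypothetical helper: TRUE if the given version string is older than
# dbscan 1.1-12, the release stated above to contain the fix.
needs_update <- function(installed) {
  package_version(installed) < package_version("1.1-12")
}

# Only query the installed version if dbscan is available.
if (requireNamespace("dbscan", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("dbscan"))
  if (needs_update(installed)) {
    message("dbscan ", installed, " predates 1.1-12; consider updating.")
  }
}
```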