Open c-tho opened 1 year ago
Confirmed, that seems to be a bug. Seems to be independent of the chosen distance.
Edit. A bit confusing because I have worked with stringdist on millions of records before.
Edit. The bug is not reliably reproducible. Running the following script with R -f
multiple times sometimes gives a stack imbalance warning, sometimes not.
library(stringdist)
set.seed(1)
n <- 1000
x <- sample(0:9, size=n, replace=TRUE)
y <- sample(0:9, size=n, replace=TRUE)
out <- stringdist(x,y, method="osa", nthread=2)
It does not seem to occur with nthread=1.
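For comparison, here is a minimal sketch of the same script run single-threaded, which is the variant that did not show the warning for me. The variable name out_single is only illustrative, and the length check at the end is a sanity check I added, not part of the original script.

```r
library(stringdist)

set.seed(1)
n <- 1000
x <- sample(0:9, size = n, replace = TRUE)
y <- sample(0:9, size = n, replace = TRUE)

# Same call as above, but restricted to a single thread.
out_single <- stringdist(x, y, method = "osa", nthread = 1)

# Sanity check: one distance per input pair.
stopifnot(length(out_single) == n)
```

Running this repeatedly with R -f, next to the nthread=2 version, should help show whether the warning is tied to the multithreaded code path.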
Edit. As stated in the bug report: this only occurs when stringdist is provided an integer vector. Which is weird, because stringdist does not do anything special there: stringdist casts all input to character before any further processing. Even adding a single "a" to x and y in the above script prevents the warning.
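To separate the input type from the threading, a sketch like the one below converts the integers to character before the call. The names x_chr, y_chr and out_chr are illustrative. I have not verified whether converting up front avoids the stack imbalance with nthread > 1, so the call keeps nthread = 1 and only shows what the character input looks like.

```r
library(stringdist)

set.seed(1)
n <- 1000
x <- sample(0:9, size = n, replace = TRUE)
y <- sample(0:9, size = n, replace = TRUE)

# The sampled vectors are integer, which is the case that triggers the report.
stopifnot(is.integer(x), is.integer(y))

# Explicit conversion, done by the caller instead of inside stringdist.
x_chr <- as.character(x)
y_chr <- as.character(y)

# Single-threaded on purpose, so this sketch does not depend on the bug.
out_chr <- stringdist(x_chr, y_chr, method = "osa", nthread = 1)

# Every input is a single digit, so each distance is 0 (equal) or 1 (different).
stopifnot(all(out_chr %in% c(0, 1)))
```

Changing nthread to 2 here and rerunning a few times would show whether caller-side conversion makes a difference.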
Running over vectors of 100k integers produces stack imbalance warnings at best and aborts the R session at worst:
Attempting three of these (for three date components) within a function aborts the session, with
This problem can be sidestepped by specifying nthread = 1. The default value of getOption("sd_num_thread") for me is 7.
sessionInfo: