thomasp85 / lime

Local Interpretable Model-Agnostic Explanations (R port of original Python package)
https://lime.data-imaginist.com/

Permutations when large data frame #153

Closed muschellij2 closed 5 years ago

muschellij2 commented 5 years ago

In https://github.com/thomasp85/lime/blob/ca363ec02b511cd7c67bfd99533f3ec3e3538ce7/R/permute_cases.R#L11, if the data to explain has more than roughly 858,993 records, sample.int throws an error, since 858993 * 5000 ≈ 2^32 (with 5000 being the default number of permutations); the actual limit may be even lower than that. It might be worth documenting that data sets this large aren't supported.

library(lime)
library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2
n = 600000  # 600,000 rows: explain() will request 600000 * 5000 permuted samples
x = rnorm(n)
y = (x^2 + 2) > 4

df = data.frame(y = factor(y * 1), x = x)
train_df = df[1:1000, ]
model = train(y ~ x, data = train_df, method = "knn")
explainer = lime(x = df, model)
xx = explain(df, explainer, n_labels = 2, n_features = 5)
#> Warning in sample.int(length(x), size, replace, prob): NAs introduced by
#> coercion to integer range
#> Error in sample.int(length(x), size, replace, prob): invalid 'size' argument
library(lime)
library(caret)
n = 20000  # 20,000 rows: the same call succeeds
x = rnorm(n)
y = (x^2 + 2) > 4

df = data.frame(y = factor(y * 1), x = x)
train_df = df[1:1000, ]
model = train(y ~ x, data = train_df, method = "knn")
explainer = lime(x = df, model)
xx = explain(df, explainer, n_labels = 2, n_features = 5)

Created on 2019-04-15 by the reprex package (v0.2.1)
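
The same limit can be reproduced without lime or caret at all. The snippet below is only a sketch of base R's behaviour, not lime's permutation code (the sample(letters, ...) call is illustrative):

# Illustration only: sample() coerces its `size` argument to integer, so any
# request above .Machine$integer.max (2^31 - 1) fails.
size <- 600000 * 5000        # rows to explain times lime's default n_permutations
size > .Machine$integer.max  # TRUE: ~3e9 does not fit in a 32-bit integer
sample(letters, size, replace = TRUE)
# expected: warning "NAs introduced by coercion to integer range", then the
# error "invalid 'size' argument", exactly as in the reprex above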

thomasp85 commented 5 years ago

Yes... even if you convinced R to work with this amount of data, it would take forever, as you'd have to train north of a million models. If you insist on creating explanations for all your observations, then run it in chunks, possibly parallelising the work.
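
A minimal sketch of that chunking idea, assuming the df and explainer objects from the reprex above; the chunk size and the use of plain lapply() are illustrative choices, not part of lime:

# Sketch of the chunked approach suggested above (assumes `df` and `explainer`
# from the reprex; the chunk size is arbitrary). Each explain() call then only
# needs chunk_size * n_permutations samples, well below the integer limit.
library(lime)

chunk_size <- 1000
chunks <- split(df, ceiling(seq_len(nrow(df)) / chunk_size))

explanations <- lapply(chunks, function(chunk) {
  explain(chunk, explainer, n_labels = 2, n_features = 5)
})

# explain() returns a data.frame, so the per-chunk results can be combined
all_explanations <- do.call(rbind, explanations)

# For parallel execution, lapply() could be swapped for parallel::mclapply()
# or a similar backend, at the cost of higher memory use.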