I found the Permn function in this package, which handles repeated values by only returning unique permutations. However, I looked at the implementation, and it still expands the full N! permutations before deduplicating them, which makes it very slow even in certain cases where the return value is not very large, e.g. Permn(c(rep(1, 10), 2)). I made an implementation of this that runs in time approximately proportional to the size of the output, but it doesn't currently support the bells and whistles of Permn, and the interface and implementation are a bit rough. If you are interested in using this, please let me know and I will clean it up and create a pull request. Or you can adapt the algorithm on your own.
library(dplyr)
all_unique_perms <- function(
x, max_allowed_perms = 1e6,
# This argument is used only for recursion
x_table
) {
if (!missing(x)) {
assert_that(!any(is.na(x)))
if (!missing(x_table)) {
stop("Cannot accept both x and x_table arguments")
}
x_table <- tibble(x=x) %>%
group_by(x) %>%
summarise(count = n(), .groups = "drop") %>%
arrange(desc(count)) %>%
filter(count > 0)
if (is.finite(max_allowed_perms)) {
assert_that(factorial_product_ratio(sum(x_table$count), x_table$count) <= max_allowed_perms)
}
}
if (nrow(x_table) == 0) {
# x is empty
return(matrix(x_table$x, nrow = 0, ncol = 0))
} else if (nrow(x_table) == 1) {
# Base case: x is all identical, so only 1 unique permutation
return(rbind(rep(x_table$x, x_table$count), deparse.level = 0))
} else if (all(x_table$count == 1)) {
# Base case: all values distinct, so just do regular permutation
return(Permn(x_table$x))
} else if (nrow(x_table) == 2) {
# Base case: x has 2 distinct values, so permutation is equivalent to
# finding all unique ways to thread one of the values between the other.
combs <- combn(sum(x_table$count), x_table$count[1])
# Initialize with all 2nd value, then fill in 1st value wherever the
# above combinations say to
result <- matrix(x_table$x[2], nrow = ncol(combs), ncol=sum(x_table$count))
i1 <- cbind(as.vector(col(combs)), as.vector(combs))
result[i1] <- x_table$x[1]
return(result)
} else {
# Recursive case: x has 3 or more distinct values and contains repeated
# values.
FirstVsRest_table <- x_table %>%
group_by(x = x != x[1]) %>%
summarise(count = sum(count), .groups = "drop")
rest_table <- x_table[-1,]
FirstVsRest_perms <- all_unique_perms(x_table = FirstVsRest_table)
rest_perms <- all_unique_perms(x_table = rest_table)
full_perms <- matrix(
x_table$x[1],
ncol = ncol(FirstVsRest_perms),
nrow = nrow(FirstVsRest_perms) * nrow(rest_perms))
for (FvR_row in seq_len(nrow(FirstVsRest_perms))) {
for (rest_row in seq_len(nrow(rest_perms))) {
dest_row <- (FvR_row-1) * nrow(rest_perms) + rest_row
full_perms[dest_row, FirstVsRest_perms[FvR_row,]] <- rest_perms[rest_row,]
}
}
return(full_perms)
}
}
I found the
Permn
function in this package, which handles repeated values by only returning unique permutations. However, I looked at the implementation, and it still expands the full N! permutations before deduplicating them, which makes it very slow even in certain cases where the return value is not very large, e.g.Permn(c(rep(1, 10), 2))
. I made an implementation of this that runs in time approximately proportional to the size of the output, but it doesn't currently support the bells and whistles ofPermn
, and the interface and implementation are a bit rough. If you are interested in using this, please let me know and I will clean it up and create a pull request. Or you can adapt the algorithm on your own.