PolMine / duplicates


turn minimize_vocabulary() into charfilter() #6

Closed ablaette closed 1 year ago

ablaette commented 1 year ago

Make the function more generic and independent of polmineR: it should take a character vector as input and drop every character that is not in a set of allowed characters.

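charfilter <- function(x, chars){
  # x:     character vector to filter
  # chars: character vector of the single characters to keep
  # Split each string into single characters, keep only those found in
  # `chars`, and paste the survivors back together. vapply() guarantees a
  # character vector, also for zero-length input.
  vapply(
    strsplit(x, ""),
    function(s) paste(s[s %in% chars], collapse = ""),
    character(1)
  )
}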
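
A short usage sketch for the function proposed above. The whitelist (lowercase ASCII letters plus the space character) and the sample strings are assumptions chosen for illustration; they are not taken from the package. The definition is repeated so the example runs on its own.

```r
# Definition repeated from the proposal above so the example is self-contained.
charfilter <- function(x, chars){
  vocab <- lapply(
    strsplit(x, ""),
    function(x) paste(ifelse(x %in% chars, x, ""), collapse = "")
  )
  unlist(vocab, recursive = FALSE)
}

# Hypothetical whitelist: lowercase ASCII letters and the space character.
allowed <- c(letters, " ")

# Punctuation, digits and uppercase letters are dropped; everything else is kept.
result <- charfilter(c("hello, world!", "R2-D2 says hi"), chars = allowed)
print(result)
# [1] "hello world" " says hi"
```

Because the filter works on a plain character vector and an explicit set of allowed characters, it needs nothing from polmineR.
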
ablaette commented 1 year ago

Concerning naming: CharFilter is a common name in Spark and Django.