gagolews / stringi

Fast and portable character string processing in R (with the Unicode ICU)
https://stringi.gagolewski.com/

stri_replace_all_fixed works iteratively / overlapping/cycling maps with vectorize_all=FALSE #349

Closed jakob-r closed 5 years ago

jakob-r commented 5 years ago

Consider the following example:

stringi::stri_replace_all_fixed(c("A", "B", "A"), c("A", "B"), c("B", "C"), vectorize_all = FALSE)

The output is:

[1] "C" "C" "C"

Instead I would expect

[1] "B" "C" "B"

Is this intended or a bug?

gagolews commented 5 years ago

Nah, this behavior is documented in the package manual:

[...] this is equivalent to something like ‘for (i in 1:npatterns) str <- stri_replace_all(str, pattern[i], replacement[i])’. [...]
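
To spell out what that loop does for the example above, here is a minimal sketch (the variable names mirror the quoted snippet; the printed intermediate values are only there to trace the steps):

```r
library(stringi)

str <- c("A", "B", "A")
pattern <- c("A", "B")
replacement <- c("B", "C")

# Apply the pattern/replacement pairs one after another,
# each time working on the result of the previous step.
for (i in seq_along(pattern)) {
  str <- stri_replace_all_fixed(str, pattern[i], replacement[i])
  print(str)
}
# after i = 1 (A -> B): "B" "B" "B"
# after i = 2 (B -> C): "C" "C" "C"
```

The second pass cannot tell an original "B" from a "B" produced by the first pass, which is why every element ends up as "C".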

Recently, a similar issue was raised regarding stri_trans_char and overlapping maps; see #343.

For single-character maps like yours, stri_trans_char translates every character in one pass, so in your example you'd have:

> stringi::stri_trans_char(c("A", "B", "A"), "AB", "BC")
[1] "B" "C" "B"

Reproducing this one-pass behavior in stri_replace_all could be error-prone, I think: the patterns might be of different lengths and their matches might overlap -- what then?
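
For illustration only, one way to get a one-pass replacement for fixed patterns is sketched below in base R. `replace_simultaneously` is a hypothetical helper, not part of stringi, and it has to pick a policy for exactly the ambiguous cases mentioned above: here, assumed arbitrarily, longer patterns win at the same position and matches never overlap. It also assumes that no pattern contains the sequence `\E`, since patterns are quoted with `\Q...\E`.

```r
# Hypothetical helper (not a stringi function): replace all fixed patterns
# in a single left-to-right pass, so replaced text is never matched again.
replace_simultaneously <- function(str, pattern, replacement) {
  stopifnot(length(pattern) == length(replacement))
  lookup <- setNames(replacement, pattern)

  # Assumed policy: try longer patterns first, so "AB" beats "A".
  ord <- order(nchar(pattern), decreasing = TRUE)

  # Quote each pattern literally and join them into one alternation.
  rx <- paste0("\\Q", pattern[ord], "\\E", collapse = "|")

  # Find all non-overlapping matches, then swap each for its replacement.
  m <- gregexpr(rx, str, perl = TRUE)
  regmatches(str, m) <- lapply(regmatches(str, m),
                               function(hit) unname(lookup[hit]))
  str
}

replace_simultaneously(c("A", "B", "A"), c("A", "B"), c("B", "C"))
# [1] "B" "C" "B"

# Patterns of different lengths: the longest-first policy decides the result.
replace_simultaneously("AB", c("A", "AB"), c("1", "2"))
# [1] "2"
```

A different policy (for example, first-listed pattern wins) would give "1B" for the last call, which is the kind of ambiguity a built-in implementation would have to settle.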