gagolews / stringi

Fast and portable character string processing in R (with the Unicode ICU)
https://stringi.gagolewski.com/

Make `stri_sub()<-` pipable #338

Closed BastienFR closed 5 years ago

BastienFR commented 5 years ago

Following a discussion at https://stackoverflow.com/questions/13863599/insert-a-character-at-a-specific-location-in-a-string/45738272#45738272, it would be nice if the `stri_sub()<-` call could be made pipable.

Right now, stri_sub allows inserting a character within a string with:

library(stringi)

vv1 <- "abde"

stri_sub(vv1, 3, 2) <- "C"
vv1
[1] "abCde"

However, this call can't easily be put into a piping workflow. It would be nice if it could.

gagolews commented 5 years ago

I'm afraid (to the best of my knowledge, but I might be wrong) that there's no mechanism in R to allow replacement functions to work that way.

BastienFR commented 5 years ago

Well, it can be done with base R:

library(magrittr)

vv2 <- "abde"
vv2 %>%
  sub("(?<=.{2})", "C", ., perl = TRUE)
[1] "abCde"

So it's not a really needed improvement to stringi; it would just be a nice feature so that we don't have to go into regexp semantics.

gagolews commented 5 years ago

Could you post a self-contained example that you would like me to get up and running?

BastienFR commented 5 years ago

Well, the example in my second comment is as generic as it comes for a reproducible example. However, if you want to see a more "real life" example of what led me to this, here is my actual code:

library(dplyr)

## lots of piped line code to produce this `ll` object:
ll <- list(
  B1W = c("12170348", "12170346", "12170347", "12170349", "12170350"),
  L5P = "35211696",
  S4K = "47060689",
  S4M = "47060694",
  S7B = c("47110592", "47110594"),
  S7C = c("47110583", "47110587", "47110633")
)

## my `ll` object goes into a pipe which does multiple things
## the first mutate is where I pipe a `sub` command and where it would be nice to use `stringi`
ll %>% 
  unlist %>% 
  data.frame(CODEPOST = names(.), DA_2016 = ., stringsAsFactors = FALSE) %>% 
  mutate(CODEPOST = sub("(?<=.{3})", "XX", CODEPOST, perl = TRUE)) %>% 
  mutate(CODEPOST = ifelse(nchar(CODEPOST) == 5, paste0(CODEPOST, 1), CODEPOST))

I hope it helps.

P.S. It was @bartektartanus who suggested that I post an issue on this subject.

yutannihilation commented 5 years ago

Maybe you are not aware of stri_replace_*()?

library(stringi)
library(dplyr, warn.conflicts = FALSE)

## lots of piped line code to produce this `ll` object:
ll <- list(
  B1W = c("12170348", "12170346", "12170347", "12170349", "12170350"),
  L5P = "35211696",
  S4K = "47060689",
  S4M = "47060694",
  S7B = c("47110592", "47110594"),
  S7C = c("47110583", "47110587", "47110633")
)

## my `ll` object goes into a pipe which does multiple things
## the first mutate is where I pipe a `sub` command and where it would be nice to use `stringi`
ll %>% 
  unlist %>% 
  data.frame(CODEPOST = names(.), DA_2016 = ., stringsAsFactors = FALSE) %>% 
  mutate(CODEPOST = stri_replace_first_regex(CODEPOST, "(?<=.{3})", "XX")) %>% 
  mutate(CODEPOST = ifelse(nchar(CODEPOST) == 5, paste0(CODEPOST, 1), CODEPOST))
#>    CODEPOST  DA_2016
#> 1    B1WXX1 12170348
#> 2    B1WXX2 12170346
#> 3    B1WXX3 12170347
#> 4    B1WXX4 12170349
#> 5    B1WXX5 12170350
#> 6    L5PXX1 35211696
#> 7    S4KXX1 47060689
#> 8    S4MXX1 47060694
#> 9    S7BXX1 47110592
#> 10   S7BXX2 47110594
#> 11   S7CXX1 47110583
#> 12   S7CXX2 47110587
#> 13   S7CXX3 47110633

Created on 2019-01-14 by the reprex package (v0.2.1)

BastienFR commented 5 years ago

Interesting, but I don't see the gain over sub in that case. You still have to use a regular expression.

yutannihilation commented 5 years ago

Ah, sorry, I misunderstood your intention. I just wanted to show that stringi can do what base R can do.

yutannihilation commented 5 years ago

Is this what you want?

vv1 <- "abde"
vv1 %>%
  stri_sub_replace(3, 2, value = "C")

@gagolews I made a quick PR (sorry, I don't have time to build and test this for now). I think we can just create a renamed version of stri_sub<-, though I don't feel this is very useful in itself. Do you feel this is worth implementing?

https://github.com/gagolews/stringi/pull/339/files

BastienFR commented 5 years ago

@yutannihilation, yes, this seems perfect and simple. Whether it's worth implementing officially is your call. It's not like we lack other options, but it would make things simpler for sure.

yutannihilation commented 5 years ago

Thanks.

Sorry, I forgot that we can use replacement functions in this way:

library(stringi)
library(magrittr)

vv1 <- "abde"
vv1 %>%
   `stri_sub<-`(3, 2, value = "C")
#> [1] "abCde"

Created on 2019-01-15 by the reprex package (v0.2.1)
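
The backtick form works because R treats a replacement call as shorthand: the replacement function is called with the new content passed as its value argument, and the result is assigned back to the variable. The function itself simply returns the modified string. A minimal sketch comparing the two forms (the variable names a, b and b_new are only for illustration):

```r
library(stringi)

# Replacement-call syntax: the result is assigned back to `a`.
a <- "abde"
stri_sub(a, 3, 2) <- "C"

# Direct call to the underlying replacement function: it returns the
# modified string and leaves its input untouched.
b <- "abde"
b_new <- `stri_sub<-`(b, 3, 2, value = "C")

identical(a, b_new)  # TRUE: both are "abCde"
b                    # still "abde"
```

Since the direct call has no side effect on its first argument, it can sit on the right-hand side of a pipe like any other function.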

Aliasing the function might make the code more readable.

stri_sub_replace <- `stri_sub<-`
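
Such an alias also applies to a whole character vector, which is the shape of the CODEPOST column in the earlier example. A small sketch, assuming a hypothetical local alias named sub_replace (the name is an assumption, chosen so it cannot clash with anything stringi exports) and a couple of sample codes shaped like those in the thread:

```r
library(stringi)
library(magrittr)

# Hypothetical local alias for the replacement function.
sub_replace <- `stri_sub<-`

# Sample codes for illustration only.
codes <- c("B1W1", "S7C3")

# from = 4, to = 3 selects the empty range just before the 4th character,
# so the value is inserted there without writing a regular expression.
res <- codes %>% sub_replace(4, 3, value = "XX")
res
```

Inside a dplyr pipeline the same call could presumably be placed in a mutate() step in place of the sub() call, though that is not tested here.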

gagolews commented 5 years ago

Yep, sure, I can add an alias, thanks, guys!

yutannihilation commented 5 years ago

Thanks!