Closed Mullefa closed 9 years ago
I agree
How about introducing a utility function, say string_fill, à la the example below (function definition first, then a small example)? Not sure if this should be default behavior and/or optional though. Will think some more. Also, the function name and description may not be optimal...
#' Fill String with Named Values
#'
#' This function substitutes name placeholders with the values found in the
#' specified environment. Placeholders are names enclosed in backticks, e.g.
#' `name`. It is possible to specify formats as with \code{\link{sprintf}}, e.g.
#' `value%.2f`; here the name is \code{value} and the format is \code{%.2f}.
#'
#' @details It is not possible to use e.g. `name[1]` or other forms of indexing.
#'
#' @param string A character string with name placeholders.
#' @param env An environment where to search for the values.
#'
#' @return A character string
string_fill <- function(string, env = parent.frame())
{
  # Look for names enclosed by backticks, e.g. `name`. Ignore doubles, ``.
  pattern <- "(?<!`)`(?!`)(.*?)((?<!`)`(?!`))"
  match_data <- gregexpr(pattern, string, perl = TRUE)
  matches <- regmatches(string, match_data)[[1]]

  # Nothing to fill in: return the string unchanged.
  if (length(matches) == 0) return(string)

  # Extract the names (drop the backticks and any trailing format).
  var_names <- gsub("(%.*)|`", "", matches)

  # Fetch the values from the given environment.
  values <- lapply(var_names, get, envir = env, inherits = TRUE)

  # Extract formats; default to %s if not given.
  formats <- vapply(strsplit(gsub("`", "", matches), "%"),
                    function(m) if (length(m) == 1) "s" else m[length(m)],
                    character(1))

  # Create the replacements from the values and formats.
  replacements <- mapply(sprintf, paste0("%", formats), values)

  # Perform the replacement.
  regmatches(string, match_data) <- list(replacements)
  return(string)
}
user_name <- "Stefan"
account <- 1337
amount <- 66.6
msg <- string_fill("User `user_name` has account `account%d` with $`amount%.2f`.")
print(msg)
[1] "User Stefan has account 1337 with $66.60."
Hey,
Personally I would love it if string interpolation came as default, the reasoning being that logging is useful (so should be encouraged), but boring (so should be made as easy as possible).
Perhaps the log functions could have an argument specifying whether string interpolation should occur with default set to TRUE
e.g.
log_info(msg, interp_string = TRUE, env = parent.frame())
I've been having a quick play around with a possible function too. I've used ${} to enclose variable references (taking inspiration from Scala and Ruby). It supports evaluating the variable references but not formatting, so maybe some happy marriage of the two? (It's also probably very buggy ;))
# function
string_interp <- function(string, env = parent.frame()) {
  # get variable references
  idxs <- gregexpr("\\$\\{.*?\\}", string)
  matches <- regmatches(string, idxs)[[1]]
  var_refs <- gsub("[\\$\\{\\}]", "", matches)

  # get strings of variable references
  exprs <- parse(text = var_refs)
  out <- lapply(exprs, eval, envir = env)
  var_strings <- vapply(out, as.character, character(1))

  # replace references with strings, one match at a time
  for (var_string in var_strings) {
    idx <- regexpr("\\$\\{.*?\\}", string)
    regmatches(string, idx) <- var_string
  }
  string
}
# examples
# 1.
user_name <- "Alice"
question <- "doing today"
string_interp("hello ${user_name} how are you ${question}?")
# 2.
string_interp("2 + 2 = ${2 + 2}")
# 3.
github_users <- c("smbache", "Mullefa")
string_interp("The first github user in the list is: ${github_users[1]}")
Indeed.. Not sure what a "nice" syntax that includes formatting would be; however, I would very much like that to be possible!
Maybe
string_interp("A random number is $.2f{rnorm(1)}.")
...
For me, if the formatter was optional, something like that would be absolutely perfect, although I'd definitely try and get the views of some R users with a bit more experience too.
As an aside, something like this would make logging Shiny applications incredibly easy :D
log_info("user selected ${input$choice}")
(although this doesn't currently work due to gsub() replacing all the $)
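One possible way around the $ problem is to strip only the leading "${" and the trailing "}" from each match instead of removing every $, { and } with gsub(). The following is a minimal sketch under that assumption; the name string_interp2 is made up for illustration and this is not the implementation that was eventually merged.

```r
# Hypothetical variant of string_interp(): only the delimiters are stripped,
# so a "$" inside the expression (e.g. input$choice) survives.
string_interp2 <- function(string, env = parent.frame()) {
  force(env)
  idxs <- gregexpr("\\$\\{.*?\\}", string, perl = TRUE)
  matches <- regmatches(string, idxs)[[1]]

  # Drop the first two characters ("${") and the last one ("}").
  var_refs <- substr(matches, 3, nchar(matches) - 1)

  # Evaluate each reference in env; each must yield a single value.
  values <- vapply(var_refs, function(ref) {
    as.character(eval(parse(text = ref), envir = env))
  }, character(1), USE.NAMES = FALSE)

  regmatches(string, idxs) <- list(values)
  string
}

# A plain list stands in for Shiny's input object here.
input <- list(choice = "blue")
string_interp2("user selected ${input$choice}")
```

Note that the non-greedy pattern still stops at the first "}", so expressions that themselves contain braces are not handled by this sketch.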
Also, considering the general utility of the function, I would maybe go for a name such as s(), again inspired by Scala's syntax which I think is neat:
// Scala
val name = "James"
println(s"Hello $name")
# R
name = "James"
cat(s("Hello ${name}\n"))
Can I suggest that this behaviour would be enormously useful in general, not just in logging. So it might make sense to make a separate package for the interpolation that nails that (especially scoping, which might be hard to get right). Then in loggr make a formatter function that uses the interpolation.
@richfitz on the one hand it would; on the other, a package for one function seems excessive? But maybe.. Could call it stringterpolatr ;)
Why would scoping be hard? Can you come up with an example?
There's already whisker, which basically has one function (whisker.render), and a super lightweight string interpolation package would be in direct competition with that.
Perhaps it's not hard. This stuff does my head in though, when moving from the global environment to environments of functions or functions within functions etc.
If the string "Hello ${x}\n" is passed into log_info, we need to make sure that the 'x' that is found is in the appropriate environment (parent.frame() most likely). However, we'd not be able to access that from the formatter easily, as all log_info does is raise a signal.
Perhaps if the event captures the correct frame it'd all be easy. That might not be a terrible thing to have access to anyway.
I wouldn't imagine scoping would be too hard (especially for the guy who wrote %>% ;)). For illustrative purposes, say the final function has signature s(string, env = parent.frame()); then e.g. log_info() would look something like:
log_info <- function(msg, interp_string = TRUE, env = parent.frame()) {
  # Interpolate in the caller's frame unless the caller opts out.
  if (interp_string) {
    msg <- s(msg, env)
  }
  event <- log_event("INFO", msg)
  invisible(signalCondition(event))
}
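To make the scoping question concrete, here is a small self-contained sketch. Both functions are stand-ins invented for illustration: s() is a deliberately minimal interpolator (no formatting, no nested braces), and log_info_sketch() simply returns the message instead of signalling a condition. The point is only that env = parent.frame() hands the caller's frame down to the interpolator.

```r
# Hypothetical minimal interpolator standing in for s().
s <- function(string, env = parent.frame()) {
  force(env)
  idxs <- gregexpr("\\$\\{[^}]*\\}", string, perl = TRUE)
  refs <- regmatches(string, idxs)[[1]]
  vals <- vapply(refs, function(ref) {
    expr <- substr(ref, 3, nchar(ref) - 1)  # strip "${" and "}"
    as.character(eval(parse(text = expr), envir = env))
  }, character(1), USE.NAMES = FALSE)
  regmatches(string, idxs) <- list(vals)
  string
}

# Hypothetical stand-in for log_info(): forwards the caller's frame to s().
log_info_sketch <- function(msg, interp_string = TRUE, env = parent.frame()) {
  force(env)
  if (interp_string) msg <- s(msg, env)
  msg
}

x <- "global"
f <- function() {
  x <- "local"
  log_info_sketch("x is ${x}")
}
f()  # the x defined inside f() is the one that is found
```

Because the default for env is evaluated as parent.frame() of log_info_sketch(), the lookup starts in the frame of whoever called the logging function, and then follows the usual enclosing-environment chain.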
Regarding whisker.render(), a couple of points that count against it:
whisker.render("{{2 + 2}}")
@richfitz is your whisker argument for or against a separate package? Would the aim/scope be identical here?
I'm not saying to use whisker here - I'm saying whisker exists as a package of essentially one function, that sees wide use (so: support of a separate package).
Is there a unary operator one could use, e.g. -"We want ${wish}!"? Then one could have both a function of the string and env, and the operator would then call this with parent.frame().
The (allegedly) complete list is here, though it doesn't include := so might be missing something.
Can "~" be bent to this purpose perhaps? E.g. log_info(~"my ${string}")
Looks like it: proof of concept (using whisker for the rendering but that would be swapped out for something nicer)
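f <- function(x) {
  # x is a one-sided formula: x[[2]] is the template string, and the
  # formula's environment is where the placeholders are looked up.
  whisker::whisker.render(x[[2]], attr(x, ".Environment"))
}
string <- "x"
f(~"my {{string}}") # "my x"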
In a function, showing scoping:
g <- function(x) {
  y <- x
  f(~"my {{y}}")
}
g("foo") # "my foo"
It would only work if log_info made it work, e.g. this would never work: cat(~"my ${string}"). But the function string_interp (or whatever) could exist (in a package or in loggr) and then log_info could dispatch on formula vs character.
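The dispatch idea can be sketched roughly as follows. Everything here is an assumption made for illustration: interp() is a toy ${...} interpolator, and log_info_sketch() returns the message rather than raising a loggr event. A one-sided formula carries the environment it was created in, so no env argument is needed.

```r
# Hypothetical toy interpolator for ${...} placeholders (no nested braces).
interp <- function(string, env) {
  idxs <- gregexpr("\\$\\{[^}]*\\}", string, perl = TRUE)
  refs <- regmatches(string, idxs)[[1]]
  vals <- vapply(refs, function(ref) {
    expr <- substr(ref, 3, nchar(ref) - 1)  # strip "${" and "}"
    as.character(eval(parse(text = expr), envir = env))
  }, character(1), USE.NAMES = FALSE)
  regmatches(string, idxs) <- list(vals)
  string
}

# Hypothetical stand-in for log_info(): formulas are interpolated,
# plain character strings pass through untouched.
log_info_sketch <- function(msg) {
  if (inherits(msg, "formula")) {
    # For ~"text", element 2 is the string; environment() is where ~ was called.
    msg <- interp(msg[[2]], environment(msg))
  }
  msg
}

g <- function(y) {
  log_info_sketch(~"my ${y}")
}
g("foo")                    # interpolated in g()'s frame
log_info_sketch("my ${y}")  # plain string: returned as is
```

The appeal of this design is that the function acts as the identity on ordinary strings, so existing calls keep working and interpolation is opt-in per call.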
:+1: for the suggestion which makes the function the identity for regular strings (if indeed that is what you were getting at). The benefit being that it simplifies the signature of functions such as log_info(), which would allow for string interpolation, but not assume it:
x <- "world"
log_info("hello world")
log_info(~"hello ${x}")
I had some code all ready to show you, but then I forgot it on my home computer. It allows:
user <- "smbache"
number <- 1.1234
string_interp("Some nasty string by ${user}, including nested (but OK) '}' and allowing numbers like $[.2f]{2*{number}} to be formatted")
[1] "Some nasty string by smbache, including nested (but OK) '}' and allowing numbers like 1.12 to be formatted"
The example above is really meant to show that certain potential issues have been taken into account. I will wrap it up at some point and you can see what you think of it.
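For readers who want to experiment with the $[format]{expression} syntax before the real code is posted, here is a rough sketch. It is not the implementation referred to above: the name interp_fmt is invented, the optional format is simply handed to sprintf(), and nested braces are deliberately not supported.

```r
# Hypothetical sketch of "$[fmt]{expr}" interpolation; "${expr}" defaults to %s.
interp_fmt <- function(string, env = parent.frame()) {
  force(env)
  pattern <- "\\$(\\[[^]]*\\])?\\{[^}]*\\}"
  idxs <- gregexpr(pattern, string, perl = TRUE)
  matches <- regmatches(string, idxs)[[1]]

  vals <- vapply(matches, function(m) {
    # Pull out the format between "[" and "]", if one was given.
    has_fmt <- grepl("^\\$\\[", m, perl = TRUE)
    fmt <- if (has_fmt) {
      sub("^\\$\\[([^]]*)\\].*$", "\\1", m, perl = TRUE)
    } else {
      "s"
    }
    # Pull out the expression between "{" and "}".
    expr <- sub("^\\$(\\[[^]]*\\])?\\{(.*)\\}$", "\\2", m, perl = TRUE)
    sprintf(paste0("%", fmt), eval(parse(text = expr), envir = env))
  }, character(1), USE.NAMES = FALSE)

  regmatches(string, idxs) <- list(vals)
  string
}

user <- "smbache"
number <- 1.1234
interp_fmt("By ${user}: $[.2f]{number}")
```

Handling a literal "}" inside the expression, as in the example above, needs a brace-matching scan rather than a single regular expression, which this sketch leaves out.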
Excellent work sir - I'd say you've now got two functions that could feasibly be in base R ;)
Such is the use of this function (at work especially I would use it a hell of a lot), would you consider putting it in its own package? Or maybe doing a merge with @hadley's stringr if he thought it was suitable?
I'm open to any of those options.
I'd be happy to include in stringr
@hadley, I'll make a gist later with some code and you can see if it is acceptable in its general lines; if so, I can make a PR for stringr.
only downside: stringterpolatr was a pretty cool pkg name
So, I created a gist: https://gist.github.com/smbache/7253e88adb540a4f8007
Here, you'll see an example implementation, and examples at the bottom! I think it is pretty nice, although I'm not aware if there is an (even more ;)) clever way of matching. Anyways, @hadley let me know if you "approve" of the approach in terms of stringr; if not we can (a) change it, (b) go with a separate pkg. Otherwise I can PR ;)
Take into consideration:
${ which must be the start of a placeholder.
stringterpolate because it is fun and I didn't have a good name. Inputs as to what would be best? I think I would prefer a short one for this kind of function... Although consuming functions, e.g. log_info, could use e.g. ~string arguments and provide the end user with a very short-hand interface.
Naming is always contentious - I really like s(), but this might not be everyone's bag. I'd suggest we try and get some input from some of the regular R developers on github. @hadley what do you think?
s is used for smoothing splines in many model interfaces...
I like it!
Looks nice. What about using % instead of $, as it would be reminiscent of formatting functions in other languages?
@hadley what would be your preference for name and symbol? It's your package ;)
I don't have particularly strong feelings either way
What about strinter?
This package (not on CRAN yet) has similar functionality, inspired by ruby's string interpolation
user_name <- "World"
productivus::pp("Hello #{user_name}!")
# [1] "Hello World!"
Except that I didn't support recursive parsing; the following will fail:
x <- 1
productivus::pp("x is #{productivus::pp('#{x}!')}")
I can expand to include this functionality with a slightly more careful parser if there is interest (not that the current implementation is the most efficient). smbache's implementation seems to cover these cases, and looks slightly cleaner (although it also looks slightly like PHP!)
One argument for $ is that one can think of it as indexing env, as in env$user_name... But then allowing more complex expressions than just names.
:+1:
Merged here; we'll leave this open and it seems this will be incorporated...
I've included this temporarily in loggr (until we can import from stringr). In the latest push you can interpolate, see also example in this thread: https://github.com/smbache/loggr/issues/17
OK. Cheers for the update.
The interpolation is now included in loggr until it is in a stable stringr release.
Maybe this needs consideration outside the development of loggr, but it would be great if the logging functions supported string interpolation, e.g. or failing that: