smbache / loggr

Easy and flexible logging for R

Easier message generation #13

Closed Mullefa closed 9 years ago

Mullefa commented 9 years ago

Maybe this needs consideration outside the development of loggr, but it would be great if the logging functions supported string interpolation e.g.

...
user_name = "Alice"
log_info("User $user_name has logged in")

or failing that:

...
user_name = "Alice"
log_info("User $user_name has logged in", user_name = user_name)
smbache commented 9 years ago

I agree

smbache commented 9 years ago

How about introducing a utility function, say string_fill, à la the example below (function definition first, then a small example)? I'm not sure whether this should be the default behavior and/or optional, though. Will think some more. Also, the function name and description may not be optimal...

#' Fill String with Named Values
#'
#' This function substitutes name placeholders with the values found in the
#' specified environment. Placeholders are names enclosed in backticks, e.g. 
#' `name`. It is possible to specify formats as with \code{\link{sprintf}}, e.g. 
#' `value%.2f`; here the name is \code{value} and the format is \code{%.2f}.
#'
#' @details It is not possible to use e.g. `name[1]` or other forms of indexing.
#'   
#' @param string A character string with name placeholders.
#' @param env An environment where to search for the values.
#'
#' @return A character string
string_fill <- function(string, env = parent.frame())
{
  # Look for names enclosed by backticks, e.g. `name`. Ignore doubles, ``.
  pattern    <- "(?<!`)`(?!`)(.*?)((?<!`)`(?!`))"
  match_data <- gregexpr(pattern, string, perl = TRUE)
  matches    <- regmatches(string, match_data)[[1]]

  # Extract the names (strip the backticks and any trailing format).
  var_names  <- gsub("(%.*)|`", "", matches)

  # Fetch the values from the given environment.
  values     <- lapply(var_names, get, 
                       envir = env, inherits = TRUE)

  # Extract formats; default to %s if not given.
  formats    <- vapply(strsplit(gsub("`", "", matches), "%"), 
                       function(m) if (length(m) == 1) "s" else m[length(m)], 
                       character(1))

  # Create the replacements from the values and formats
  replacements <- mapply(sprintf, paste0("%", formats), values)

  # Perform the replacement
  regmatches(string, match_data) <- list(replacements)

  return(string)
}

user_name <- "Stefan"
account   <- 1337
amount    <- 66.6

msg <- 
  string_fill("User `user_name` has account `account%d` with $`amount%.2f`.")

print(msg)
[1] "User Stefan has account 1337 with $66.60."
Mullefa commented 9 years ago

Hey,

Personally I would love it if string interpolation came as default, the reasoning being that logging is useful (so should be encouraged), but boring (so should be made as easy as possible).

Perhaps the log functions could have an argument specifying whether string interpolation should occur with default set to TRUE e.g.

log_info(msg, interp_string = TRUE, env = parent.frame())

I've been having a quick play around with a possible function too. I've used ${...} to enclose variable references (taking inspiration from Scala and Ruby). It supports evaluating the variable references but not formatting, so maybe some happy marriage of the two? (It's also probably very buggy ;))

# function

string_interp <- function(string, env = parent.frame()) {
  # get variable references
  idxs <- gregexpr("\\$\\{.*?\\}", string)
  matches <- regmatches(string, idxs)[[1]]
  var_refs <- gsub("[\\$\\{\\}]", "", matches)

  # get strings of variable references
  exprs <- parse(text = var_refs)
  out <- lapply(exprs, eval, env = env)
  var_strings <- vapply(out, as.character, character(1))

  # replace references with strings
  for (var_string in var_strings) {
    idx <- regexpr("\\$\\{.*?\\}", string)
    regmatches(string, idx) <- var_string
  }

  string
}

# examples

# 1.
user_name <- "Alice"
question <- "doing today"
string_interp("hello ${user_name} how are you ${question}?")

# 2.
string_interp("2 + 2 = ${2 + 2}")

# 3.
github_users <- c("smbache", "Mullefa")
string_interp("The first github user in the list is: ${github_users[1]}")
smbache commented 9 years ago

Indeed.. Not sure what a "nice" syntax would be that includes formatting; however, I would very much like that to be possible!

smbache commented 9 years ago

Maybe

string_interp("A random number is $.2f{rnorm(1)}.")

...

Mullefa commented 9 years ago

For me, if the formatter was optional, something like that would be absolutely perfect, although I'd definitely try and get the views of some R users with a bit more experience too.

As an aside, something like this would make logging Shiny applications incredibly easy :D

log_info("user selected ${input$choice}")

(although this doesn't currently work due to gsub() replacing all the $)
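
A minimal sketch of one way around the `$` problem, assuming the placeholder is always `${...}`: strip the delimiters by position instead of with gsub(), so any `$` inside the expression survives. The helper name `extract_refs` and the `input` list are hypothetical and only stand in for a Shiny input object.

```r
# Hypothetical helper (not from the thread): pull the expression text out of
# each ${...} placeholder by position rather than by deleting characters.
extract_refs <- function(string) {
  idxs    <- gregexpr("\\$\\{.*?\\}", string)
  matches <- regmatches(string, idxs)[[1]]
  # Drop the leading "${" (2 characters) and the trailing "}" (1 character).
  substring(matches, 3, nchar(matches) - 1)
}

# Stand-in for a Shiny `input` object.
input <- list(choice = "blue")

refs <- extract_refs("user selected ${input$choice}")
refs                     # "input$choice"
eval(parse(text = refs)) # "blue"
```

Only the extraction step changes here; the evaluation and replacement steps of string_interp would stay as they are.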

Mullefa commented 9 years ago

Also, considering the general utility of the function, I would maybe go for a name such as s(), again inspired by Scala's syntax which I think is neat:

// Scala
val name = "James"
println(s"Hello $name")
# R
name = "James"
cat(s("Hello ${name}\n"))
richfitz commented 9 years ago

Can I suggest that this behaviour would be enormously useful in general, not just in logging. So it might make sense to make a separate package for the interpolation that nails that (especially scoping which might be hard to get right). Then in loggr make a formatter function that uses the interpolation.

smbache commented 9 years ago

@richfitz on the one hand it would; on the other a package for one function seems excessive? But maybe.. Could call it stringterpolatr ;)

Why would scoping be hard? Can you come up with an example?

richfitz commented 9 years ago

There's already whisker, which basically has one function (whisker.render), and a super lightweight string interpolation package would be in direct competition with that.

Perhaps it's not hard. This stuff does my head in though, when moving from the global environment to environments of functions or functions within functions etc.

If the string "Hello ${x}\n" is passed into log_info we need to make sure that the 'x' that is found is in the appropriate environment (parent.frame() most likely). However, we'd not be able to access that from the formatter easily as all log_info does is raise a signal.

Perhaps if the event captures the correct frame it'd all be easy. That might not be a terrible thing to have access to anyway.
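
To make the scoping concern concrete, here is a small sketch. `lookup`, `log_wrong` and `log_right` are hypothetical names, and `lookup` only fetches a single variable rather than doing real interpolation. It shows what happens when a wrapper does not forward the frame it was called from.

```r
# Hypothetical stand-in for the interpolation function: look up one variable
# name in the given environment (default: the caller's frame).
lookup <- function(name, env = parent.frame()) {
  get(name, envir = env, inherits = TRUE)
}

x <- "global"

# Wrapper that does not forward an environment: `lookup` searches the
# wrapper's own frame and then its enclosing environment, never the caller.
log_wrong <- function(name) lookup(name)

# Wrapper that captures its caller's frame and passes it along.
log_right <- function(name, env = parent.frame()) lookup(name, env)

f <- function() {
  x <- "local"
  c(wrong = log_wrong("x"), right = log_right("x"))
}

f()
#>    wrong    right
#> "global"  "local"
```

The same reasoning would apply to a formatter that runs after the condition has been signalled: unless the frame is captured at the call site, the lookup starts from the wrong place.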

Mullefa commented 9 years ago

I wouldn't imagine scoping would be too hard (especially for the guy who wrote %>% ;)). For illustrative purposes, say the final function has signature s(string, env = parent.frame()) then e.g. log_info() would look something like:

log_info <- function(msg, interp_string = TRUE, env = parent.frame()) {
  if (interp_string) {
    msg <- s(msg, env)
  }
  event <- log_event("INFO", msg)
  invisible(signalCondition(event))
}

Regarding whisker.render(), a couple of points that count against it:

smbache commented 9 years ago

@richfitz is your whisker argument for or against a separate package? Would the aim/scope be identical here?

richfitz commented 9 years ago

I'm not saying to use whisker here - I'm saying whisker exists as a package of essentially one function, that sees wide use (so: support of a separate package).

smbache commented 9 years ago

Is there a unary operator one could use, e.g. -"We want ${wish}!"? Then one could have both a function of the string and env, and the operator would then call this with parent.frame()

richfitz commented 9 years ago

The (allegedly) complete list is here though it doesn't include := so might be missing something.

Can "~" be bent to this purpose perhaps? E.g. log_info(~"my ${string}")

richfitz commented 9 years ago

Looks like it: proof of concept (using whisker for the rendering but that would be swapped out for something nicer)

f <- function(x) {
  whisker::whisker.render(x[[2]], attr(x, ".Environment"))
}
string <- "x"
f(~"my {{string}}") # "my x"

In a function, showing scoping:

g <- function(x) {
  y <- x
  f(~"my {{y}}")
}
g("foo") # "my foo"
smbache commented 9 years ago

It would only work if log_info made it work; e.g. this would never work: cat(~"my ${string}"). But the function string_interp (or whatever) could exist (in a package or in loggr), and then log_info could dispatch on formula vs character.
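
A rough sketch of that dispatch idea, under the assumption that a one-sided formula carries the string as its second element and remembers where it was created. Both `as_message` and `interp_names` are hypothetical names, and the interpolation step is a deliberately tiny stand-in that only handles plain `${name}` references.

```r
# Hypothetical stand-in for the real interpolation: replaces ${name} with the
# value of `name` found in `env`. No expressions, no formats.
interp_names <- function(string, env) {
  idxs    <- gregexpr("\\$\\{[^}]*\\}", string)
  matches <- regmatches(string, idxs)[[1]]
  vars    <- substring(matches, 3, nchar(matches) - 1)
  values  <- vapply(vars, function(v) as.character(get(v, envir = env)),
                    character(1), USE.NAMES = FALSE)
  regmatches(string, idxs) <- list(values)
  string
}

# Hypothetical front end: interpolate formulas, pass plain strings through.
as_message <- function(msg) {
  if (inherits(msg, "formula")) {
    # ~"text" stores the string as element 2 and keeps its environment.
    interp_names(msg[[2]], environment(msg))
  } else {
    msg
  }
}

x <- "world"
as_message("hello ${x}")   # "hello ${x}"
as_message(~"hello ${x}")  # "hello world"
```

With this shape the caller opts in by writing `~`, and no `env` argument is needed because the formula already records the right environment.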

Mullefa commented 9 years ago

:+1: for the suggestion which makes the function the identity for regular strings (if indeed that is what you were getting at). The benefit being that it simplifies the signature of functions such as log_info() which would allow for string interpolation, but not assume it:

x <- "world"
log_info("hello world")
log_info(~"hello ${x}")
smbache commented 9 years ago

I had some code all ready to show you, but then I forgot it on my home computer. It allows:

user <- "smbache"
number <- 1.1234

string_interp("Some nasty string by ${user}, including nested (but OK) '}' and allowing numbers like $[.2f]{2*{number}} to be formatted")

[1] "Some nasty string by smbache, including nested (but OK) '}' and allowing numbers like 2.25 to be formatted"

The example above is really meant to show that certain potential issues have been taken into account. I will wrap it up at some point and you can see what you think of it.
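
For readers who want something to experiment with in the meantime, here is a minimal sketch of the `$[fmt]{expr}` syntax. It is not the code referred to above or the code in the gist: `interp_fmt` is a hypothetical name, and unlike the example above it does not handle nested braces or a '}' inside the expression.

```r
# Hypothetical sketch: supports ${expr} and $[fmt]{expr}, where fmt is an
# sprintf format without the leading "%". No nested braces inside expr.
interp_fmt <- function(string, env = parent.frame()) {
  force(env)  # capture the caller's frame before any nested calls
  pattern <- "\\$(\\[[^]]*\\])?\\{[^{}]*\\}"
  idxs    <- gregexpr(pattern, string, perl = TRUE)
  matches <- regmatches(string, idxs)[[1]]

  fill_one <- function(m) {
    # Format between "[" and "]", if present; default to "s".
    has_fmt <- grepl("^\\$\\[", m)
    fmt     <- if (has_fmt) sub("^\\$\\[([^]]*)\\].*$", "\\1", m) else "s"
    # Expression text: everything between the first "{" and the final "}".
    expr    <- sub("^[^{]*\\{(.*)\\}$", "\\1", m)
    value   <- eval(parse(text = expr), envir = env)
    sprintf(paste0("%", fmt), value)
  }

  regmatches(string, idxs) <- list(vapply(matches, fill_one, character(1),
                                          USE.NAMES = FALSE))
  string
}

user   <- "smbache"
number <- 1.1234
interp_fmt("By ${user}: $[.2f]{2 * number}")
# [1] "By smbache: 2.25"
```

Keeping the format optional means plain `${expr}` keeps working, which matches the preference expressed earlier in the thread.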

Mullefa commented 9 years ago

Excellent work sir - I'd say you've now got two functions that could feasibly be in base R ;)

Given how useful this function is (at work especially I would use it a hell of a lot), would you consider putting it in its own package? Or maybe merging it into @hadley's stringr, if he thought it was suitable?

smbache commented 9 years ago

I'm open to any of those options.

hadley commented 9 years ago

I'd be happy to include in stringr

smbache commented 9 years ago

@hadley, I'll make a gist later with some code and you can see if it is acceptable in its general lines; if so, I can make a PR for stringr.

smbache commented 9 years ago

only downside: stringterpolatr was a pretty cool pkg name

smbache commented 9 years ago

So, I created a gist: https://gist.github.com/smbache/7253e88adb540a4f8007

Here you'll see an example implementation, with examples at the bottom! I think it is pretty nice, although I'm not aware whether there is an (even more ;)) clever way of matching. Anyways, @hadley let me know if you "approve" of the approach in terms of stringr; if not we can (a) change it, (b) go with a separate pkg. Otherwise I can PR ;)

Take into consideration:

Mullefa commented 9 years ago

Naming is always contentious - I really like s(), but this might not be everyone's bag. I'd suggest we try and get some input from some of the regular R developers on github. @hadley what do you think?

smbache commented 9 years ago

s is used for smoothing splines in many model interfaces...

hadley commented 9 years ago

I like it!

lionel- commented 9 years ago

looks nice. What about using % instead of $, as it would be reminiscent of formatting functions in other languages?

smbache commented 9 years ago

@hadley what would be your preference for name and symbol? It's your package ;)

hadley commented 9 years ago

I don't have particularly strong feelings either way

kristang commented 9 years ago

What about strinter?

kirillseva commented 9 years ago

This package (not on CRAN yet) has similar functionality, inspired by ruby's string interpolation

user_name <- "World"
productivus::pp("Hello #{user_name}!")
# [1] "Hello World!"
robertzk commented 9 years ago

Except that I didn't support recursive parsing; the following will fail:

x <- 1
productivus::pp("x is #{productivus::pp('#{x}!')}")

I can expand to include this functionality with a slightly more careful parser if there is interest (not that the current implementation is the most efficient). smbache's implementation seems to cover these cases, and looks slightly cleaner (although it also looks slightly like PHP!)

smbache commented 9 years ago

One argument for $ is that one can think of it as indexing env, as in env$user_name... But then allowing more complex expressions than just names.

Mullefa commented 9 years ago

:+1:

smbache commented 9 years ago

Merged here; we'll leave this open and it seems this will be incorporated...

smbache commented 9 years ago

I've included this temporarily in loggr (until we can import from stringr). In the latest push you can interpolate, see also example in this thread: https://github.com/smbache/loggr/issues/17

Mullefa commented 9 years ago

OK. Cheers for the update.

smbache commented 9 years ago

The interpolation is now included in loggr until it is available in a stable stringr release.