tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

Allow single keys to be passed without id() #58

Closed DavisVaughan closed 6 years ago

DavisVaughan commented 6 years ago

How do you feel about allowing users to pass in the key without requiring them to use id()? I often find myself with just 1 key level, and feel like I should just be able to do:

as_tsibble(FANG, key = symbol, index = date)

As a new user, I might be confused by the power that id() gives me (crossing and nesting) when all I want to do is work with a basic tsibble with 1 layer of keys. You've made it fairly clear in the docs that you need to use id(), which is good, but it's just not my first instinct. Allowing me to do key = symbol and then informing me later on (once my interest is piqued) that I can group by multiple levels and so on with id() seems slightly more user friendly to me.

On first glance, it looks like the only change needed is in use_id(): the safely(eval_tidy) call would become safely(conditional_eval_tidy), where conditional_eval_tidy could look like:

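# Assumes rlang is attached, for eval_tidy() and caller_env().
conditional_eval_tidy <- function(expr, data = NULL, env = caller_env()) {
  # A bare name such as `symbol` is returned as a one-element list.
  if (is.name(expr)) {
    return(list(expr))
  }
  # Anything else, e.g. id(symbol), is evaluated as before.
  eval_tidy(expr = expr, data = data, env = env)
}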

Essentially it just catches the case where the expression returned by get_expr(key) is a name and not a call. The rest of use_id() would then work fine.
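To make the name-versus-call distinction concrete, here is a minimal sketch using rlang. It assumes the key argument is captured as a quosure; capture_key() is a hypothetical helper for illustration only and is not part of tsibble.

```r
library(rlang)

# Hypothetical helper (not in tsibble): capture the `key` argument
# unevaluated and return the bare expression inside the quosure.
capture_key <- function(key) {
  get_expr(enquo(key))
}

bare    <- capture_key(symbol)      # user wrote key = symbol
wrapped <- capture_key(id(symbol))  # user wrote key = id(symbol)

is.name(bare)     # TRUE: a bare column name is a symbol
is.call(wrapped)  # TRUE: id(symbol) is a call
```

Neither expression is evaluated here, so the sketch runs without id() or a symbol column being defined.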

earowang commented 6 years ago

Thanks for the feedback.

  1. I don't think the learning curve of id() is steep. It's a short function name and subconsciously helps to reinforce the understanding of the key as identifying variables. Users who have used dplyr::vars() before would find that id() is just a helper that binds variables together.
  2. Sooner or later, users will run into data applications involving multiple variables. I find myself running into this kind of data more often now. For example, spatio-temporal data usually have lat and lon as the key. The key = id() concept is better introduced at the beginning.
  3. I agree that id() isn't the first instinct, but I'd like to keep the interface consistent and the implementation simpler. If there's no explicit key, it's key = id(), which suggests that there is a key but that it is implicit in the data. If I changed key to accept a bare variable without id(), I'd probably have to change the default to key = NULL too, but this doesn't make sense to me. Generally speaking, id() is semantically important to a tsibble.
  4. What I could do is to give a more informative error: "Do you need key = id(symbol) to create tbl_ts?"

DavisVaughan commented 6 years ago

This all seems fair to me. If you add the more informative error suggesting exactly what the user should do, I think that would make me happy. Thanks for taking some time to respond.

I think if you can detect that the user input is a name object, then you could throw that specific error.
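As a rough sketch of that idea (an assumption about how it might be done, not tsibble's actual implementation), the check could inspect the captured expression and raise the suggested message only for a bare name. check_key() is a hypothetical name.

```r
library(rlang)

# Hypothetical check (not tsibble's code): if the user passed a bare
# name as `key`, signal an error that spells out the id() call to use.
check_key <- function(key) {
  expr <- get_expr(enquo(key))
  if (is_symbol(expr)) {
    abort(sprintf(
      "Do you need `key = id(%s)` to create `tbl_ts`?",
      as_string(expr)
    ))
  }
  invisible(expr)
}

check_key(id(symbol))   # a call: passes silently
try(check_key(symbol))  # a bare name: signals the informative error
```

A string such as "group" is neither a symbol nor a call, so it would need its own branch if it should be rejected as well.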

Also, just ran into this, which I doubt you want!

library(tsibble)
ex <- data.frame(group = 1:2, date = Sys.Date(), var = .1)
as_tsibble(ex, key = "group")
#> The `index` is `date`.
#> # A tsibble: 2 x 3 [?]
#> # Key:       group [2]
#>   group date         var
#>   <int> <date>     <dbl>
#> 1     1 2018-09-11   0.1
#> 2     2 2018-09-11   0.1

Created on 2018-09-11 by the reprex package (v0.2.0).

earowang commented 6 years ago

Thanks. All done.