moodymudskipper / tag

Build function operator factories supporting the tag$function(args) notation
GNU General Public License v3.0
13 stars 1 forks source link

Initial Impressions #4

Open ArtemSokolov opened 5 years ago

ArtemSokolov commented 5 years ago

Thank you so much for taking the time to put this together. I think this is a great resource, and I’m very excited to see where it will go. Below I have some specific comments that I put together while going through your write-up.

I completely agree with you that adverbs are underutilized, but I think the main reason is the lack of formalism. Furthermore, there is no package centralizing adverb usage (the way, for example, rlang centralizes non-standard evaluation). It's all scattered around various packages, some of which you noted in your README. I think you have the beginnings of something that can become a central adverb resource by 1) importing relevant adverbs from other packages and 2) defining your own set of adverbs, as you've done here.

I would bet that most developers who become comfortable with functions being first class objects probably use adverbs in some capacity without realizing it. You can think about starting your README with a representative example showing a code pattern that developers probably already use (e.g., purrr::safely) and how it can be accomplished more cleanly using your tag system.

Naming conventions

I agree with Nathan’s comment that names beginning with . tend to have an “internal variable” connotation, but none of the proposed alternatives seem very aesthetic to me. One idea is to write adverb tags as… well, adverbs. For example, instead of

..bang$transform(head(cars,2), !!w := !!v / !!sym(u), !!!x)

we can write

tidily$transform(head(cars,2), !!w := !!v / !!sym(u), !!!x)

Similarly, instead of

..lbd$mutate(head(iris,2), Petal.Width = ~1000*(.), Species = ~toupper(.))

we could do

succinctly$mutate(head(iris,2), Petal.Width = ~1000*(.), Species = ~toupper(.))

I think it reads quite naturally and has the advantage of being in line with adverbs already in the wild (e.g., purrr::safely). You also have the option of getting creative (grouply in place of ..grp?) to avoid name collisions.

Using . inside lambda functions

There is a potential conflict in resolving . inside your ..lbd. When mutate is used in a pipe, . is occasionally used to reference the data frame itself. For example, consider

iris %>% mutate( Petal.Width = nrow(.)*Petal.Width )

Following your ..lbd example, we would have

iris %>% ..lbd$mutate( Petal.Width = ~nrow(.)*(.) )

which creates conflicting interpretations of ..

Extending the ..w tag

I think that you can extend the ..w tag to be a fully fledged adverb by chaining it with its argument. For example,

with( iris, t.test( Sepal.Length, Sepal.Width ) )

might look like

..w$iris$t.test( Sepal.Length, Sepal.Width )

or, if you decide to change the names to adverbs,

using$iris$t.test( Sepal.Length, Sepal.Width)

Minor comments

..at usage feels a bit messy. I don’t have a good idea of how to make it better (yet), but relative to other adverbs, it doesn’t seem to flow as well.

Assignment in place via ..ip is somewhat counter intuitive because it breaks copy-on-modify semantics. I’m worried that it might encourage developers to write code that doesn’t play nicely with the functional paradigm. Additionally, this adverb will need extensive testing to ensure it behaves correctly in nested functions, environments, pipes, etc.

Consider creating a different example for ..fn, as it is not clear why ..fn$sapply(iris, ~class(.)) would be preferred over summarize_all( iris, class ).

..fs is the best thing since sliced bread, as it combines the functionality of purrr::partial and purrr::compose. It’s all I’ve ever needed in life. ;)

In the section discussing adding new adverbs, it’s unclear what you mean by “composition” when referring to the “adverb” class. Consider providing an example.

I think that chaining of adverbs ..View$..grp$summarize_all(...) looks cleaner than nested calling ..View(..grp$summarize_all)(...), since one of the goals of using adverb tags in the first place is to deal with a “parenthesis overload”.

moodymudskipper commented 5 years ago

These are great comments thanks!

Here's my reaction to them, this will take some time for me to digest this all, and I'll need to look back at this with a fresh mind.

Start on clean foundation

If wer'e going to make adverbs easier to understand, consistency must come first, I'll ditch functions that are not adverbs, and those that have hacky code or no value, so good bye to :

Extending ..w

I am not convinced by using$iris$t.test( Sepal.Length, Sepal.Width) as iris is not very visible and using is not a proper adverb, however using could be a meta-adverb so that using(iris) (no dollar shorthand) would be an adverb.

So we could have

using(iris)$t.test(Sepal.Length, Sepal.Width)

equivalent to:

with(iris, t.test(Sepal.Length, Sepal.Width))

I like to subset with [ though and I like the idea that [ implies with as in data.table, so I might give the output of using a class "using" (or a better name if it comes up) enabling [.using so I can do :

some_vec <- setNames(1:5,letters[1:5])
using(some_vec)[. > b]
# [1] 3 4 5
# rather than : 
some_vec[some_vec > some_vec["b"]]

Re using . in lambda functions

You are right about the conflict but I don't think it's different than doing df %>% mutate(col = map(col, ~fun(. , arg)) nor that it is more confusing. I could use the lhs of the formula but then I depart from rlang and I'm not sure it's worth it :

iris %>% ..lbd$mutate( Petal.Width = x ~nrow(.)* x )

better examples

examples for ..fn will be changed using ave or combn. And package will be introduced with a purrr adverb such as safely.

support adverb chaining

And remove reference to adverb composition, which is confusing.

I meant that to make adv1$adv2$fun work I need (adv1$adv2)$fun to do the same as adv1(adv2$fun) , so adv1$adv2 composes adverbs instead of the standard behavior.

Note that it means that one can't use adverbs on adverbs unless they do some class surgery, but I don't think it would be common enough to be a problem.

Formalism and naming conventions

I agree with you, good formalism will go a long way.

Sadly I feel like it's the hardest part.

I agree to use actual adverb names rather than a .. prefix but I think additional guidance is necessary to choose names, trying to name them like adverbs, even with a poetic license, is not enough in my opinion.

Colin Fay likes with_*, I like it too, but he still can't manage to avoid awkward names like talky. using_* is also a good format.

Or we really push it and suffix all with _ly : safe_ly, group_ly, or Ly (with capital L) : safeLy, groupLy. A Negate equivalent would be negLy / neg_ly/, we don't need to end up with awkward long names.

purrr defines :

"Adverbs modify the action of a function; taking a function as input and returning a function with modified action as output."

But it doesn't say why we need that and what makes a good or a poor adverb, and I'm still figuring this out myself :).

Maybe tags (package) and "tag" (class) can just become adverbs and "adverb" once it's cleaned up.

Additional thoughts on $.tag / $.adverb

So far the function has to have exactly one parameter to work with $.tag, but $.tag could automatically forward all additional parameters to the output function. so we can just do safeLy <- as_adverb(purrr::safely) and it'd be ready to use with $ notation.

Code of modified functions

Should be more readable Modified functions should keep original parameters as much as possible rather than having only ... like in purrr.

ArtemSokolov commented 5 years ago

Hi Antoine,

Here are some more thoughts about your suggestions:

  1. You're right about using(iris)$t.test(Sepal.Length, Sepal.Width) looking cleaner. This also provides a potential option for adding arguments to using. I think you originally mentioned that adverbs generally don't receive arguments directly, but I'm curious if there's a use case where it might be advantageous. For example, quietly could have separate flags (all TRUE by default) for suppressing warnings, errors and messages:
quietly( err=FALSE )$load_csv( "myfile.csv" )    ## Display errors only
  1. Personally I feel that _ly or Ly is a better alternative than with_* or using_*. To me, it feels more natural. (And there is less typing, but don't tell MrFlick I said that... :P)

  2. as_adverb() is a great suggestion! This would allow users to take any function -> function mapping and begin using it in a consistent fashion.

moodymudskipper commented 5 years ago

Hi Artem,

using in the example above is not an adverb anymore, but an adverb factory, or to be consistent with Hadley's terms in Advanced R, a function operator factory (interestingly I noticed that he avoids the word "adverb" in the entirety of his chapter on what he prefers to call "function operators").

I was thinking about using as a special case first, but it could be made general.

I thought about 4 different APIs, 2 for adverbs, and 2 for adverb factories (the drafts of the functions are at the bottom) :

1) My original idea, tag adverbs would be different from usual adverbs because they only have one argument, other arguments must be pushed to the back of the function call

safely1 <- as_adverb1(safely)

# original behavior doesn't work
safely1(log, otherwise = NA_real_)(100, 10)
#> Error in safely1(log, otherwise = NA_real_): unused argument (otherwise = NA)

# New behavior is to push the arguments at the back of the function call
safely1(log)(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

# `$` is a simple shortcut
safely1$log(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

2) Don't make our adverbs special, let $.adverb do the hard work of pushing parameters to the back, the adverbs themselves just take a class. That's what I hinted to at the end of my answer above but I wasn't clear.

safely2 <- as_adverb2(safely) # only adds the class adverb

# same original behavior
safely2(log, otherwise = NA_real_)(100, 10)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

# use `$` and push arguments to back of the function call
safely2$log(100, 10, otherwise = NA_real_)
#> Error in as_tag(safely2): could not find function "as_tag"

# this won't work
safely2(log)(100, 10, otherwise = NA_real_)
#> $result
#> NULL
#> 
#> $error
#> <simpleError in .f(...): unused argument (otherwise = NA)>

3) Building adverb factories instead of adverbs

safely3 when used in its standard form produces an adverb taking a single argument, this adverb in turn can use $. We also build a dollar method to keep our previous behavior valid.

safely3 <- as_adverb_factory(safely)

# original behavior doesn't work
safely3(log, otherwise = NA_real_)(100, 10)
#> Error in (function (.f, otherwise = NULL, quiet = TRUE) : unused argument (10)

# what we do now is create a one argument adverb from the initial adverb and extra arguments
# then `$.adverb1` takes it from there
safely3(otherwise = NA_real_)$log(100, 10)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

# and we can build a dollar method so we spare brackets when not using arguments, magrittr style
# equivalent to safely3()$log(100, 10)
safely3$log(100, 10)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

# or even allow pushing arguments to the right 
safely3$log(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

4) essentially same as 3, but we ditch the last behavior

Maybe two interfaces is confusing ? And the first option in 3) prevents argument conflicts.

We just implement a tweak so we can skip brackets if we use only the first argument of our adverb factory.

safely4 <- as_adverb_factory2(safely)

# original behavior doesn't work
safely4(log, otherwise = NA_real_)(100, 10)
#> Error in (function (.f, otherwise = NULL, quiet = TRUE) : unused argument (10)

# what we do now is create a one argument adverb from the initial adverb and extra arguments
# then `$.adverb1` takes it from there
safely4(otherwise = NA_real_)$log(100, 10)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

# This now fails
safely4$log(100, 10, otherwise = NA_real_)
#> $result
#> NULL
#> 
#> $error
#> <simpleError in .f(...): unused argument (otherwise = NA)>

# but if no parameter is given we can skip brackets
safely4$log(100, 10)
#> $result
#> [1] 2
#> 
#> $error
#> NULL

In principle either adverb factory API can work with either adverb API. Using 4 and 1 is the stricter approach, using 3 and 2 the most flexible and general, but potentially confusing


as_adverbx and as_adverb_factoryx are themselves single argument adverbs, so if we give them the appropriate classs we can use $ on them as well


Hadley might avoid the word adverb in his book because it's a can of worm, a lot of adverbs are not named as verbs and a lot of non adverbs are named as such, and function operators cannot all be nicely mapped to an english adverbs. I started to think about adjectives of past participles, it seems that they look nicer and come to mind easier, the counterpart to safely above would be safe. The counterpart of Negate would be negated.


The functions that reproduce the behavior above.

library(purrr)
#> Warning: package 'purrr' was built under R version 3.5.3

`$.adverb1` <- function (e1, e2) {
  mc <- match.call()
  mc[[1]] <- mc[[2]]
  mc[[2]] <- NULL
  names(mc) <- NULL
  eval.parent(mc)
}

formals2 <- tags:::formals2 # formals adapted for generics and primitives

as_adverb1 <- function(adv){
  res <- as.function(alist(f=,
    as.function(c(formals2(f), formals2(adv)[-1], quote({
      #browser()
      mc <- match.call()
      # make a copy of the call, and build call to apply adv on f with adv's args
      mc_adv <- mc
      mc_adv[[1]] <- quote(adv)
      mc_adv[names(formals2(f))] <- NULL
      mc_adv <- as.call(append(as.list(mc_adv),alist(f),after = 1))
      # make call to apply adv(f, ...) on f's args
      mc[names(formals2(adv)[-1])] <- NULL
      mc[[1]] <- mc_adv
      eval(mc)
    })))
  ))
  class(res) <- union("adverb1",class(res))
  res
}

as_adverb2 <- function(f) {
  class(f) <- union("adverb2",class(f))
  f
}

`$.adverb2` <- function (e1, e2) {
  mc <- match.call()
  mc[[1]] <- bquote(as_tag(.(mc[[2]])))
  mc[[2]] <- NULL
  names(mc) <- NULL
  eval.parent(mc)
}

as_adverb_factory <- function(adv){
  res <- as.function(c(formals2(adv)[-1],quote({
    #browser()
    # turn f(...) into purrr::partial(adv, ...)
    mc <- match.call()
    mc <- as.call(c(quote(purrr::partial), adv, as.list(mc)[-1]))
    res <- eval.parent(mc)
    class(res) <- union("adverb1", class(res))
    res
  })))
  class(res) <- union("adverb_factory", class(res))
  res
}

`$.adverb_factory` <- function (e1, e2) {
  as.function(c(formals2(e2), formals2(e1), quote({
    #browser()
    mc1 <- match.call()
    mc2 <- mc1
    mc1[names(formals2(e2))] <- NULL
    mc1[[1]] <- e1
    mc2[names(formals2(e1))] <- NULL
    mc2[[1]] <- bquote(.(mc1)(.(as.symbol(e2))))
    eval(mc2)
  })))
}

as_adverb_factory2 <- function(adv){
  res <- as.function(c(formals2(adv)[-1],quote({
    #browser()
    # turn f(...) into purrr::partial(adv, ...)
    mc <- match.call()
    mc <- as.call(c(quote(purrr::partial), adv, as.list(mc)[-1]))
    res <- eval.parent(mc)
    class(res) <- union("adverb1", class(res))
    res
  })))
  class(res) <- union("adverb_factory2", class(res))
  res
}

`$.adverb_factory2` <- function (e1, e2) {
  #browser()
  mc <- match.call()[-1]
  mc[[1]] <- e1()
  eval(unname(mc))
}
moodymudskipper commented 5 years ago

Went with 3) and 2), testing new formalism introduced in https://github.com/moodymudskipper/tags/issues/19

ArtemSokolov commented 5 years ago

I imagine that argument collision would be quite rare, due to the fact that adverbs / adverb factories operate on an entirely different domain than the functions they modify. However, if collisions are expected, a possible workaround is to automatically rename some of the arguments.

Personally, I'm not a fan of using adjectives, because I think it reads a little unnatural. But I suppose this is one of those philosophical points that can be debated into the late hours over a pint.

moodymudskipper commented 5 years ago

Hi @ArtemSokolov !

I changed a LOT of things, it took a bit of work :). If you're still around I'll be happy to get your thought on this. See readme : https://github.com/moodymudskipper/tag

ArtemSokolov commented 5 years ago

Hi Antoine,

The update looks great! I really like the flexibility of argument placement, and your examples are easy to follow and really showcase the functionality. I only have a few very minor comments:

  1. Include a link to https://github.com/moodymudskipper/tags in its first mention: "See also the package tags which includes many tags defined with the help of this package."
  2. Consider including an example in “Formalism” section to use during definitions. For example, you can take the example in the first sentence of “Syntax” section some_tag(tag_arg)(fun)(fun_arg), and refer to individual elements as you define input function, manufactured function, etc.
  3. Typo: “Thes ecome directly from the package tags.”

Looking forward to begin using these in my code!

moodymudskipper commented 5 years ago

Hi Artem,

These are good points thank you. I'm happy to read that the examples are easy to follow, I was a bit worried that they wouldn't and it's hard to get feedback.

I'll be working on other projects for a week or two to attack this one back with a fresher mind, then :

Then hopefully I'll be comfortable releasing V1 and advertising it.

moodymudskipper commented 5 years ago

@ArtemSokolov I'm done with the 2nd big reshaping of the package :

find details there :

As I know you were especially interested in fs :

ArtemSokolov commented 5 years ago

(I think you tagged the wrong Artem there. ;))

composing works great! I like that partial specification preserves the first argument, unlike purrr:

f <- composing$sample(5,TRUE)$sqrt$mean(na.rm=TRUE)

## Equivalent purrr implementation requires explicit naming of size and replace,
##   because partial(sample, 5, TRUE) feeds 5 as the first argument to sample
g <- compose( partial(sample, size=5, replace=TRUE),
             sqrt, partial(mean, na.rm=TRUE), .dir="forward" )

## Validation
set.seed(1)
suppressing_warnings$f(-5:5)   # 1.366025

set.seed(1)
suppressing_warnings$g(-5:5)   # 1.366025
moodymudskipper commented 5 years ago

Cool!

Note that you can use those (or any tagged call) in pipe chains, magrittr will only insert the dot inside the last bracket pair, which is another advantage compared to standard adverbs.

I've been using some tags in my projects now and nothing broke in the last 10 days or so, but I only use a small subset of them . I hope to implement tag composition soon, https://github.com/moodymudskipper/tag/issues/22.

It failed to spark much interest so far unfortunately, so the testing process might take a while before having something to submit to CRAN.

ArtemSokolov commented 5 years ago

Too bad about the lack of interest. Hopefully it will pick up steam!

Since you're waiting to submit to CRAN, one minor suggestion is to add tag as a Remotes: dependency in tags (https://remotes.r-lib.org/articles/dependencies.html), allowing it to be automatically installed when a user does devtools::install_github( "moodymudskipper/tags" ).

moodymudskipper commented 5 years ago

Thanks @ArtemSokolov that's a neat idea. I'll get to it asap https://github.com/moodymudskipper/tags/issues/30 .