Open ArtemSokolov opened 5 years ago
These are great comments thanks!
Here's my reaction to them, this will take some time for me to digest this all, and I'll need to look back at this with a fresh mind.
Start on clean foundation
If wer'e going to make adverbs easier to understand, consistency must come first, I'll ditch functions that are not adverbs, and those that have hacky code or no value, so good bye to :
..at
: I actually like it a lot as it enables combining summarize
, sumarize_at
, spread
and a multi spread
, so it's a generalized aggregation function with readable syntax, but the code is a mess and it just work on summarize
and partially on mutate
, needs to get back to the fridge...ip
: for the reasons you give, and I'd rather use dotdot to do things like class(x) := union("new_class", ..)
which are explicit about growing an object and still avoid repeating the input...w
: not an adverb, but see below, building on your suggestion..p
: not an adverb, and gives silently a class to the input that it won't take away, which is a bit sneaky and awkward, though that's what glue does. pipeR::pipe
does something similar and has to deal with the same awkward consequence...dt
: is not an adverb, it was cool though, but that'd be for another package...incr
/ ..chr
: not adverbs, not useful.Extending ..w
I am not convinced by using$iris$t.test( Sepal.Length, Sepal.Width)
as iris
is not very visible and using
is not a proper adverb, however using
could be a meta-adverb so that using(iris)
(no dollar shorthand) would be an adverb.
So we could have
using(iris)$t.test(Sepal.Length, Sepal.Width)
equivalent to:
with(iris, t.test(Sepal.Length, Sepal.Width))
I like to subset with [
though and I like the idea that [
implies with
as in data.table, so I might give the output of using
a class "using"
(or a better name if it comes up) enabling [.using
so I can do :
some_vec <- setNames(1:5,letters[1:5])
using(some_vec)[. > b]
# [1] 3 4 5
# rather than :
some_vec[some_vec > some_vec["b"]]
Re using .
in lambda functions
You are right about the conflict but I don't think it's different than doing df %>% mutate(col = map(col, ~fun(. , arg))
nor that it is more confusing. I could use the lhs of the formula but then I depart from rlang and I'm not sure it's worth it :
iris %>% ..lbd$mutate( Petal.Width = x ~nrow(.)* x )
better examples
examples for ..fn
will be changed using ave
or combn
. And package will be introduced with a purrr
adverb such as safely
.
support adverb chaining
And remove reference to adverb composition, which is confusing.
I meant that to make adv1$adv2$fun
work I need (adv1$adv2)$fun
to do the same as adv1(adv2$fun)
, so adv1$adv2
composes adverbs instead of the standard behavior.
Note that it means that one can't use adverbs on adverbs unless they do some class surgery, but I don't think it would be common enough to be a problem.
Formalism and naming conventions
I agree with you, good formalism will go a long way.
Sadly I feel like it's the hardest part.
I agree to use actual adverb names rather than a ..
prefix but I think additional guidance is necessary to choose names, trying to name them like adverbs, even with a poetic license, is not enough in my opinion.
Colin Fay likes with_*
, I like it too, but he still can't manage to avoid awkward names like talky
. using_*
is also a good format.
Or we really push it and suffix all with _ly
: safe_ly
, group_ly
, or Ly
(with capital L
) : safeLy
, groupLy
. A Negate
equivalent would be negLy
/ neg_ly
/, we don't need to end up with awkward long names.
purrr defines :
"Adverbs modify the action of a function; taking a function as input and returning a function with modified action as output."
But it doesn't say why we need that and what makes a good or a poor adverb, and I'm still figuring this out myself :).
Maybe tags (package) and "tag"
(class) can just become adverbs and "adverb"
once it's cleaned up.
Additional thoughts on $.tag
/ $.adverb
So far the function has to have exactly one parameter to work with $.tag
, but $.tag
could automatically forward all additional parameters to the output function. so we can just do safeLy <- as_adverb(purrr::safely)
and it'd be ready to use with $
notation.
Code of modified functions
Should be more readable
Modified functions should keep original parameters as much as possible rather than having only ...
like in purrr.
Hi Antoine,
Here are some more thoughts about your suggestions:
using(iris)$t.test(Sepal.Length, Sepal.Width)
looking cleaner. This also provides a potential option for adding arguments to using
. I think you originally mentioned that adverbs generally don't receive arguments directly, but I'm curious if there's a use case where it might be advantageous. For example, quietly
could have separate flags (all TRUE
by default) for suppressing warnings, errors and messages:quietly( err=FALSE )$load_csv( "myfile.csv" ) ## Display errors only
Personally I feel that _ly
or Ly
is a better alternative than with_*
or using_*
. To me, it feels more natural. (And there is less typing, but don't tell MrFlick I said that... :P)
as_adverb()
is a great suggestion! This would allow users to take any function -> function mapping and begin using it in a consistent fashion.
Hi Artem,
using
in the example above is not an adverb anymore, but an adverb factory, or to be consistent with Hadley's terms in Advanced R, a function operator factory (interestingly I noticed that he avoids the word "adverb" in the entirety of his chapter on what he prefers to call "function operators").
I was thinking about using
as a special case first, but it could be made general.
I thought about 4 different APIs, 2 for adverbs, and 2 for adverb factories (the drafts of the functions are at the bottom) :
1) My original idea, tag adverbs would be different from usual adverbs because they only have one argument, other arguments must be pushed to the back of the function call
safely1 <- as_adverb1(safely)
# original behavior doesn't work
safely1(log, otherwise = NA_real_)(100, 10)
#> Error in safely1(log, otherwise = NA_real_): unused argument (otherwise = NA)
# New behavior is to push the arguments at the back of the function call
safely1(log)(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#>
#> $error
#> NULL
# `$` is a simple shortcut
safely1$log(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#>
#> $error
#> NULL
2) Don't make our adverbs special, let $.adverb
do the hard work of pushing parameters to the back, the adverbs themselves just take a class. That's what I hinted to at the end of my answer above but I wasn't clear.
safely2 <- as_adverb2(safely) # only adds the class adverb
# same original behavior
safely2(log, otherwise = NA_real_)(100, 10)
#> $result
#> [1] 2
#>
#> $error
#> NULL
# use `$` and push arguments to back of the function call
safely2$log(100, 10, otherwise = NA_real_)
#> Error in as_tag(safely2): could not find function "as_tag"
# this won't work
safely2(log)(100, 10, otherwise = NA_real_)
#> $result
#> NULL
#>
#> $error
#> <simpleError in .f(...): unused argument (otherwise = NA)>
3) Building adverb factories instead of adverbs
safely3
when used in its standard form produces an adverb taking a single argument, this adverb in turn can use $
.
We also build a dollar method to keep our previous behavior valid.
safely3 <- as_adverb_factory(safely)
# original behavior doesn't work
safely3(log, otherwise = NA_real_)(100, 10)
#> Error in (function (.f, otherwise = NULL, quiet = TRUE) : unused argument (10)
# what we do now is create a one argument adverb from the initial adverb and extra arguments
# then `$.adverb1` takes it from there
safely3(otherwise = NA_real_)$log(100, 10)
#> $result
#> [1] 2
#>
#> $error
#> NULL
# and we can build a dollar method so we spare brackets when not using arguments, magrittr style
# equivalent to safely3()$log(100, 10)
safely3$log(100, 10)
#> $result
#> [1] 2
#>
#> $error
#> NULL
# or even allow pushing arguments to the right
safely3$log(100, 10, otherwise = NA_real_)
#> $result
#> [1] 2
#>
#> $error
#> NULL
4) essentially same as 3, but we ditch the last behavior
Maybe two interfaces is confusing ? And the first option in 3) prevents argument conflicts.
We just implement a tweak so we can skip brackets if we use only the first argument of our adverb factory.
safely4 <- as_adverb_factory2(safely)
# original behavior doesn't work
safely4(log, otherwise = NA_real_)(100, 10)
#> Error in (function (.f, otherwise = NULL, quiet = TRUE) : unused argument (10)
# what we do now is create a one argument adverb from the initial adverb and extra arguments
# then `$.adverb1` takes it from there
safely4(otherwise = NA_real_)$log(100, 10)
#> $result
#> [1] 2
#>
#> $error
#> NULL
# This now fails
safely4$log(100, 10, otherwise = NA_real_)
#> $result
#> NULL
#>
#> $error
#> <simpleError in .f(...): unused argument (otherwise = NA)>
# but if no parameter is given we can skip brackets
safely4$log(100, 10)
#> $result
#> [1] 2
#>
#> $error
#> NULL
In principle either adverb factory API can work with either adverb API. Using 4 and 1 is the stricter approach, using 3 and 2 the most flexible and general, but potentially confusing
as_adverbx
and as_adverb_factoryx
are themselves single argument adverbs, so if we give them the appropriate classs we can use $
on them as well
Hadley might avoid the word adverb in his book because it's a can of worm, a lot of adverbs are not named as verbs and a lot of non adverbs are named as such, and function operators cannot all be nicely mapped to an english adverbs. I started to think about adjectives of past participles, it seems that they look nicer and come to mind easier, the counterpart to safely
above would be safe
. The counterpart of Negate
would be negated
.
The functions that reproduce the behavior above.
library(purrr)
#> Warning: package 'purrr' was built under R version 3.5.3
`$.adverb1` <- function (e1, e2) {
mc <- match.call()
mc[[1]] <- mc[[2]]
mc[[2]] <- NULL
names(mc) <- NULL
eval.parent(mc)
}
formals2 <- tags:::formals2 # formals adapted for generics and primitives
as_adverb1 <- function(adv){
res <- as.function(alist(f=,
as.function(c(formals2(f), formals2(adv)[-1], quote({
#browser()
mc <- match.call()
# make a copy of the call, and build call to apply adv on f with adv's args
mc_adv <- mc
mc_adv[[1]] <- quote(adv)
mc_adv[names(formals2(f))] <- NULL
mc_adv <- as.call(append(as.list(mc_adv),alist(f),after = 1))
# make call to apply adv(f, ...) on f's args
mc[names(formals2(adv)[-1])] <- NULL
mc[[1]] <- mc_adv
eval(mc)
})))
))
class(res) <- union("adverb1",class(res))
res
}
as_adverb2 <- function(f) {
class(f) <- union("adverb2",class(f))
f
}
`$.adverb2` <- function (e1, e2) {
mc <- match.call()
mc[[1]] <- bquote(as_tag(.(mc[[2]])))
mc[[2]] <- NULL
names(mc) <- NULL
eval.parent(mc)
}
as_adverb_factory <- function(adv){
res <- as.function(c(formals2(adv)[-1],quote({
#browser()
# turn f(...) into purrr::partial(adv, ...)
mc <- match.call()
mc <- as.call(c(quote(purrr::partial), adv, as.list(mc)[-1]))
res <- eval.parent(mc)
class(res) <- union("adverb1", class(res))
res
})))
class(res) <- union("adverb_factory", class(res))
res
}
`$.adverb_factory` <- function (e1, e2) {
as.function(c(formals2(e2), formals2(e1), quote({
#browser()
mc1 <- match.call()
mc2 <- mc1
mc1[names(formals2(e2))] <- NULL
mc1[[1]] <- e1
mc2[names(formals2(e1))] <- NULL
mc2[[1]] <- bquote(.(mc1)(.(as.symbol(e2))))
eval(mc2)
})))
}
as_adverb_factory2 <- function(adv){
res <- as.function(c(formals2(adv)[-1],quote({
#browser()
# turn f(...) into purrr::partial(adv, ...)
mc <- match.call()
mc <- as.call(c(quote(purrr::partial), adv, as.list(mc)[-1]))
res <- eval.parent(mc)
class(res) <- union("adverb1", class(res))
res
})))
class(res) <- union("adverb_factory2", class(res))
res
}
`$.adverb_factory2` <- function (e1, e2) {
#browser()
mc <- match.call()[-1]
mc[[1]] <- e1()
eval(unname(mc))
}
Went with 3) and 2), testing new formalism introduced in https://github.com/moodymudskipper/tags/issues/19
I imagine that argument collision would be quite rare, due to the fact that adverbs / adverb factories operate on an entirely different domain than the functions they modify. However, if collisions are expected, a possible workaround is to automatically rename some of the arguments.
Personally, I'm not a fan of using adjectives, because I think it reads a little unnatural. But I suppose this is one of those philosophical points that can be debated into the late hours over a pint.
Hi @ArtemSokolov !
I changed a LOT of things, it took a bit of work :). If you're still around I'll be happy to get your thought on this. See readme : https://github.com/moodymudskipper/tag
tag
and contains the building blocks, no tag.tags
contains a bunch of tags. I show case all the changes in its own readme : https://github.com/moodymudskipper/tags***ing
tag
allows very easy construction of new tagsfs
is not yet coded as a proper tag. I think it can be, with a slight change in the API, but I haven't got around to it.Hi Antoine,
The update looks great! I really like the flexibility of argument placement, and your examples are easy to follow and really showcase the functionality. I only have a few very minor comments:
some_tag(tag_arg)(fun)(fun_arg)
, and refer to individual elements as you define input function, manufactured function, etc.Looking forward to begin using these in my code!
Hi Artem,
These are good points thank you. I'm happy to read that the examples are easy to follow, I was a bit worried that they wouldn't and it's hard to get feedback.
I'll be working on other projects for a week or two to attack this one back with a fresher mind, then :
tag
and unit tests for tags
fs
Then hopefully I'll be comfortable releasing V1 and advertising it.
@ArtemSokolov I'm done with the 2nd big reshaping of the package :
find details there :
As I know you were especially interested in fs
:
composing
to be more consistent as I finally decided to follow a stricter naming convention, I figured users can make their own aliases if needed.composing$head$dim()
the first $
is $.tag
, the following are either $.tag_functional_sequence
after a function name or $.tag_fseq
after parentheses.deparse
/ parse
/ gsub
horror that I managed to clean up, using call substitution etc(I think you tagged the wrong Artem there. ;))
composing
works great! I like that partial specification preserves the first argument, unlike purrr
:
f <- composing$sample(5,TRUE)$sqrt$mean(na.rm=TRUE)
## Equivalent purrr implementation requires explicit naming of size and replace,
## because partial(sample, 5, TRUE) feeds 5 as the first argument to sample
g <- compose( partial(sample, size=5, replace=TRUE),
sqrt, partial(mean, na.rm=TRUE), .dir="forward" )
## Validation
set.seed(1)
suppressing_warnings$f(-5:5) # 1.366025
set.seed(1)
suppressing_warnings$g(-5:5) # 1.366025
Cool!
Note that you can use those (or any tagged call) in pipe chains, magrittr will only insert the dot inside the last bracket pair, which is another advantage compared to standard adverbs.
I've been using some tags in my projects now and nothing broke in the last 10 days or so, but I only use a small subset of them . I hope to implement tag composition soon, https://github.com/moodymudskipper/tag/issues/22.
It failed to spark much interest so far unfortunately, so the testing process might take a while before having something to submit to CRAN.
Too bad about the lack of interest. Hopefully it will pick up steam!
Since you're waiting to submit to CRAN, one minor suggestion is to add tag
as a Remotes:
dependency in tags
(https://remotes.r-lib.org/articles/dependencies.html), allowing it to be automatically installed when a user does devtools::install_github( "moodymudskipper/tags" )
.
Thanks @ArtemSokolov that's a neat idea. I'll get to it asap https://github.com/moodymudskipper/tags/issues/30 .
Thank you so much for taking the time to put this together. I think this is a great resource, and I’m very excited to see where it will go. Below I have some specific comments that I put together while going through your write-up.
I completely agree with you that adverbs are underutilized, but I think the main reason is the lack of formalism. Furthermore, there is no package centralizing adverb usage (the way, for example,
rlang
centralizes non-standard evaluation). It's all scattered around various packages, some of which you noted in your README. I think you have the beginnings of something that can become a central adverb resource by 1) importing relevant adverbs from other packages and 2) defining your own set of adverbs, as you've done here.I would bet that most developers who become comfortable with functions being first class objects probably use adverbs in some capacity without realizing it. You can think about starting your README with a representative example showing a code pattern that developers probably already use (e.g.,
purrr::safely
) and how it can be accomplished more cleanly using your tag system.Naming conventions
I agree with Nathan’s comment that names beginning with
.
tend to have an “internal variable” connotation, but none of the proposed alternatives seem very aesthetic to me. One idea is to write adverb tags as… well, adverbs. For example, instead of..bang$transform(head(cars,2), !!w := !!v / !!sym(u), !!!x)
we can write
tidily$transform(head(cars,2), !!w := !!v / !!sym(u), !!!x)
Similarly, instead of
..lbd$mutate(head(iris,2), Petal.Width = ~1000*(.), Species = ~toupper(.))
we could do
succinctly$mutate(head(iris,2), Petal.Width = ~1000*(.), Species = ~toupper(.))
I think it reads quite naturally and has the advantage of being in line with adverbs already in the wild (e.g.,
purrr::safely
). You also have the option of getting creative (grouply
in place of..grp
?) to avoid name collisions.Using
.
inside lambda functionsThere is a potential conflict in resolving
.
inside your..lbd
. Whenmutate
is used in a pipe,.
is occasionally used to reference the data frame itself. For example, consideriris %>% mutate( Petal.Width = nrow(.)*Petal.Width )
Following your
..lbd
example, we would haveiris %>% ..lbd$mutate( Petal.Width = ~nrow(.)*(.) )
which creates conflicting interpretations of
.
.Extending the
..w
tagI think that you can extend the
..w
tag to be a fully fledged adverb by chaining it with its argument. For example,with( iris, t.test( Sepal.Length, Sepal.Width ) )
might look like
..w$iris$t.test( Sepal.Length, Sepal.Width )
or, if you decide to change the names to adverbs,
using$iris$t.test( Sepal.Length, Sepal.Width)
Minor comments
..at
usage feels a bit messy. I don’t have a good idea of how to make it better (yet), but relative to other adverbs, it doesn’t seem to flow as well.Assignment in place via
..ip
is somewhat counter intuitive because it breaks copy-on-modify semantics. I’m worried that it might encourage developers to write code that doesn’t play nicely with the functional paradigm. Additionally, this adverb will need extensive testing to ensure it behaves correctly in nested functions, environments, pipes, etc.Consider creating a different example for
..fn
, as it is not clear why..fn$sapply(iris, ~class(.))
would be preferred oversummarize_all( iris, class )
...fs
is the best thing since sliced bread, as it combines the functionality ofpurrr::partial
andpurrr::compose
. It’s all I’ve ever needed in life. ;)In the section discussing adding new adverbs, it’s unclear what you mean by “composition” when referring to the “adverb” class. Consider providing an example.
I think that chaining of adverbs
..View$..grp$summarize_all(...)
looks cleaner than nested calling..View(..grp$summarize_all)(...)
, since one of the goals of using adverb tags in the first place is to deal with a “parenthesis overload”.