tidyverse / magrittr

Improve the readability of R code with the pipe
https://magrittr.tidyverse.org
Other
957 stars 157 forks source link

Pipe evaluates its arguments in wrong environment #194

Closed klmr closed 4 years ago

klmr commented 5 years ago

This is almost a feature rather than a bug: Using %>% without importing the whole magrittr package (e.g. via any of the packages that import it, such as dplyr) pulls other functions from magrittr into scope. MWE:

`%>%` = magrittr::`%>%`
`%<>%` = function (x, y) {
    message("%<>% was called")
    y
}

x = 1
x %<>% `+`(2)
# %<>% was called
# [1] 2

x %<>% `+`(2) %>% `*`(2); x
# ACTUAL:
# 6
# EXPECTED:
# %<>% was called
# 4

… in other words: inside a pipeline, magrittr’s `%<>%` is called rather than our own. This mostly leads to confusion for users of purrr and dplyr, who observe that using %<>% sometimes works and sometimes breaks, e.g. https://stackoverflow.com/q/55381321/1968.

Note that this behaviour doesn’t seem to extend to other magrittr functions (e.g. using x %>% subtract(5) will not call magrittr::subtract).

(This might also be related to #159 but it’s not obvious how. Possibly also related to #57?)

moodymudskipper commented 5 years ago

It's not a matter of environments, it's because x %<>% `+`(2) %>% `*`(2); x is (x %<>% `+`(2)) %>% `*`(2); x and the expressions are parsed, looking for acceptable pipe symbols.

The code actually looks for a function named %<>% , the names of the pipes are important, not their code. Indeed all pipes have the same code, and most of the time it's not executed.

debugonce(`%<>%`)
x %<>% `+`(2) %>% `*`(2) # not debugging
x %<>% `+`(2) # debugging

It's the same reason why you can't copy a pipe :

`%x>%` <- `%>%`
cars %x>% head()

Other pipes are just the same :

iris %$% Species # error
iris %$% Species %>% head # works
iris %>% head %$% Species # error

I have no idea why it was coded like that.

smbache commented 5 years ago

This was when we went from evaluating the pipe every step, rather than “compiling” the whole chain at once.

moodymudskipper commented 5 years ago

Is it about efficiency only ?

smbache commented 5 years ago

Hmm probably not. I’ve proposed other/further changes since which didn’t make it in (yet at least). I like the approach here https://github.com/tidyverse/magrittr/compare/the-white-page but it has the same “issue”, if you see it as such.

moodymudskipper commented 5 years ago

Say a user attaches dplyr and not magrittr, uses %$% in one operation (maybe thinking he's attached magrittr, or that dplyr exports %$%) and it works, then uses it in the next operation and it doesn't, He then proceeds to explore %$% but it doesn't exist. It won't be obvious to them that they would have to attach magrittr. So I think it's a legit issue, even though the solution, attaching magrittr, is trivial once you get it.

A dplyr user in principle also shouldn't have to know that %$% is a thing, so he might want to define it for himself, and then if he uses it in a pipe chain it won't behave as expected. That will be hard to debug.

Maybe is_pipe could test not only for the symbol validity, but if the object exists if it's a legit pipe ? In my fork pipes I give a class to pipes so is_pipe checks for class instead of symbol, maybe this approach would help.

smbache commented 5 years ago

For sure it’s a valid concern/issue. I won’t rule out that magrittr could/should perform such checks. On the other hand, how other packages decide to use and where the responsibility lies is not clear cut. dplyr makes no promises regarding other pipes; that special cases end up working anyway is of course confusing, and this should definitely be taken into account in the next iteration.

moodymudskipper commented 5 years ago

magrittr works as it is, then dplyr (and friends) chooses to export only its function %>% and some potential issues arise, so I get your point that it's not magrittr's responsability.

It could even be moved to usethis as usethis::use_pipe is what's creating these issues. It's easier to fix it here than there though.

That being said the issue hasn't been raised here before nor ever on SO as far as I know.

smbache commented 5 years ago

I think it is worth having the issue raised here in any case. Since @hadley is involved in most related pkgs, he might have preferences. I’m not against checking this explicitly In magrittr as such.

hadley commented 4 years ago

I doubt it's worth the cost of fixing this issue.