renkun-ken / pipeR

Multi-Paradigm Pipeline Implementation

Naming #12

Closed smbache closed 10 years ago

smbache commented 10 years ago

If you indeed think there is a need for different pipe implementations, I think you should choose names not conflicting with magrittr. There is already quite some usage of %>% around, and such a naming conflict is a potential source of confusion and irritation. If the two should co-exist I think they should be easily differentiated...

renkun-ken commented 10 years ago

Thanks for your advice! It's sad that the two packages are currently not compatible with each other. I thought about this when I created this package. My reasoning was that it would be more useful to let users (for example, dplyr users) adopt this package at minimal cost, without having to change the operators they already use: dplyr imports magrittr's %>% operator, and pipeR's %>% operator is fully compatible with dplyr for first-argument piping (which is mostly the case). If users want to pipe to other positions, they need to use %>>% instead.

The principle of this package is quite simple: the operators do not guess which mechanism the user wants; the user needs to be aware of it. Each operator should do as little as possible to avoid potential cognitive problems. I find that if one operator does more than one thing (implementing two piping mechanisms), it can result in readability issues and very subtle bugs.

renkun-ken commented 10 years ago

I do think there is a need for different pipe implementations, and that is exactly why I created this package. The reason is that these are not only different implementations but genuinely different piping mechanisms, as I wrote in the README. I identify three mechanisms: first-argument piping (mostly the case), free piping using a ., and lambda piping (for those who want to control the name of the piping variable). These are in essence very different from each other, which leads to the following questions:

  1. Do we need them all?
  2. Do we combine them all in one operator?
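To make the three mechanisms concrete, here is a minimal sketch of what each one means when written out in plain base R, without any pipe operator. The example values and variable names are assumptions chosen for illustration; they are not taken from either package.

```r
x <- c(4, 9, 16)

# 1. First-argument piping: "x, then head(2)" means head(x, 2).
first_arg <- head(x, 2)

# 2. Free piping with a dot: evaluate an expression with `.` bound to x.
free_dot <- local({
  . <- x
  c(., . + 1)
})

# 3. Lambda piping: the user chooses the variable name (here `v`).
lambda <- (function(v) sqrt(v) + 1)(x)
```

Each form reaches the piped value by a different route: an argument position, a fixed symbol, or a name picked by the user.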

I reviewed the code of magrittr, and it does a good job of combining them. However, as I use piping heavily, I find that combining the functionality may not be a very good idea, not because of technical issues but because of cognitive problems. magrittr tries to guess which mechanism I would like to use by reading the piping code I write. From a user's perspective, this leads to some inconsistencies. For example, I expect the following code to yield c(1,2,3,2,3,4):

c(1,2,3) %>%
  c(., .+1)

but it does not work in magrittr, because magrittr does not pipe the numbers to sub-level . symbols. In other words, it is not only that magrittr needs to guess what I'm doing, but also that I need to carefully guess what it is doing.

In pipeR, each operator does the simplest work, so users do not have to guess. If you really want free piping with ., you just need %>>%, and you know exactly what the operator does: it evaluates the right-hand expression with . bound to the left-hand value.

c(1,2,3) %>>%
  c(.,.+1)

which yields exactly what the user expects.
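The rule "evaluate the right-hand expression with . bound to the left-hand value" can be sketched in a few lines of base R. The operator name `%dot>%` below is hypothetical and this is not pipeR's actual source; it only illustrates the mechanism.

```r
# Hypothetical operator: evaluate the right-hand side with `.` set to the left-hand value.
`%dot>%` <- function(lhs, rhs) {
  expr <- substitute(rhs)                  # keep the right-hand side unevaluated
  env <- new.env(parent = parent.frame())  # child of the caller's environment
  assign(".", lhs, envir = env)            # bind the piped value to `.`
  eval(expr, env)
}

result <- c(1, 2, 3) %dot>% c(., . + 1)
print(result)  # [1] 1 2 3 2 3 4
```

Because `.` is an ordinary variable in the evaluation environment, it is found at any nesting depth, so no special handling of sub-level dots is required.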

smbache commented 10 years ago

Yeah, I get your intentions and respect the viewpoint, although I don't share it (at least now; in fact magrittr started out along the lines of pipeR with more than one operator, but has evolved and changed due to demand and suggestions).

In particular, because dplyr imports magrittr's version of %>%, I think it is confusing to design an alternative with the same name, also for the sake of pipeR users: you want to be able to see right off the bat which pipe is being used. dplyr users are accustomed to magrittr's, and should know when that is not the one being used. At least that is what I think.

The question of whether or not to nest "." has come up a few times; in the current "dev" branch of magrittr it is possible, but nested dots are treated a little differently than non-nested dots. Your example will work in the dev version.
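The difference between a nested and a non-nested dot can be seen by inspecting the unevaluated right-hand expression. The base R sketch below uses hypothetical helper names and is not magrittr's implementation; it only shows how the two cases can be told apart.

```r
# TRUE if `.` is itself one of the arguments of the outermost call.
dot_is_direct_arg <- function(expr) {
  args <- as.list(expr)[-1]
  is.call(expr) && any(vapply(args, function(a) identical(a, quote(.)), logical(1)))
}

# TRUE if `.` appears anywhere in the expression, at any depth.
dot_appears <- function(expr) "." %in% all.names(expr)

e1 <- quote(c(., . + 1))  # dot used both directly and nested
e2 <- quote(c(. + 1))     # dot used only inside a nested call
e3 <- quote(head(2))      # no dot at all

direct <- c(dot_is_direct_arg(e1), dot_is_direct_arg(e2), dot_is_direct_arg(e3))
anywhere <- c(dot_appears(e1), dot_appears(e2), dot_appears(e3))
```

Here `direct` is TRUE, FALSE, FALSE and `anywhere` is TRUE, TRUE, FALSE, so `e2` is the case where an operator has to decide how a dot that is only nested should be handled.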

I don't think "guessing" is the right word: magrittr follows strict semantics, and knowing them is not much different from knowing that the generic summary will call e.g. summary.lm or summary.data.frame, or that aggregate works differently for data.frames and formulas. It is very R-like to abstract things away and have fewer functions/operators for the user to know. But again, it is subjective which is preferred. But if you made a package that fundamentally changed how + generally works, that would be a stretch, which is why I also think the name clash in pipeR is unfortunate.
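The dispatch behaviour referred to here can be checked directly in base R. The snippet uses the built-in `cars` dataset, chosen only for illustration.

```r
fit <- lm(dist ~ speed, data = cars)

# The same generic, summary(), runs a different method depending on the class of its input.
class_for_lm <- class(summary(fit))   # "summary.lm"
class_for_df <- class(summary(cars))  # "table"
```

One name covers several behaviours, and the class of the input determines which one runs.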

renkun-ken commented 10 years ago

The operator both packages use is probably the most natural choice. It is fine for this package to simply be a substitute; users will choose for themselves according to their needs, just like choosing between RJSONIO and jsonlite, both of which implement fromJSON and toJSON.
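When two definitions share an operator name, the infix form resolves to whichever definition is found first, while a specific package's version has to be called in prefix form with `::`. The sketch below uses base R's `%in%` and a deliberately useless user-defined replacement (hypothetical, for illustration only) so that it runs without extra packages.

```r
# A user-defined operator that shares its name with base R's %in%.
`%in%` <- function(x, table) "user-defined version"

masked    <- 2 %in% 1:3            # infix form picks up the user-defined operator
qualified <- base::`%in%`(2, 1:3)  # prefix form with :: selects base R's operator

rm("%in%")  # remove the masking definition again
```

The same prefix form, such as ``magrittr::`%>%`(lhs, rhs)``, is the only way to name one package's pipe explicitly, which is less convenient than choosing between `RJSONIO::fromJSON` and `jsonlite::fromJSON`.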

smbache commented 10 years ago

Well, %|%, %=>%, %|>% are all equally natural, if not more so, since they appear in other places. In fact only magrittr (as far as I know) has this notation or anything that resembles it.

But now you know my view.

renkun-ken commented 10 years ago

Thanks anyway. How about putting this issue on hold and reopening it when there is really a problem with this?

smbache commented 10 years ago

That's really up to you :-) your package is still "young" and can be changed without too much hassle; it will be harder later on. magrittr notation isn't likely to go anywhere, it's too settled in. But I do think both packages have better chances of co-existence if they appear distinct, because they are. With identical names it's more likely that users will decide and stick, leaving one out.

renkun-ken commented 10 years ago

Thanks for your advice; it really is a bit of a struggle to change an operator. I agree with @smbache and @rdinnager that avoiding potential competition is good for the community and a better allocation of resources. Therefore, I have come up with a solution, since this "young" package still has the chance to change.

What about changing %>% to %>>% and %>>% to %:>%, so that the symbols are handy to type and easy to recognize? The operators would then become

First-argument piping: %>>%
Free piping with dot: %:>%
Lambda piping with formula: %|>%

What do you think?

rdinnager commented 10 years ago

Works for me. It doesn't conflict with anything I am aware of. Cheers :ok_hand:

smbache commented 10 years ago

That would be way better. Maybe you could consider changing only one of them, i.e. use %:>% for FA (first-argument) piping. You could also consider overriding %>% temporarily, doing something like

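`%>%` <- function(lhs, rhs)
{
    warning("Due to naming conflicts, the operator has changed name. Redirecting to magrittr::`%>%`")
    # Rebuild the call from the unevaluated operands and evaluate it in the
    # caller's frame, so magrittr sees the original expressions rather than
    # the symbols `lhs` and `rhs`.
    eval.parent(substitute(magrittr::`%>%`(lhs, rhs)))
}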

or whatever makes it work. This way you can slowly get rid of %>% and let the users know what is going on. Then in a few versions, remove it altogether.
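The transition pattern can be tried out without either package. In the sketch below both operator names, `%old>%` and `%new>%`, are hypothetical stand-ins, and `%new>%` is a deliberately minimal first-argument pipe written for this example, not pipeR's or magrittr's code.

```r
# Hypothetical new name: minimal first-argument piping in base R.
`%new>%` <- function(lhs, rhs) {
  call <- substitute(rhs)
  # Insert `.` as the first argument: f(a) becomes f(., a).
  call <- as.call(c(list(call[[1]], quote(.)), as.list(call)[-1]))
  eval(call, list(. = lhs), parent.frame())
}

# Hypothetical old name, kept for a transition period: warn, then forward
# the original unevaluated operands to the new operator.
`%old>%` <- function(lhs, rhs) {
  warning("`%old>%` has been renamed to `%new>%`.", call. = FALSE)
  eval.parent(substitute(lhs %new>% rhs))
}

# The old operator still returns the right result, but signals a warning.
res <- withCallingHandlers(
  c(3, 1, 2) %old>% sort(decreasing = TRUE),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Forwarding with `eval.parent(substitute(...))` matters because the new operator inspects the unevaluated right-hand side; passing `rhs` along directly would hand it the symbol `rhs` instead of the user's expression.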