Closed renkun-ken closed 10 years ago
A draft syntax is like
x %>>% (~ symbol)
x %>>% (~ f(.) ~ symbol)
x %>>% (~ x ~ f(x) ~ symbol)
It is best described as: start with ~ for a side effect and end with a symbol for assignment.
An example is
mtcars %>>%
  subset(mpg <= mean(mpg)) %>>%
  (~ smtcars) %>>%
  (~ dim(.) ~ dim_mtcars) %>>%
  subset(select = c(mpg, wt, qsec)) %>>%
  lm(formula = mpg ~ .) %>>%
  summary %>>%
  (~ summ) %>>%
  (coefficients)
              Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 17.0183914  5.1411954  3.310201 0.0047583237
wt          -2.9781345  0.6044032 -4.927397 0.0001823504
qsec         0.6033051  0.3053237  1.975953 0.0668509086
Inspect the environment after evaluating the code above.
> ls.str()
dim_mtcars : int [1:2] 18 11
smtcars : 'data.frame': 18 obs. of 11 variables:
$ mpg : num 18.7 18.1 14.3 19.2 17.8 16.4 17.3 15.2 10.4 10.4 ...
$ cyl : num 8 6 8 6 6 8 8 8 8 8 ...
$ disp: num 360 225 360 168 168 ...
$ hp : num 175 105 245 123 123 180 180 180 205 215 ...
$ drat: num 3.15 2.76 3.21 3.92 3.92 3.07 3.07 3.07 2.93 3 ...
$ wt : num 3.44 3.46 3.57 3.44 3.44 ...
$ qsec: num 17 20.2 15.8 18.3 18.9 ...
$ vs : num 0 1 0 1 1 0 0 0 0 0 ...
$ am : num 0 0 0 0 0 0 0 0 0 0 ...
$ gear: num 3 3 3 4 4 3 3 3 3 3 ...
$ carb: num 2 1 4 4 4 3 3 3 4 4 ...
summ : List of 11
$ call : language lm(formula = mpg ~ ., data = .)
$ terms :Classes 'terms', 'formula' length 3 mpg ~ wt + qsec
$ residuals : Named num [1:18] 1.658 -0.813 -1.643 1.386 -0.376 ...
$ coefficients : num [1:3, 1:4] 17.018 -2.978 0.603 5.141 0.604 ...
$ aliased : Named logi [1:3] FALSE FALSE FALSE
$ sigma : num 1.79
$ df : int [1:3] 3 15 3
$ r.squared : num 0.623
$ adj.r.squared: num 0.573
$ fstatistic : Named num [1:3] 12.4 2 15
$ cov.unscaled : num [1:3, 1:3] 8.217 -0.178 -0.437 -0.178 0.114 ...
Across all forms of the (~ ...) syntax, the ~ operator can be viewed as a branching operator: it indicates that the following expression is a side effect. It can either branch the left-hand side value to an expression (side-effect evaluation) or branch it to a symbol (assignment). After all, there's no point in evaluating a symbol for side effect (it has none). Therefore this syntax does not seem to create additional confusion, nor does it come at the expense of anything that is allowed without this feature.
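The branching rule above can be sketched in base R. This is only an illustration of the idea, not pipeR's implementation: the function name branch and its arguments are hypothetical, and only the two one-sided forms (~ symbol and ~ expression) are handled.

```r
# Hypothetical sketch of the branching rule; not pipeR's actual code.
# `spec` is a quoted one-sided formula such as quote(~ symbol) or quote(~ f(.)).
branch <- function(x, spec, envir = parent.frame()) {
  force(envir)
  target <- spec[[2L]]  # whatever follows the leading `~`
  if (is.symbol(target)) {
    # `~ symbol`: branch the value to a symbol, i.e. assign it
    assign(as.character(target), x, envir = envir)
  } else {
    # `~ expr`: evaluate the expression for its side effect, with `.` bound to x
    eval(target, list(. = x), envir)
  }
  x  # either way, the piped value continues unchanged
}

out1 <- branch(1:10, quote(~ kept))                             # assignment
out2 <- branch(1:10, quote(~ cat("length:", length(.), "\n")))  # side effect
```

In both calls the return value is the original input, which is what lets the step sit in the middle of a pipeline.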
Consider the = syntax suggested by @yanlinlin82.
See https://github.com/renkun-ken/pipeR/issues/38.
The following code adopts the = syntax.
mtcars %>>%
  subset(mpg <= mean(mpg)) %>>%
  (~ smtcars) %>>%                      # side-effect assign
  (~ dim_mtcars = dim(.)) %>>%          # side-effect assign
  subset(select = c(mpg, wt, qsec)) %>>%
  lm(formula = mpg ~ .) %>>%
  (sum_lm = summary(.)) %>>%            # eval and assign
  (coefficients)
Definitely prefer this. I think this is much clearer, more intuitive, and more readable.
Think so too. Thanks @yanlinlin82 for the great suggestion. I'll implement it on the feature/assign branch soon and see how it works.
The latest commit on feature/assign uses a symbolic call to perform the assignment, which allows the following usage:
> z <- list()
> 1:10 %>>% (~ z$a = length(.)) %>>% mean
[1] 5.5
> z
$a
[1] 10
That is, the assignment no longer calls assign() but builds a symbolic call to perform the assignment. This does not require the expression on the left-hand side of = to be a symbol, so usage like names(a) = ... is allowed.
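The difference between the two approaches can be sketched in base R. The helper assign_symbolic below is hypothetical and is not taken from the pipeR source; it only shows why building a call handles targets such as z$a or names(a), while assign() does not.

```r
# Hypothetical helper (not pipeR's actual code): assign `value` to the target
# expression `lhs` by building and evaluating the call `lhs <- value`.
assign_symbolic <- function(lhs, value, envir = parent.frame()) {
  force(envir)
  # Wrap the value in quote() so that a value which is itself a language
  # object is stored as-is rather than evaluated a second time.
  assign_call <- as.call(list(quote(`<-`), lhs, call("quote", value)))
  eval(assign_call, envir)
  invisible(value)
}

z <- list()
a <- 1:3

assign_symbolic(quote(z$a), length(1:10))           # target is a list element
assign_symbolic(quote(names(a)), c("x", "y", "z"))  # target is a replacement call

# By contrast, assign() treats its first argument as a literal variable name,
# so this creates a variable called `z$b` and leaves the list z untouched.
assign("z$b", 10L)
```

After running this, z$a holds the length and a carries the new names, whereas z has no element b.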
That is more powerful!
In v0.5, <- and -> will no longer be interpreted as lambda expressions and can be used to perform assignment in a pipeline, which makes the code even more readable in some cases.
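For readers wondering how an operator could tell an assignment apart from other expressions, here is a small base R sketch. The predicate is_assignment is hypothetical and is not how pipeR is known to do it; the sketch relies only on the fact that the R parser turns a -> b into b <- a.

```r
# Hypothetical predicate (not part of pipeR): is `expr` an assignment call?
is_assignment <- function(expr) {
  is.call(expr) &&
    (identical(expr[[1L]], quote(`<-`)) || identical(expr[[1L]], quote(`=`)))
}

left  <- quote(summ <- summary(.))
right <- quote(summary(.) -> summ)         # the parser rewrites this to `<-`
equal <- quote((summ = summary(.)))[[2L]]  # strip the enclosing parentheses

is_assignment(left)
is_assignment(right)
is_assignment(equal)
is_assignment(quote(summary(.)))  # an ordinary call, not an assignment
identical(left, right)            # both forms parse to the same call
```

Since -> is already normalised by the parser, a check like this only needs to look for <- and =.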
It's a common demand that an intermediate result be assigned to a symbol in the current environment (often the global environment) for further use. This is clearly one type of side effect: the current environment is changed.
Currently, there's no easy syntax that supports the assignment operation other than calling assign() manually. Such code works, but it is only easy for the global environment or some named environment. For a local environment, it does not work with parent.frame().
Consider a syntax derived from the side-effect syntax that performs the assignment operation.
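The manual assign() approach mentioned above is not shown in the thread, so the following base R sketch is an assumption about what it might look like. The helper tee_assign and the function summarise_range are hypothetical names; the point is only that a local target environment has to be captured and passed explicitly.

```r
# Hypothetical helper: store `x` under `name` in `envir`, then pass `x` on.
tee_assign <- function(x, name, envir) {
  assign(name, x, envir = envir)
  x
}

# Global (or any named) environment: straightforward.
total <- sum(tee_assign(1:10, "v_global", globalenv()))

# Local environment: the frame has to be captured explicitly first.
summarise_range <- function(v) {
  here <- environment()  # the local frame of this function call
  centre <- mean(tee_assign(range(v), "rng", here))
  list(centre = centre, rng = rng)  # `rng` now exists in the local frame
}

res <- summarise_range(1:10)
```

Having to thread an environment through every step like this is the kind of boilerplate a dedicated assignment syntax would remove.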