Open Lincoln-Hannah opened 2 years ago
I'm coming around to this being a good idea. It would certainly cut down on lots of typing, and it definitely seems to be true that 90% of commands are @rtransform.
Maybe @nalimilan can chime in and give their thoughts, because this would be a pretty non-standard syntax transformation.
How would it combine with grouping/ungrouping data frames in the process? (I think it would be OK, but I want to make sure.)
Technically, would it be possible to support passing macro calls like @rsubset or @orderby inside @rtransform df begin ... end, so that we don't need a new macro like @chainWithrTransform?
That would make https://github.com/JuliaData/DataFramesMeta.jl/pull/376/ a bit less ad-hoc.
@bkamins I think it would only apply to row-wise operations, so grouping would in general be ignored.
@nalimilan I'm not sure how that would work, are you saying something like
@chain_r_transform df begin
:y = :x * 2
@rsubset ...
@orderby ...
end
or something else?
I also disagree that #376 (@when) is that ad-hoc. It honestly seems like one of the simpler ways to provide functionality that imitates Stata's if.
@bkamins Maybe a better way to think of it:
Lines of the form :ColumnName = ... are changed to @rtransform :ColumnName = ..., and the result is an ordinary @chain block.
So this
@chainWithrTransform DataFrame( A = 1:10 ) begin
:B = mod( :A, 3 )
:E = :B * 2
@rsubset :B == 1
:F = :B * 2
:G = :F + 1
@orderby :A
:H = :G * 2
end
becomes
@chain DataFrame( A = 1:10 ) begin
@rtransform :B = mod( :A, 3 )
@rtransform :E = :B * 2
@rsubset :B == 1
@rtransform :F = :B * 2
@rtransform :G = :F + 1
@orderby :A
@rtransform :H = :G * 2
end
It's just a bit of syntactic sugar.
As @pdeffebach says, 90% of lines are @rtransform.
It allows you to not have to keep writing it.
So the first assignment operation present in the block drops grouping.
Lines of the form :columnName = ... are changed to @rtransform :columnName = ..., but not if they are within a sub-block (e.g. @by or @transform).
@chainWithrTransform begin
DataFrame(A=1:4)
:B = mod(:A,2)
@by :B begin
:A = mean(:A)
end
:C = :B / :A
@transform :sumA = sum(:A)
end
becomes
@chain begin
DataFrame(A=1:4)
@rtransform :B = mod(:A,2)
@by :B begin
:A = mean(:A) #unchanged since it is within the @by block
end
@rtransform :C = :B / :A
@transform :sumA = sum(:A) #unchanged since it is within the @transform block
end
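For what it's worth, the rewriting rule above can be sketched as a plain function on Julia expressions. This is only an illustration: add_rtransform and is_column_assignment are hypothetical names, not part of DataFramesMeta.jl, and the sketch assumes that a "column assignment" is any top-level line of the form :name = .... It only builds the rewritten expression and does not run it, so no packages are needed.

```julia
# Hypothetical helper (not part of DataFramesMeta.jl): true for a line of
# the form `:name = ...`, i.e. an assignment whose left side is a quoted symbol.
is_column_assignment(ex) =
    ex isa Expr && ex.head === :(=) && ex.args[1] isa QuoteNode

# Hypothetical helper: prefix every top-level column assignment in a
# `begin ... end` block with `@rtransform`. Only the direct children of the
# block are inspected, so lines nested inside another macro call
# (e.g. `@by ... begin ... end`) are left alone.
function add_rtransform(block::Expr)
    block.head === :block || error("expected a begin ... end block")
    new_args = map(block.args) do ex
        if is_column_assignment(ex)
            Expr(:macrocall, Symbol("@rtransform"), LineNumberNode(0), ex)
        else
            ex
        end
    end
    return Expr(:block, new_args...)
end

# The macro calls inside `quote` are not expanded, so this runs without
# DataFramesMeta loaded.
rewritten = add_rtransform(quote
    :B = mod(:A, 2)
    @by :B begin
        :A = mean(:A)
    end
    :C = :B / :A
    @transform :sumA = sum(:A)
end)

println(rewritten)
```

A real macro would presumably hand the rewritten block on to @chain; that step is left out here because it depends on how the two packages are meant to interact.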
So the first assignment operation present in the block drops grouping.
Yeah. It would drop grouping.
Are you guys still interested in implementing this? Just realised again how many times I write @rtransform
A macro similar to @chain that treats any line that isn't another macro call as being within an @rtransform @astable block. So what would currently be written as:
Could be written as: