TidierOrg / TidierData.jl

Tidier data transformations in Julia, modeled after the dplyr/tidyr R packages.
MIT License

Support for multi-line `begin` and `end` blocks inside macros #88

Closed: KadeG closed this issue 8 months ago

KadeG commented 8 months ago

Currently, to apply a series of mutations with @mutate, for example, you might write:

df2 = @chain df1 begin
   @mutate(
      col1 = 0,
      col2 = 0,
      col3 = 0)
end

This is generally clear and nice for wrangling data with many columns. In DataFramesMeta it could look like the above (plus : for column names), or:

df2 = @chain df1 begin
   @rtransform begin
      :col1 = 0
      :col2 = 0
      :col3 = 0
   end
end

Not having to manage commas and parentheses is especially nice when nesting calls or writing other visually messy expressions. It also makes experimentation and debugging easier, since individual expressions can be commented out, and it is visually tidier. Currently in Tidier, this runs:

df2 = @chain df1 begin
   @mutate begin
      col1 = 0
   end
end

but the parser doesn't know what to make of this:

df2 = @chain df1 begin
   @mutate begin
      col1 = 0
      col2 = 0
      col3 = 0
   end
end

This seems like some nice syntactic sugar in line with the rest of Tidier.

kdpsingh commented 8 months ago

I love this idea. It should be doable as long as I can figure out how begin-end blocks look in the abstract syntax tree.
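For reference, here is a minimal sketch of how a begin-end block appears in Julia's abstract syntax tree, using only Base Julia. A macro that receives a `begin ... end` argument sees a single `Expr` with head `:block`, whose arguments are the enclosed expressions interleaved with `LineNumberNode` entries. The helper name `flatten_block` is hypothetical and is only an illustration of the idea; it is not taken from TidierData's source and may differ from how the package actually handles blocks.

```julia
# Hypothetical helper (an assumption for illustration, not TidierData's code).
# Given an expression, return the expressions it contains:
#   - for a `begin ... end` block, the enclosed expressions with the
#     parser-inserted LineNumberNode entries dropped;
#   - for anything else, a one-element vector holding the expression itself.
function flatten_block(ex)
    if ex isa Expr && ex.head == :block
        return filter(arg -> !(arg isa LineNumberNode), ex.args)
    end
    return Any[ex]
end

# `quote ... end` produces the same kind of `:block` expression that a macro
# receives when it is called with a `begin ... end` argument.
block = quote
    col1 = 0
    col2 = 0
    col3 = 0
end

println(block.head)          # the head of the expression is the symbol `block`

exprs = flatten_block(block)
for ex in exprs
    println(ex)              # each assignment, one per line
end
```

With the block unpacked this way, a macro could in principle treat `@mutate begin ... end` the same as `@mutate(col1 = 0, col2 = 0, col3 = 0)`, since both reduce to the same list of assignment expressions.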

When I started working on Tidier.jl a year ago, I remember seeing this functionality in DataFramesMeta and not fully grasping its purpose. I definitely get it now. Will put this on the list of things to work on.

kdpsingh commented 8 months ago

Hi @KadeG, this is basically implemented in the latest PR. Just waiting for tests to pass and then may add a new documentation page to highlight this new optional syntax before merging. The nice thing is that it is implemented for every macro that supports multiple expressions. Thanks again for the suggestion!

KadeG commented 8 months ago

Wonderful! I'll play with it and let you know if I notice anything strange. I'm happy to help with docs.

kdpsingh commented 8 months ago

Added some examples of this behavior to the @mutate and @summarize docstrings. This functionality works throughout the package and supports all macros that accept multiple expressions.

I ended up adding a more general page on how piping works.

Take a look: https://tidierorg.github.io/TidierData.jl/latest/examples/generated/UserGuide/piping/

I highlight the support for begin-end blocks towards the end of the text on that page.