TidierOrg / Tidier.jl

Meta-package for data analysis in Julia, modeled after the R tidyverse.
MIT License
515 stars 14 forks source link

Refactor pivot code to better support options #46

Closed rdboyes closed 1 year ago

rdboyes commented 1 year ago

This refactors the pivot code to use the original intended design - creates a dictionary and passes it to stack or unstack. This should allow support for additional options to be added much more easily

kdpsingh commented 1 year ago

Nice work on this. Using eval() inside of macros is a bad "code smell" and should be reserved for situations where it's warranted. I can help cleaning that up but want to wait until we address that before merging.

kdpsingh commented 1 year ago

Can you speak to what you're trying to accomplish with the eval()?

I won't have a chance to run this on a computer this until the weekend. Not urgent, but may be able to help without opening.

rdboyes commented 1 year ago

Basically just trying to strip off the QuoteNode() wrapper - the Dict needs to be Symbol => Symbol to work properly. Without the eval, the code will pass e.g. :variable_name => :(:letter) instead of :variable_name => :letter to stack(). If there's a better way do accomplish that I can change it!

kdpsingh commented 1 year ago

There's definitely a way to do this without the eval. Hard to explain but I will take a look this weekend.

rdboyes commented 1 year ago

Seems like for this purpose, eval() can just be swapped out for .value - fix is in the latest commit!

kdpsingh commented 1 year ago

That may work. Will look!

Generally if you have a symbol and you want to evaluate it, you have to wrap the whole expression inside another expression (using quote or :()) and then use $ to interpolate the inner expression.

It kind of feels like Inception, where you have to go a layer deeper to interpolate something before you can go back up a level.

kdpsingh commented 1 year ago

Nice work! I noticed that this change also makes it so that @pivot_longer and @pivot_wider both support string values for names_to, values_to, etc.

I had updated the docstrings and the documentation accordingly, and added this to NEWS.md.

In a future update, we should think about supporting multiple column selections using slurping/splatting.

For example, this is currently supported:

@pivot_longer(df_wide, A:B, names_to = "letter", values_to = "number")

But we should also aim to support this (arbitrary number columns):

@pivot_longer(df_wide, A, B, names_to = "letter", values_to = "number")