TidierOrg / Tidier.jl

Meta-package for data analysis in Julia, modeled after the R tidyverse.
MIT License

Limitations of interpolation in local scope #91

Closed jfb-h closed 1 year ago

jfb-h commented 1 year ago

Interpolation with !! currently only works with variables defined in the global environment. This is at odds with recommended Julia coding style, which generally favors putting code into functions. E.g., a function like this fails with an UndefVarError for regex:

function cleandata(df, regex)
  @chain df begin
    @select(across(!!regex, as_integer))  # @chain already passes df as the first argument
    # ... further operations
  end
end

I know this is not typical R scripting style, but having interpolation also work in local scope would probably ease the transition to a more Julian style.

kdpsingh commented 1 year ago

Thanks for this comment. I agree that having !! interpolation look only at the global scope isn't desirable. Ideally, it would infer the scope from where the user calls it, which might be the global scope but could be a different (e.g., function) scope.

This limitation exists not because I want to mimic R (the R implementation can similarly infer scope) but because it simplifies the implementation by several orders of magnitude. The functions I use to parse the expression may need to modify the interpolated value before inserting it into the expression (because Tidier.jl uses bare unquoted names, not symbols, to refer to columns), and as far as I can tell, doing this for local variables would require some nested quoted-expression interpolation, which is quite messy and hard to get right. I initially tried to implement it this way and decided it's too hard to tackle (for now).
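
To illustrate the scoping behavior described above, here is a minimal sketch in plain Julia. It is not Tidier.jl's actual implementation: the macro and function names are hypothetical, and it only assumes that a macro needing the interpolated value while it rewrites the expression has to look that value up during macro expansion, when function arguments do not exist yet.

# Hypothetical macros for illustration only; not part of Tidier.jl.

# Looks the name up while the macro is being expanded. At that point only
# module-level (global) bindings exist, so a function argument is invisible.
macro lookup_at_expansion(name)
    return Core.eval(__module__, name)
end

# Defers the lookup: the escaped name is resolved when the code runs, so it
# sees local variables. The macro itself never learns the value, though, and
# therefore cannot transform it before building the final expression.
macro lookup_at_runtime(name)
    return esc(name)
end

pattern = "global"

function from_expansion(pattern)
    return @lookup_at_expansion pattern
end

function from_runtime(pattern)
    return @lookup_at_runtime pattern
end

println(from_expansion("local"))  # the global binding wins
println(from_runtime("local"))    # the function argument wins

If the global `pattern` were not defined, the expansion-time lookup would fail with an UndefVarError when `from_expansion` is defined, which matches the error reported in this issue.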

The docs are transparent about this being a limitation: https://tidierorg.github.io/Tidier.jl/dev/examples/generated/UserGuide/interpolation/

I'm going to close this for now given the complexity. However, if I figure out a way to do it, I'll revisit this topic.