TidierOrg / TidierData.jl

Tidier data transformations in Julia, modeled after the dplyr/tidyr R packages.

v0.16.1: unable to `@group_by` a bare column name #109

Closed: drizk1 closed this issue 2 months ago

drizk1 commented 2 months ago
julia> using TidierData

(@v1.9) pkg> status TidierData
Status `~/.julia/environments/v1.9/Project.toml`
  [fe2206b3] TidierData v0.16.1

julia> df = DataFrame(
                  a = ["a", "b","a", "b", "a","b" ,"a", "b"],
                  b = [0.3, 2, missing, 3, 6, 5, 7, 7],
                  c = [0.2, 0.2, 0.2, missing, 1, missing, 5, 6]);

julia> @group_by df a 
ERROR: UndefVarError: `a` not defined
Stacktrace:
 [1] top-level scope
   @ ~/.julia/packages/TidierData/ZkLm0/src/TidierData.jl:448

julia> @group_by df :a
GroupedDataFrame with 2 groups based on key: a
First Group (4 rows): a = "a"
 Row │ a       b          c        
     │ String  Float64?   Float64? 
─────┼─────────────────────────────
   1 │ a             0.3       0.2
   2 │ a       missing         0.2
   3 │ a             6.0       1.0
   4 │ a             7.0       5.0
⋮
Last Group (4 rows): a = "b"
 Row │ a       b         c         
     │ String  Float64?  Float64?  
─────┼─────────────────────────────
   1 │ b            2.0        0.2
   2 │ b            3.0  missing   
   3 │ b            5.0  missing   
   4 │ b            7.0        6.0

I think the issue comes from this part of `parse_group_by`: execution never reaches the final `return QuoteNode(tidy_expr)` because `tidy_expr` is always an `Expr`, so the function exits at the `elseif` branch. If I switch that branch to the first version below, it runs. With this fix, both an expression (e.g. `aa = b+1`) and a bare `a` work. The code as it is now does not work for either.

  # proposed change (works):
  elseif tidy_expr isa Expr
    return QuoteNode(tidy_expr)

  # current code (doesn't work):
  elseif tidy_expr isa Expr
    return tidy_expr
  else # if it's a Symbol
    return QuoteNode(tidy_expr)
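
To illustrate the difference between the two branches, here is a minimal sketch using only Base Julia. The macros `@unquoted` and `@quoted` and the name `some_undefined_column` are hypothetical and made up for this example; they are not TidierData's code, and this does not reproduce `parse_group_by` itself. It only shows that a macro argument returned without a `QuoteNode` is evaluated as ordinary code (so a bare name is looked up as a variable), while one wrapped in `QuoteNode` is passed through as a `Symbol` or `Expr`.

```julia
# Hypothetical helper macros for illustration only; they are not part of TidierData.

# Returns the argument unquoted, so Julia evaluates it as ordinary code.
macro unquoted(x)
    return esc(x)
end

# Wraps the argument in a QuoteNode, so it evaluates to the Symbol or Expr itself.
macro quoted(x)
    return QuoteNode(x)
end

# A bare name reaches a macro as a Symbol; `aa = b + 1` reaches it as an Expr.
bare_name = @quoted a
assignment = @quoted aa = b + 1

# Unquoted, the bare name is looked up as a variable, which fails if none exists.
lookup_failed = try
    @unquoted some_undefined_column
    false
catch err
    err isa UndefVarError
end

println(bare_name isa Symbol)
println(assignment isa Expr)
println(lookup_failed)
```

The failed lookup in the unquoted case is the same kind of error as the `UndefVarError` in the transcript above, although whether this is what actually happens inside `parse_group_by` is an assumption.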
drizk1 commented 2 months ago

Oddly, this is not an issue with v0.16.1 on the `slice_fix` branch. I'm not sure what was going on, so I am going to close this issue.

    Updating git-repo `https://github.com/TidierOrg/TidierData.jl`
   Resolving package versions...
    Updating `~/.julia/environments/v1.9/Project.toml`
  [fe2206b3] ~ TidierData v0.16.1 ⇒ v0.16.1 `https://github.com/TidierOrg/TidierData.jl#slice_fix`
    Updating `~/.julia/environments/v1.9/Manifest.toml`
  [fe2206b3] ~ TidierData v0.16.1 ⇒ v0.16.1 `https://github.com/TidierOrg/TidierData.jl#slice_fix`
Precompiling project...
  1 dependency successfully precompiled in 8 seconds. 680 already precompiled.

julia> using TidierData

julia> df = DataFrame(
                         a = ["a", "b","a", "b", "a","b" ,"a", "b"],
                         b = [0.3, 2, missing, 3, 6, 5, 7, 7],
                         c = [0.2, 0.2, 0.2, missing, 1, missing, 5, 6]);

julia> @group_by df a
GroupedDataFrame with 2 groups based on key: a
First Group (4 rows): a = "a"
 Row │ a       b          c        
     │ String  Float64?   Float64? 
─────┼─────────────────────────────
   1 │ a             0.3       0.2
   2 │ a       missing         0.2
   3 │ a             6.0       1.0
   4 │ a             7.0       5.0
⋮
Last Group (4 rows): a = "b"
 Row │ a       b         c         
     │ String  Float64?  Float64?  
─────┼─────────────────────────────
   1 │ b            2.0        0.2
   2 │ b            3.0  missing   
   3 │ b            5.0  missing   
   4 │ b            7.0        6.0

julia> @group_by df aa = b+1
GroupedDataFrame with 7 groups based on key: aa
First Group (1 row): aa = 1.3
 Row │ a       b         c         aa       
     │ String  Float64?  Float64?  Float64? 
─────┼──────────────────────────────────────
   1 │ a            0.3       0.2       1.3
⋮
Last Group (2 rows): aa = 8.0
 Row │ a       b         c         aa       
     │ String  Float64?  Float64?  Float64? 
─────┼──────────────────────────────────────
   1 │ a            7.0       5.0       8.0
   2 │ b            7.0       6.0       8.0

(@v1.9) pkg> status TidierData
Status `~/.julia/environments/v1.9/Project.toml`
  [fe2206b3] TidierData v0.16.1 `https://github.com/TidierOrg/TidierData.jl#slice_fix`