jjgomezcadenas closed this 1 year ago
Following on my previous post, this works:
begin
    grouped_df = groupby(x, :id)
    combine(grouped_df, x -> size(x)[1] < 2 ? DataFrame() : x)
end
but not the original example in:
https://stackoverflow.com/questions/66484426/remove-groups-by-condition
which used nrow(x) rather than size(x)[1]
Can you please double-check that you have Pluto properly configured? I just ran the example in the REPL and it works as expected:
julia> combine(groupby(x, :id)) do sdf
           n = nrow(sdf)
           n < 25 ? DataFrame() : DataFrame(n=n) # drop groups with low number of rows
       end
2×2 DataFrame
 Row │ id    n
     │ Char  Int64
─────┼─────────────
   1 │ d        31
   2 │ c        29
So, it may be a bug with Pluto. I tried it in the REPL and it works. But in a fresh Pluto notebook with just these dependencies:
begin
    import Pkg
    Pkg.activate(mktempdir())
    Pkg.add([Pkg.PackageSpec(name="DataFrames", version="1.5.0")])
    using DataFrames
end
it does not work:
x = DataFrame(id=rand('a':'d', 100), v=rand(100))
combine(groupby(x, :id)) do sdf
    n = nrow(sdf)
    n < 25 ? DataFrame() : DataFrame(n=n) # drop groups with low number of rows
end
UndefVarError: nrow not defined
_combine(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Vector{Bool}, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:754
_combine_prepare_norm(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:86
var"#_combine_prepare#701"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(DataFrames._combine_prepare), ::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Base.RefValue{Any})@splitapplycombine.jl:51
_combine_prepare@splitapplycombine.jl:25[inlined]
combine@splitapplycombine.jl:860[inlined]
combine@splitapplycombine.jl:839[inlined]
top-level scope@Local: 1[inlined]
combine(groupby(x, :id)) do sdf
    n = size(sdf)[1]
    n < 25 ? DataFrame() : DataFrame(n=n) # drop groups with low number of rows
end
UndefVarError: DataFrame not defined
_combine(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Vector{Bool}, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:754
_combine_prepare_norm(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:86
var"#_combine_prepare#701"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(DataFrames._combine_prepare), ::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Base.RefValue{Any})@splitapplycombine.jl:51
_combine_prepare@splitapplycombine.jl:25[inlined]
combine@splitapplycombine.jl:860[inlined]
combine@splitapplycombine.jl:839[inlined]
top-level scope@Local: 1
OK, I removed and reinstalled Pluto and recompiled everything, and this time it works. Sorry about the hassle, and thanks!
Thank you! Still @fonsp might know about this issue, as maybe this is some general problem in Pluto.jl.
What Pluto version are you using? I cannot reproduce https://github.com/bkamins/Julia-DataFrames-Tutorial/issues/37#issuecomment-1548322647 on latest Pluto, Julia 1.9.0
I am now using Julia 1.9 and the latest version of Pluto. Under these conditions, the code runs. Previously I may have had some incompatibility: I had moved to 1.9 but not reinstalled Pluto. What I did was remove Pluto, install it again, and precompile, and it works now.
Hello and thanks for these wonderful tutorials.
I was trying to run the above example (I use Pluto, Julia 1.9 and DataFrames 1.5.0).
And got the following error:
UndefVarError: nrow not defined
_combine(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Vector{Bool}, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:739
_combine_prepare_norm(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:86
var"#_combine_prepare#671"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(DataFrames._combine_prepare), ::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Base.RefValue{Any})@splitapplycombine.jl:51
_combine_prepare@splitapplycombine.jl:25[inlined]
combine#737@splitapplycombine.jl:845[inlined]
combine@splitapplycombine.jl:845[inlined]
combine#735@splitapplycombine.jl:830[inlined]
combine@splitapplycombine.jl:824[inlined]
top-level scope@Local: 1[inlined]
Then I tried to fix it:
And got this:
UndefVarError: DataFrame not defined
_combine(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Vector{Bool}, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:739
_combine_prepare_norm(::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Vector{Any}, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool)@splitapplycombine.jl:86
var"#_combine_prepare#671"(::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::typeof(DataFrames._combine_prepare), ::DataFrames.GroupedDataFrame{DataFrames.DataFrame}, ::Base.RefValue{Any})@splitapplycombine.jl:51
_combine_prepare@splitapplycombine.jl:25[inlined]
combine#737@splitapplycombine.jl:845[inlined]
combine@splitapplycombine.jl:845[inlined]
combine#735@splitapplycombine.jl:830[inlined]
combine@splitapplycombine.jl:824[inlined]
top-level scope@Local: 1[inlined]
The example is of particular interest to me, since my problem is precisely to drop groups with fewer than a certain number of rows.
Thanks!
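For reference, here is a minimal sketch of two ways to drop groups with fewer than a given number of rows, assuming DataFrames 1.x is loaded and working in the session. The small fixed data set and the threshold name `min_rows` are illustrative and not taken from the thread. This only shows the grouping pattern; it is not a fix for the `UndefVarError` reported above.

```julia
using DataFrames

# Illustrative data (not from the thread): 'a' has 3 rows, 'b' has 1, 'c' has 2.
x = DataFrame(id = ['a', 'a', 'a', 'b', 'c', 'c'], v = 1:6)

min_rows = 2  # hypothetical threshold

# Option 1: return the group itself when it is large enough,
# and an empty DataFrame otherwise, so small groups contribute no rows.
kept_combine = combine(groupby(x, :id)) do sdf
    nrow(sdf) < min_rows ? DataFrame() : sdf
end

# Option 2: filter the groups directly; ungroup=true returns a plain DataFrame
# instead of a GroupedDataFrame.
kept_filter = filter(sdf -> nrow(sdf) >= min_rows, groupby(x, :id); ungroup = true)
```

Both variants keep the original rows of the surviving groups. The example earlier in the thread instead returns one summary row per surviving group via `DataFrame(n=n)`.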