JuliaData / DataFrames.jl

In-memory tabular data in Julia
https://dataframes.juliadata.org/stable/

Type inference bug in combine #2351

Closed: bkamins closed this issue 4 years ago

bkamins commented 4 years ago

This happens on Julia nightly (Version 1.6.0-DEV.584, Commit 371bfa89d4, 2020-08-05 00:49 UTC):

julia> using DataFrames, Random  # DataFrames 0.21 re-exports CategoricalArrays

julia> Random.seed!(1);

julia> df = DataFrame(a = rand([1:5;missing], 20), x1 = rand(1:100, 20),
                             x2 = rand(1:100, 20) +im*rand(1:100, 20));

julia> df.x3 = CategoricalVector{Union{Missing, Int}}(df.x1);

julia> gd = groupby(df, :a);

julia> combine(gd, :x3 => maximum∘skipmissing => :y)
ERROR: fatal error in type inference (type bound)
Stacktrace:
 [1] groupreduce(f::Function, op::Function, condf::Base.var"#66#67"{typeof(ismissing)}, adjust::Nothing, checkempty::Bool, incol::CategoricalVector{Union{Missing, Int64},UInt32,Int64,CategoricalValue{Int64,UInt32},Missing}, gd::GroupedDataFrame{DataFrame})
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:999
 [2] (::DataFrames.Reduce{typeof(max),Base.var"#66#67"{typeof(ismissing)},Nothing})(incol::CategoricalVector{Union{Missing, Int64},UInt32,Int64,CategoricalValue{Int64,UInt32},Missing}, gd::GroupedDataFrame{DataFrame})
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:1003
 [3] _combine(f::Vector{Pair}, gd::GroupedDataFrame{DataFrame}, nms::Vector{Symbol}, copycols::Bool, keeprows::Bool)
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:1129
 [4] combine_helper(f::Vector{Pair}, gd::GroupedDataFrame{DataFrame}, nms::Vector{Symbol}; keepkeys::Bool, ungroup::Bool, copycols::Bool, keeprows::Bool)
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:586
 [5] _combine_prepare(gd::GroupedDataFrame{DataFrame}, cs::Union{Colon, typeof(nrow), Regex, AbstractString, Signed, Symbol, Unsigned, Pair, AbstractVector{T} where T, All, Between, InvertedIndex, StridedArray{T,1} where T, BitVector}; keepkeys::Bool, ungroup::Bool, copycols::Bool, keeprows::Bool)
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:551
 [6] #combine#401
   @ ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:472 [inlined]
 [7] combine(gd::GroupedDataFrame{DataFrame}, cs::Pair{Symbol,Pair{Base.ComposedFunction{typeof(maximum),typeof(skipmissing)},Symbol}})
   @ DataFrames ~/.julia/dev/DataFrames/src/groupeddataframe/splitapplycombine.jl:472
 [8] top-level scope
   @ REPL[41]:1

CC @nalimilan

bkamins commented 4 years ago

~It does not happen on 0.21.5 release (also we have this problem only on Julia nightly, so I will go ahead with 0.21.6 patch release), so the issue is most likely caused by https://github.com/JuliaData/DataFrames.jl/pull/2335, CC @quinnj.~

bkamins commented 4 years ago

Also, interestingly, this happens only on Linux and macOS; on Windows we do not see it.

bkamins commented 4 years ago

OK, I cannot register a new 0.21.6 release due to https://discourse.julialang.org/t/pkg-downtime-incident/44288, so in the meantime maybe this issue can be tracked down.

quinnj commented 4 years ago

Does it work if we revert that fix commit?

bkamins commented 4 years ago

Ha, good point. No, I have just checked and it still fails with that commit reverted. It also fails on 0.21.5, so the cause is deeper. (Earlier I tested 0.21.5 on a non-nightly Julia by mistake: the bane of having many Julia installations on one machine.)

bkamins commented 4 years ago

I will open an issue in the Julia repository then.

bkamins commented 4 years ago

https://github.com/JuliaLang/julia/issues/36923

bkamins commented 4 years ago

Closing, as this is now tracked in Julia Base.