JuliaData / Feather.jl

Read and write feather files in pure Julia
https://juliadata.github.io/Feather.jl/stable

bad behavior when using some forms of `Data.stream!` #69

Open ExpandingMan opened 6 years ago

ExpandingMan commented 6 years ago

I'm getting errors when calling `Data.stream!(src, snk)` with `snk = Feather.Sink("filename.feather", Data.schema(src))` and then calling `Data.close!(snk)`. There does not seem to be a simple fix. It would be really nice to be able to use this form, because it is the simplest one for writing APIs that work with a variety of different sinks.

See the following MWE:

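using DataFrames
using Feather

# Column A allows missing values; B and C do not.
df = DataFrame(A=Union{DateTime,Missing}[DateTime(), DateTime(), missing], B=rand(3), C=rand(Int64,3))

# Build the sink from the source schema, stream into it, then close it.
snk = Feather.Sink("test1.feather", Data.schema(df))
Data.stream!(df, snk)
Data.close!(snk)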

On the last line, this gives:

ERROR: ArgumentError: reducing over an empty collection is not allowed
Stacktrace:
 [1] mr_empty_iter(::Function, ::Function, ::Base.Generator{Array{Union{DateTime, Missings.Missing},1},Missings.#ismissing}, ::Base.EltypeUnknown) at ./reduce.jl:257
 [2] mapfoldl(::Base.#identity, ::Function, ::Base.Generator{Array{Union{DateTime, Missings.Missing},1},Missings.#ismissing}) at ./reduce.jl:69
 [3] nullcount(::Array{Union{DateTime, Missings.Missing},1}) at /home/expandingman/.julia/v0.6/Feather/src/Feather.jl:329
 [4] close!(::Feather.Sink{DataFrames.DataFrameStream{Tuple{Array{Union{DateTime, Missings.Missing},1},Array{Float64,1},Array{Int64,1}}}}) at /home/expandingman/.julia/v0.6/Feather/src/Feather.jl:465

This seems related to a deeper issue: the `snk.df` field is an empty `DataFrameStream`.
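
Frames [1] to [3] of the trace suggest that `nullcount` reduces over a generator of `ismissing` results, which has no defined value when the column is empty. Below is a minimal sketch of that failure mode, independent of Feather. The names `unsafe_nullcount` and `safe_nullcount` are hypothetical and are not Feather's implementation, and the sketch assumes Julia 1.x, where `Missing` lives in Base rather than in Missings.jl.

```julia
# Hypothetical stand-ins for illustration only; not Feather's actual `nullcount`.

# Summing a generator that applies a closure: on an empty input there is no
# known zero value to return, so the reduction throws.
unsafe_nullcount(col) = sum(ismissing(x) for x in col)

# `count` with a predicate is defined for empty collections.
safe_nullcount(col) = count(ismissing, col)

empty_col = Union{Int,Missing}[]              # like a sink column that was never filled
full_col  = Union{Int,Missing}[1, missing, 3]

# Record whether the generator-based version throws on the empty column.
threw = try
    unsafe_nullcount(empty_col)
    false
catch
    true
end

println("safe count on full column:  ", safe_nullcount(full_col))
println("safe count on empty column: ", safe_nullcount(empty_col))
println("unsafe version threw on empty column: ", threw)
```

A guard like this would only avoid the exception in `Data.close!`. It would not address why the streamed data never reaches `snk.df`.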