JuliaData / DataFrames.jl

In-memory tabular data in Julia
https://dataframes.juliadata.org/stable/

Error showing value of type DataFrame #2604

Closed yurivish closed 3 years ago

yurivish commented 3 years ago

Showing a DataFrame with an OffsetArray column fails on Julia 1.5.3 and 1.6-beta1:

julia> using DataFrames, OffsetArrays

julia> o = OffsetArray([1, 2, 3, 4, 6], -2:2)
5-element OffsetArray(::Array{Int64,1}, -2:2) with eltype Int64 with indices -2:2:
 1
 2
 3
 4
 6

julia> DataFrame(col = o)
Error showing value of type DataFrame:
ERROR: BoundsError: attempt to access 5-element OffsetArray(::Array{Int64,1}, -2:2) with eltype Int64 with indices -2:2 at index [3]
Stacktrace:
 [1] throw_boundserror(::OffsetArray{Int64,1,Array{Int64,1}}, ::Tuple{Int64}) at ./abstractarray.jl:541
 [2] checkbounds at ./abstractarray.jl:506 [inlined]
 [3] getindex(::OffsetArray{Int64,1,Array{Int64,1}}, ::Int64) at /Users/yurivish/.julia/packages/OffsetArrays/jtCbt/src/OffsetArrays.jl:297
 [4] getindex at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/dataframe/dataframe.jl:400 [inlined]
 [5] _pretty_tables_highlighter_func(::DataFrame, ::Int64, ::Int64) at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/prettytables.jl:13
 [6] _process_cell_text(::PrettyTables.ColumnTable, ::Int64, ::Int64, ::Bool, ::String, ::Int64, ::Int64, ::Crayons.Crayon, ::Symbol, ::Tuple{}, ::Tuple{PrettyTables.Highlighter}) at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/backends/text/cell_parse.jl:130
 [7] _print_table_data(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::PrettyTables.Display, ::PrettyTables.ColumnTable, ::Array{Array{String,1},2}, ::Array{Array{Int64,1},2}, ::Array{Int64,1}, ::Array{Int64,1}, ::Int64, ::Array{Int64,1}, ::Array{Int64,1}, ::Array{Union{NTuple{4,Int64}, Symbol},1}, ::Array{Union{Int64, Symbol},1}, ::Array{Symbol,1}, ::NTuple{4,Char}, ::Tuple{}, ::Symbol, ::Int64, ::Tuple{PrettyTables.Highlighter}, ::Bool, ::Bool, ::Bool, ::PrettyTables.TextFormat, ::Crayons.Crayon, ::Crayons.Crayon, ::Crayons.Crayon) at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/backends/text/print_aux.jl:204
 [8] _pt_text(::IOContext{REPL.Terminals.TTYTerminal}, ::PrettyTables.PrintInfo; border_crayon::Crayons.Crayon, header_crayon::Crayons.Crayon, subheader_crayon::Crayons.Crayon, rownum_header_crayon::Crayons.Crayon, text_crayon::Crayons.Crayon, omitted_cell_summary_crayon::Crayons.Crayon, autowrap::Bool, body_hlines::Array{Int64,1}, body_hlines_format::Nothing, continuation_row_alignment::Symbol, crop::Symbol, crop_subheader::Bool, crop_num_lines_at_beginning::Int64, columns_width::Int64, display_size::Tuple{Int64,Int64}, equal_columns_width::Bool, ellipsis_line_skip::Int64, highlighters::Tuple{PrettyTables.Highlighter}, hlines::Array{Symbol,1}, linebreaks::Bool, maximum_columns_width::Array{Int64,1}, minimum_columns_width::Int64, newline_at_end::Bool, overwrite::Bool, noheader::Bool, nosubheader::Bool, row_name_crayon::Crayons.Crayon, row_name_header_crayon::Crayons.Crayon, row_number_alignment::Symbol, show_omitted_cell_summary::Bool, sortkeys::Bool, tf::PrettyTables.TextFormat, title_autowrap::Bool, title_crayon::Crayons.Crayon, title_same_width_as_table::Bool, vcrop_mode::Symbol, vlines::Array{Int64,1}) at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/backends/text/print.jl:479
 [9] #_pt#68 at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/private.jl:422 [inlined]
 [10] _pretty_table(::IOContext{REPL.Terminals.TTYTerminal}, ::DataFrame, ::Array{String,2}; kwargs::Base.Iterators.Pairs{Symbol,Any,NTuple{22,Symbol},NamedTuple{(:alignment, :compact_printing, :crop, :crop_num_lines_at_beginning, :ellipsis_line_skip, :formatters, :header_alignment, :hlines, :highlighters, :maximum_columns_width, :newline_at_end, :nosubheader, :row_name_alignment, :row_name_crayon, :row_name_column_title, :row_names, :row_number_alignment, :row_number_column_title, :show_row_number, :title, :vcrop_mode, :vlines),Tuple{Array{Symbol,1},Bool,Symbol,Int64,Int64,Tuple{typeof(DataFrames._pretty_tables_general_formatter),DataFrames.var"#ft_float#557"{Bool,Array{Int64,1},Array{Int64,1}}},Symbol,Array{Symbol,1},Tuple{PrettyTables.Highlighter},Array{Int64,1},Bool,Bool,Symbol,Crayons.Crayon,String,Nothing,Symbol,String,Bool,String,Symbol,Array{Int64,1}}}}) at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/private.jl:356
 [11] #pretty_table#51 at /Users/yurivish/.julia/packages/PrettyTables/W16qB/src/print.jl:693 [inlined]
 [12] _show(::IOContext{REPL.Terminals.TTYTerminal}, ::DataFrame; allrows::Bool, allcols::Bool, rowlabel::Symbol, summary::Bool, eltypes::Bool, rowid::Nothing, truncate::Int64, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/show.jl:388
 [13] #show#558 at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/show.jl:480 [inlined]
 [14] show(::IOContext{REPL.Terminals.TTYTerminal}, ::DataFrame) at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/show.jl:480
 [15] #show#573 at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/io.jl:50 [inlined]
 [16] show(::IOContext{REPL.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::DataFrame) at /Users/yurivish/.julia/packages/DataFrames/yqToF/src/abstractdataframe/io.jl:50
 [17] display(::REPL.REPLDisplay, ::MIME{Symbol("text/plain")}, ::Any) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:214
 [18] display(::REPL.REPLDisplay, ::Any) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:218
 [19] display(::Any) at ./multimedia.jl:328
 [20] #invokelatest#1 at ./essentials.jl:710 [inlined]
 [21] invokelatest at ./essentials.jl:709 [inlined]
 [22] print_response(::IO, ::Any, ::Bool, ::Bool, ::Any) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:238
 [23] print_response(::REPL.AbstractREPL, ::Any, ::Bool, ::Bool) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:223
 [24] (::REPL.var"#do_respond#54"{Bool,Bool,REPL.var"#64#73"{REPL.LineEditREPL,REPL.REPLHistoryProvider},REPL.LineEditREPL,REPL.LineEdit.Prompt})(::Any, ::Any, ::Any) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:822
 [25] #invokelatest#1 at ./essentials.jl:710 [inlined]
 [26] invokelatest at ./essentials.jl:709 [inlined]
 [27] run_interface(::REPL.Terminals.TextTerminal, ::REPL.LineEdit.ModalInterface, ::REPL.LineEdit.MIState) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/LineEdit.jl:2355
 [28] run_frontend(::REPL.LineEditREPL, ::REPL.REPLBackendRef) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:1144
 [29] (::REPL.var"#38#42"{REPL.LineEditREPL,REPL.REPLBackendRef})() at ./task.jl:356

This seems to be due to a bug in the PrettyTables package. Calling that package directly on the same array does not crash, but it displays the contents incorrectly:

julia> using PrettyTables

julia> pretty_table(o)
┌────────┐
│ Col. 1 │
├────────┤
│      4 │
│      6 │
│ #undef │
│ #undef │
│ #undef │
└────────┘

bkamins commented 3 years ago

OffsetArrays.jl is not supported by DataFrames.jl (nor by PrettyTables.jl, though @ronisbr should comment on that). We will produce a correct error message in the 1.0 release after https://github.com/JuliaData/DataFrames.jl/pull/2594 is merged.

ronisbr commented 3 years ago

Yes, unfortunately I cannot support OffsetArrays inside PrettyTables currently. Maybe in the future, but it will really require a lot of modifications.
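
Until offset axes are supported, one possible workaround is to copy the data into an ordinary 1-based vector before building the table. The following is only a minimal sketch using the array from the report; the variable names `v` and `df` are illustrative, and it assumes the offset indices themselves do not need to be preserved.

```julia
using DataFrames, OffsetArrays

# The array from the report: five elements indexed -2:2.
o = OffsetArray([1, 2, 3, 4, 6], -2:2)

# `collect` copies the elements into a plain Vector, which is indexed 1:5.
v = collect(o)

# A 1-based column can be stored and displayed as usual.
df = DataFrame(col = v)
```

The copy discards the custom indices, so if they carry meaning they would need to be kept in a separate column.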

yurivish commented 3 years ago

Thanks for the note and PR @bkamins, I appreciate it. I asked on Slack before I posted this issue and several folks told me that there were no restrictions on what a column can be beyond being an AbstractVector.

bkamins commented 3 years ago

In the future we might lift this restriction, but for now the assumption that indexing starts at 1 is buried too deeply in the package.
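
To illustrate the kind of up-front check this implies, here is a rough sketch of rejecting a column whose first index is not 1. The helper `check_one_based` is hypothetical and is not the code from the linked PR; it only shows how such a restriction could be detected with `firstindex` before any display code runs.

```julia
using OffsetArrays

# Hypothetical helper (not part of DataFrames.jl): reject vectors
# whose indexing does not start at 1, otherwise return the input.
function check_one_based(col::AbstractVector)
    if firstindex(col) != 1
        throw(ArgumentError(
            "column must use 1-based indexing, got indices $(axes(col, 1))"))
    end
    return col
end

check_one_based([1, 2, 3])                  # passes, returns the vector

o = OffsetArray([1, 2, 3, 4, 6], -2:2)
err = try
    check_one_based(o)
    nothing
catch e
    e                                       # the ArgumentError thrown above
end
```

Failing at construction time with an `ArgumentError` would replace the `BoundsError` that otherwise only surfaces when the table is shown.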

yurivish commented 3 years ago

Documenting that and throwing an error when someone tries to use an array that doesn't follow the restrictions seems perfectly acceptable and completely reasonable to me.

So far I've had a really positive experience with DataFrames relative to other packages and I think this has to do with the effort you and the other contributors have put into correctness and consistency. Thanks for all of your work!