GiovineItalia / Gadfly.jl

Crafty statistical graphics for Julia.
http://gadflyjl.org/stable/

BoundsError with Geom.subplot_grid(Geom.path) and group, xgroup aesthetics #1086

Open aterenin opened 6 years ago

aterenin commented 6 years ago

I'm porting some ggplot code that works in R to Gadfly. The following code fails when the DataFrame is sufficiently large.

plot(d_p2, x=:runtime,y=:loglik, group=:group, xgroup=:dataset,
  Geom.subplot_grid(Geom.path)
  )

This throws the following error with a rather unhelpful stack trace.

ERROR: BoundsError: attempt to access 1-element Array{UInt32,1} at index [102]
Stacktrace:
 [1] permute!!(::Array{UInt32,1}, ::Array{Int64,1}) at ./combinatorics.jl:94
 [2] permute!! at /Users/aterenin/.julia/v0.6/DataArrays/src/pooleddataarray.jl:540 [inlined]
 [3] permute!(::DataArrays.PooledDataArray{ColorTypes.RGBA{Float32},UInt32,1}, ::Array{Int64,1}) at ./combinatorics.jl:132
 [4] render(::Gadfly.Geom.LineGeometry, ::Gadfly.Theme, ::Gadfly.Aesthetics) at /Users/aterenin/.julia/v0.6/Gadfly/src/geom/line.jl:96
 [5] render(::Gadfly.Geom.LineGeometry, ::Gadfly.Theme, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Gadfly.Data,1}, ::Dict{Symbol,Gadfly.ScaleElement}) at /Users/aterenin/.julia/v0.6/Gadfly/src/geometry.jl:42
 [6] (::Gadfly.##117#119{Dict{Symbol,Gadfly.ScaleElement}})(::Tuple{Gadfly.Layer,Gadfly.Aesthetics,Array{Gadfly.Aesthetics,1},Array{Gadfly.Data,1},Gadfly.Theme}) at ./<missing>:0
 [7] collect(::Base.Generator{Base.Iterators.Zip{Array{Gadfly.Layer,1},Base.Iterators.Zip{Array{Gadfly.Aesthetics,1},Base.Iterators.Zip{Array{Array{Gadfly.Aesthetics,1},1},Base.Iterators.Zip2{Array{Array{Gadfly.Data,1},1},Array{Gadfly.Theme,1}}}}},Gadfly.##117#119{Dict{Symbol,Gadfly.ScaleElement}}}) at ./array.jl:441
 [8] #render_prepared#115(::Bool, ::Bool, ::Function, ::Gadfly.Plot, ::Gadfly.Coord.Cartesian, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Array{Gadfly.StatisticElement,1},1}, ::Array{Array{Gadfly.Aesthetics,1},1}, ::Array{Array{Gadfly.Data,1},1}, ::Dict{Symbol,Gadfly.ScaleElement}, ::Array{Gadfly.GuideElement,1}) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:817
 [9] (::Gadfly.#kw##render_prepared)(::Array{Any,1}, ::Gadfly.#render_prepared, ::Gadfly.Plot, ::Gadfly.Coord.Cartesian, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Array{Gadfly.StatisticElement,1},1}, ::Array{Array{Gadfly.Aesthetics,1},1}, ::Array{Array{Gadfly.Data,1},1}, ::Dict{Symbol,Gadfly.ScaleElement}, ::Array{Gadfly.GuideElement,1}) at ./<missing>:0
 [10] render(::Gadfly.Geom.SubplotGrid, ::Gadfly.Theme, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Gadfly.Data,1}, ::Dict{Symbol,Gadfly.ScaleElement}) at /Users/aterenin/.julia/v0.6/Gadfly/src/geom/subplot.jl:311
 [11] (::Gadfly.##117#119{Dict{Symbol,Gadfly.ScaleElement}})(::Tuple{Gadfly.Layer,Gadfly.Aesthetics,Array{Gadfly.Aesthetics,1},Array{Gadfly.Data,1},Gadfly.Theme}) at ./<missing>:0
 [12] collect(::Base.Generator{Base.Iterators.Zip{Array{Gadfly.Layer,1},Base.Iterators.Zip{Array{Gadfly.Aesthetics,1},Base.Iterators.Zip{Array{Array{Gadfly.Aesthetics,1},1},Base.Iterators.Zip2{Array{Array{Gadfly.Data,1},1},Array{Gadfly.Theme,1}}}}},Gadfly.##117#119{Dict{Symbol,Gadfly.ScaleElement}}}) at ./array.jl:441
 [13] #render_prepared#115(::Bool, ::Bool, ::Function, ::Gadfly.Plot, ::Gadfly.Coord.SubplotGrid, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Array{Gadfly.StatisticElement,1},1}, ::Array{Array{Gadfly.Aesthetics,1},1}, ::Array{Array{Gadfly.Data,1},1}, ::Dict{Symbol,Gadfly.ScaleElement}, ::Array{Gadfly.GuideElement,1}) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:817
 [14] render_prepared(::Gadfly.Plot, ::Gadfly.Coord.SubplotGrid, ::Gadfly.Aesthetics, ::Array{Gadfly.Aesthetics,1}, ::Array{Array{Gadfly.StatisticElement,1},1}, ::Array{Array{Gadfly.Aesthetics,1},1}, ::Array{Array{Gadfly.Data,1},1}, ::Dict{Symbol,Gadfly.ScaleElement}, ::Array{Gadfly.GuideElement,1}) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:806
 [15] render(::Gadfly.Plot) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:752
 [16] display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::MIME{Symbol("text/html")}, ::Gadfly.Plot) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:1062
 [17] display(::Base.REPL.REPLDisplay{Base.REPL.LineEditREPL}, ::Gadfly.Plot) at /Users/aterenin/.julia/v0.6/Gadfly/src/Gadfly.jl:1007
 [18] display(::Gadfly.Plot) at ./multimedia.jl:194
 [19] hookless(::Media.##7#8{Gadfly.Plot}) at /Users/aterenin/.julia/v0.6/Media/src/compat.jl:14
 [20] render(::Media.NoDisplay, ::Gadfly.Plot) at /Users/aterenin/.julia/v0.6/Media/src/compat.jl:27
 [21] display(::Media.DisplayHook, ::Gadfly.Plot) at /Users/aterenin/.julia/v0.6/Media/src/compat.jl:9
 [22] display(::Gadfly.Plot) at ./multimedia.jl:194
 [23] eval(::Module, ::Any) at ./boot.jl:235
 [24] print_response(::Base.Terminals.TTYTerminal, ::Any, ::Void, ::Bool, ::Bool, ::Void) at ./REPL.jl:144
 [25] print_response(::Base.REPL.LineEditREPL, ::Any, ::Void, ::Bool, ::Bool) at ./REPL.jl:129
 [26] (::Base.REPL.#do_respond#16{Bool,Base.REPL.##26#36{Base.REPL.LineEditREPL,Base.REPL.REPLHistoryProvider},Base.REPL.LineEditREPL,Base.LineEdit.Prompt})(::Base.LineEdit.MIState, ::Base.AbstractIOBuffer{Array{UInt8,1}}, ::Bool) at ./REPL.jl:646

I tried to come up with a minimal working example, but was unable to reproduce the problem with small data frames. There shouldn't be anything weird about my CSV, since it works correctly in R. I'd be happy to provide it for further examination.

tlnagy commented 6 years ago

How big is the dataframe? Can you upload it here?

Can you give us the output of eltypes(d_p2)? Also, does a random DataFrame of the same size cause the problem, or is it just this specific DataFrame when it is large enough?

i.e. does using DataFrame(rand(size(d_p2)...)) cause the problem too?

It's kinda hard to diagnose with the current info. :/
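
For anyone trying to reproduce this without the original CSVs, a synthetic frame with the same column roles as the failing plot call could be built roughly as below. This is only a sketch: it assumes a current DataFrames.jl and Gadfly rather than the v0.6-era packages in the stack trace, the row count `n` and the group labels are made up, and only the column names (`runtime`, `loglik`, `group`, `dataset`) and the dataset labels come from the report.

```julia
using DataFrames, Gadfly

n = 10_000  # hypothetical row count; the size of the real data is not stated

# Columns mirror the aesthetics used in the failing plot call.
d = DataFrame(
    runtime = rand(n),                                   # x aesthetic
    loglik  = randn(n),                                  # y aesthetic
    group   = rand(["a", "b", "c"], n),                  # made-up group labels
    dataset = rand(["Enron\nK=10", "Enron\nK=100"], n),  # xgroup facets
)

# Same plot call as in the report, applied to the synthetic frame.
p = plot(d, x=:runtime, y=:loglik, group=:group, xgroup=:dataset,
         Geom.subplot_grid(Geom.path))
```

If this renders fine while the real data does not, the problem is more likely tied to something specific in the original columns than to size alone.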

aterenin commented 6 years ago

Here you go.

https://www.dropbox.com/s/q4wdo6qplvt9d3a/logmargpost-enron-10.csv?dl=1
https://www.dropbox.com/s/8g69br7rsfos1jz/logmargpost-enron-100.csv?d=1

Import with

d_p2 = vcat(
  CSV.read("experiments/logmargpost-enron-10.csv", nullable=false) |> x -> begin x[:dataset] = repeat("Enron\nK=10", size(x)[1]); x end,
  CSV.read("experiments/logmargpost-enron-100.csv", nullable=false) |> x -> begin x[:dataset] = repeat("Enron\nK=100", size(x)[1]); x end
)

aterenin commented 6 years ago

eltypes(d_p2)
String
Int64
Float64
Float64
Int64
Int64
String
String
String

tlnagy commented 6 years ago

Hi @aterenin, do you still have these CSVs available? The links appear to be dead.

aterenin commented 6 years ago

Re-uploaded.

https://www.dropbox.com/s/lbreovxavndy8gv/logmargpost-enron-10.csv?dl=0
https://www.dropbox.com/s/07mfq6fbnxtqncv/logmargpost-enron-100.csv?dl=0

Please let me know when you've grabbed them so I can take the links down.

Mattriks commented 5 years ago

Can someone retest this on Gadfly 1.0.1?

bjarthur commented 5 years ago

I would re-test, but the links are dead again.