JuliaPlots / Plots.jl

Powerful convenience for Julia visualizations and data analysis
https://docs.juliaplots.org
Other
1.84k stars 355 forks source link

[BUG] Unexpected behavior when plotting seriestype shape + 3x speed regression compared to 1.10.6 #3384

Open yygrechka opened 3 years ago

yygrechka commented 3 years ago

Details

The following code:

plot(
    [
        Shape([1,1], [1,2]), 
        Shape([2,2], [1,2]),
        Shape([3,3], [1,2]),
        Shape([4,4], [1,2])
    ],
    seriestype=:shape,
    linecolor=[:red, :green, :red, :green]
)

issues a warning:

┌ Warning: Column vector attribute `linecolor` reinterpreted as row vector (one value per shape).
│ Pass a row vector instead (e.g. using `permutedims`) to suppress this warning.
└ @ Plots /home/yevgeniy/.julia/packages/Plots/z5Msu/src/recipes.jl:1381

This did not occur in 1.10.6, and also in 1.10.6 using a row vector for linecolor did not result in the correct behavior. Although in 1.11.0, I can change the code to the following:

plot(
    [
        Shape([1,1], [1,2]), 
        Shape([2,2], [1,2]),
        Shape([3,3], [1,2]),
        Shape([4,4], [1,2])
    ],
    seriestype=:shape,
    linecolor=permutedims([:red, :green, :red, :green])
)

and this gets rid of the warning, however, in my longer projects, both my old code that uses the column vector as well as using permutedims results in an unacceptable performance regression compared to 1.10.6. (Also, I find it really strange that the linecolor array needs to be a row vector, but the main vector of shapes remains a column vector.)

Benchmarking the above code:

function f1()
plot(
    [
        Shape([1,1], [1,2]), 
        Shape([2,2], [1,2]),
        Shape([3,3], [1,2]),
        Shape([4,4], [1,2])
    ],
    seriestype=:shape,
    linecolor=[:red, :green, :red, :green]
)
end

function f2()
plot(
    [
        Shape([1,1], [1,2]), 
        Shape([2,2], [1,2]),
        Shape([3,3], [1,2]),
        Shape([4,4], [1,2])
    ],
    seriestype=:shape,
    linecolor=permutedims([:red, :green, :red, :green])
)
end
# version 1.11.0
@benchmark f1()
BenchmarkTools.Trial: 
  memory estimate:  472.20 KiB
  allocs estimate:  6975
  --------------
  minimum time:     3.187 ms (0.00% GC)
  median time:      4.031 ms (0.00% GC)
  mean time:        4.415 ms (1.34% GC)
  maximum time:     15.710 ms (67.08% GC)
  --------------
  samples:          1133
  evals/sample:     1

@benchmark f2()
BenchmarkTools.Trial: 
  memory estimate:  469.59 KiB
  allocs estimate:  6932
  --------------
  minimum time:     3.032 ms (0.00% GC)
  median time:      3.132 ms (0.00% GC)
  mean time:        3.252 ms (1.44% GC)
  maximum time:     8.681 ms (54.14% GC)
  --------------
  samples:          1537
  evals/sample:     1
# version 1.10.6
@benchmark f1()
BenchmarkTools.Trial: 
  memory estimate:  182.59 KiB
  allocs estimate:  2645
  --------------
  minimum time:     946.502 μs (0.00% GC)
  median time:      1.080 ms (0.00% GC)
  mean time:        1.067 ms (1.85% GC)
  maximum time:     6.898 ms (73.72% GC)
  --------------
  samples:          4679
  evals/sample:     1

Backends

This bug occurs on ( insert x below )

Backend yes no untested
gr (default) x
pyplot x
plotly x
plotlyjs x
pgfplotsx x
inspectdr x

Versions

Plots.jl version: 1.11.0 Backend version (]st -m): GR v0.55.0 Output of versioninfo(): Julia Version 1.6.0 Commit f9720dc2eb (2021-03-24 12:55 UTC) Platform Info: OS: Linux (x86_64-pc-linux-gnu) CPU: Intel(R) Core(TM) i5-4460S CPU @ 2.90GHz WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-11.0.1 (ORCJIT, haswell)

yha commented 3 years ago

Related to #3320.

(Also, I find it really strange that the linecolor array needs to be a row vector, but the main vector of shapes remains a column vector.)

Note that each shape is a separate series (with a separate legend entry), so this is consistent with the Plots convention that each column corresponds to a series. The old behavior used to be inconsistent when passing x,y separately with seriestype=:shape versus as Shapes. Compare the behavior of

plot([[1,2],[3,4,5]], color=[:red :blue]) # row vector
scatter([4,5], color=[:red, :blue]) # column vector

Also, there's no need to specify seriestype=:shape when you are also passing Shapes as data. And you can pass row vectors using literal syntax with spaces rather than permutedims

linecolor=[:red :green :red :green]

So I think the warning is fine. The performance regression may be a problem, but it's still just 3ms. Are you creating an animation of plots with shapes?

yygrechka commented 3 years ago

I'm using Interact.jl and Mux.jl for interactive plotting in a browser. The plots I have are more complex, so the 3x performance regression is very noticeable.

yha commented 3 years ago

I've looked into the performance regression, and it seems to be caused by the fact that the shapes recipe now assigns a different series to each shape (rather than a single NaN-separated series). This was done to minimize the breaking change in #3320, by preserving the behavior that separate shapes in a vector-of-shapes input can correspond to different elements in attributes (such as linecolor in this example). Apparently drawing the same shapes as a single series is much faster than drawing them as separate series. I'm not sure there's an easy non-breaking fix. Making the overhead of separate series smaller should help.

yygrechka commented 3 years ago

hmm, I suppose that I can write my own shapes recipe that mimics the old behavior.

yha commented 3 years ago

If you do write a recipe that recovers the performance and also preserves the behavior of attributes, I think you should submit it as a PR. This should probably involve some manipulation of attributes (simply copying the old recipe from before #3320 would not preserve attribute behavior because the handling of NaNs has changed).

btmit commented 3 years ago

Copying related from: https://discourse.julialang.org/t/plotting-series-of-shapes/60108/3

Plotting a series of shapes used to (pre 1.11?) result in a single consistent shape color by default. This made sense because it was a single series. Currently Plots seems to create a new color for each shape in the series. Seems related to the bug mentioned here.

ang = range(0, 2π, length = 60)
ellipse(x, y, w, h) = Shape(w*sin.(ang).+x, h*cos.(ang).+y)
myshapes = [ellipse(x,rand(),rand(),rand()) for x = 1:4]
plot(myshapes)