GiovineItalia / Compose.jl

Declarative vector graphics
http://giovineitalia.github.io/Compose.jl/latest/

Colors lost when using large vectors... #435

Open hdavid16 opened 1 year ago

hdavid16 commented 1 year ago

An issue came up in GraphPlot.jl that turns out to actually be an issue in Compose.jl (https://github.com/JuliaGraphs/GraphPlot.jl/issues/156)

MWE: Make a plot with many red circles

using Compose, Colors

n = 10
g = compose(
    context(), 
    Compose.circle(rand(n),rand(n),rand(n)/100), 
    fill([colorant"red" for i in 1:n]), 
    Compose.stroke(nothing)
)
image

Saving g (e.g., to PDF) or just displaying it in an IDE looks the same as the snapshot above.

However, if you increase the number of circles to 100, you get the correct plot in an IDE, but when saving it to PDF or redisplaying g, the color is lost. Initial plot with 100 circles:

image

After calling g in the REPL again or saving it as, say, a PDF, the red color switches to black:

image

However, if the stroke is replaced with a Vector of the same size as the fill, the color is not lost when saving or redisplaying the figure.

n=100
g=compose(
    context(), 
    Compose.circle(rand(n),rand(n),rand(n)/100), 
    fill([colorant"red" for i in 1:n]), 
    Compose.stroke([nothing for i in 1:n])
)

There seems to be a glitch when vectors are used in some functions but not in others, and it only occurs when the vectors are large. The same behavior is observed with other functions such as fontsize and linewidth.

Mattriks commented 1 year ago

Example 2 below works!

Example 1

using BenchmarkTools, Compose

n = 100
@btime g1 = compose(context(),
    (context(), Compose.circle(rand(n),rand(n),rand(n)/100), fill(["red" for i in 1:n]), stroke(nothing))
)

39.532 μs (635 allocations: 26.56 KiB)

Example 2

@btime g2 = compose(context(), stroke(nothing),
    (context(), Compose.circle(rand(n),rand(n),rand(n)/100), fill("red")),
    (context(), Compose.circle(rand(n),rand(n),rand(n)/100), fill("gray"))
)

6.579 μs (464 allocations: 22.78 KiB)

hdavid16 commented 1 year ago

Yes. Example 2 works. The issue is that in GraphPlot, the recipes have several properties for each group (stroke, fontsize, etc.), so we see the error when networks have many nodes or edges but not all properties are given vector inputs.

Are you saying that this is not an issue with Compose, but an issue with gc?

Mattriks commented 1 year ago

For example, consider Geom.line in Gadfly (link to Geom.line code). It takes a "subgroup first, plot later" approach.

Also, did you try saving the image before you plot it in a notebook, e.g.:

draw(PDF("test.pdf"), g)
draw(PNG(), g)

hdavid16 commented 1 year ago

My concern is: what is at the root of the problem? What is causing the fill color to disappear?

This issue arises from the fact that GraphPlot.jl uses Compose to generate a graph plot:

compose(
    context(units=UnitBox(-1.2, -1.2, +2.4, +2.4)),
    compose(context(), texts, fill(nodelabelc), fontsize(nodelabelsize)),
    compose(context(), nodes, fill(nodefillc), stroke(nodestrokec), linewidth(nodestrokelw)),
    compose(context(), edgetexts, fill(edgelabelc), fontsize(edgelabelsize)),
    compose(context(), arrows, stroke(edgestrokec), linewidth(edgelinewidth)),
    compose(context(), lines, stroke(edgestrokec), linewidth(edgelinewidth)),
    compose(context(), rectangle(-1.2, -1.2, +2.4, +2.4), fill(background_color)))

So if, say, the fill of the nodes in the graph is a vector, but the linewidth AND the node stroke color (nodestrokec) are not vectors, then the colors disappear after 100 nodes. We could make the plot longer by adding each node with its stroke and linewidth in a loop, but I'm not sure why this strange behavior occurs after 100 nodes.

Mattriks commented 1 year ago

I would have to investigate what the root cause is. The general issue here is that, because of the design of Compose, property vectors are expensive (so the solution might turn out to be "Compose 2"). Out of interest, what is length(unique(nodefillc)) on average?

hdavid16 commented 1 year ago

I see. It really depends on the application. Some networks have only a few nodes, others have hundreds. You could have 100 nodes with only 2 colors, but the parameter would still be a 100-element vector (2 unique values). The number of unique colors should typically be small (less than 10, otherwise the colors are just confusing).

I find that if I make all the parameters for the nodes to be vectors of the same length, the issue disappears and the color is preserved. So I don't know if it could have anything to do with broadcasting...
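
A minimal sketch of that workaround, assuming the behavior reported in this thread: every scalar property is expanded to a vector of the same length as the fill before it is handed to Compose. The helper name asvector and the sample values are hypothetical and are not part of Compose or GraphPlot. This only papers over the symptom; it does not explain the root cause.

```julia
using Compose, Colors

# Hypothetical helper (not part of Compose): pass vectors through unchanged
# and repeat a scalar property value n times.
asvector(x::AbstractVector, n) = x
asvector(x, n) = [x for _ in 1:n]

n = 100
nodefillc = [colorant"red" for _ in 1:n]   # already a vector
nodestrokec = nothing                      # scalar, as in the failing MWE

g = compose(
    context(),
    Compose.circle(rand(n), rand(n), rand(n) ./ 100),
    fill(asvector(nodefillc, n)),
    Compose.stroke(asvector(nodestrokec, n)),   # same form as the working MWE above
)
```

The last call is equivalent to the Compose.stroke([nothing for i in 1:n]) form from the first comment, only with the expansion done by the helper.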

Mattriks commented 1 year ago

Yes, I was asking about the number of unique colors. Since (no. of colors)/(no. of points) is small, having a context per color sounds more Composesque (it is also the way that backends such as SVG were designed, i.e. one fill color per group is better than a fill color for each point).
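
A rough sketch of the "context per color" idea, following the shape of Example 2 above. The sample data and the variable names xs, ys, rs and groups are hypothetical; nodefillc stands in for the GraphPlot vector of the same name. This is one possible way to group the points, not necessarily how GraphPlot or Gadfly would do it.

```julia
using Compose, Colors

# Hypothetical sample data: n points, each assigned one of two colors.
n = 100
xs, ys, rs = rand(n), rand(n), rand(n) ./ 100
nodefillc = [isodd(i) ? colorant"red" : colorant"gray" for i in 1:n]

# One child context per unique color, so each group gets a scalar fill
# instead of a per-point property vector.
groups = map(unique(nodefillc)) do c
    idx = findall(==(c), nodefillc)   # indices of the points with color c
    (context(), Compose.circle(xs[idx], ys[idx], rs[idx]), fill(c))
end

# The shared stroke is set once on the parent context, as in Example 2.
g = compose(context(), Compose.stroke(nothing), groups...)
```

With 2 unique colors this builds 2 child contexts regardless of how many points there are, which is the "subgroup first, plot later" approach mentioned earlier.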

hdavid16 commented 1 year ago

I see. We might rethink the way we build our plots then. Thanks for the insight.

Another workaround I've found is to directly pipe the plot to SVG rather than creating it and then saving it as SVG. When I do this, the color isn't lost.
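
One possible reading of "directly pipe the plot to SVG" is sketched below, assuming it means passing the composition straight to the backend with |> instead of first binding it to a variable and displaying it. The filename circles.svg is hypothetical, and whether this avoids the color loss is only what is reported in this comment.

```julia
using Compose, Colors

n = 100

# Build the composition and render it to the SVG backend in one expression,
# without storing or displaying the intermediate object.
compose(
    context(),
    Compose.circle(rand(n), rand(n), rand(n) ./ 100),
    fill([colorant"red" for _ in 1:n]),
    Compose.stroke(nothing),
) |> SVG("circles.svg")
```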