GiovineItalia / Gadfly.jl

Crafty statistical graphics for Julia.
http://gadflyjl.org/stable/

Contour plots? #293

Open tomasaschan opened 10 years ago

tomasaschan commented 10 years ago

If I understand correctly, Gadfly currently focuses on 2D plotting mainly because drawing in 3D is a quite different endeavor. However, there are several types of plots that visualize 3D data on a 2D surface, such as contour plots and color-coded projections like MATLAB's imagesc, that I think would constitute a nice addition to the library.

Are there any plans on adding functionality for such plots to Gadfly, in the near or distant future?

For contour plots, for example, calculating where to draw the contour lines is quite involved (see e.g. this description for a simple overview; one might want to do something even more sophisticated about edge cases), but once the line segments have been identified it should be straightforward to hook into the existing back-ends to do the actual drawing of colored lines. I guess one could do similar things to create colored projections à la imagesc.

dcjones commented 10 years ago

I do have Geom.rectbin which is a little similar to imagesc. There's also a function called spy that uses Geom.rectbin to plot matrices.

I don't have contour plots but would really love to add that. Gadfly has been in an effective feature freeze lately, just because I've been focusing on overhauling its underlying architecture. When that's complete, I'll try to make this a high priority. It might be a few weeks though.

tomasaschan commented 10 years ago

That sounds terrific =) I'm no expert, and like everyone else I have limited time on my hands to work on this kind of stuff, but if I can I'd love to help out in some way.

tomasaschan commented 10 years ago

I just tried out Geom.rectbin, but it doesn't quite cut it because it requires the data to be square. I have a rectangular grid that is evenly spaced in x and y, but with different spacings (and a different number of points) in each direction, and values f(x,y) for all grid points, so that f is a matrix with 65x136 elements. I can't figure out how to plot that with Geom.rectbin without getting an error stating

ERROR: Geom.rectbin requires an equal number of x (or xmin/xmax) and y (or ymin/ymax) values.
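
One reading of this error is that Geom.rectbin wants one x and one y value per cell, rather than one vector per axis. The sketch below (current Julia syntax) expands a rectangular grid into equal-length vectors. `grid_to_long` is a hypothetical helper, not a Gadfly function, and the assumption that the 65 points lie along x is mine. Whether the rendered bins then get the right widths for unequal spacings is not checked here.

```julia
# Hypothetical helper, not part of Gadfly: expand a rectangular grid
# into one (x, y, value) triple per grid cell.
function grid_to_long(xs, ys, f)
    size(f) == (length(xs), length(ys)) ||
        throw(DimensionMismatch("f must be length(xs) × length(ys)"))
    xl = [x for x in xs, y in ys]   # x varies along the first dimension
    yl = [y for x in xs, y in ys]   # y varies along the second dimension
    return vec(xl), vec(yl), vec(f)
end

xs = range(0.0, 1.0, length=65)       # assumed: 65 points along x
ys = range(-2.0, 2.0, length=136)     # assumed: 136 points along y
f  = [x * y for x in xs, y in ys]     # stand-in for the real f(x, y) values

xl, yl, fl = grid_to_long(xs, ys, f)  # three vectors of equal length

# With Gadfly loaded, the vectors could then be passed as, for example:
# plot(x=xl, y=yl, color=fl, Geom.rectbin)
```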

mykelk commented 10 years ago

I would love contour plots as well. I use PyPlot's contour function extensively in the courses I teach.

darwindarak commented 10 years ago

I've been trying to get a marching squares contouring algorithm working for Gadfly. Here is what I've been able to get so far:

[image: contour]

This is still a complete patch job. It essentially creates a new Geom.line layer for each isoline it finds. I have not been able to get the color bar to show. Maybe this is because I've been assigning line color through the default_color option in Themes? I will try to make a cleaner version once I get a better understanding of how Gadfly works internally. Hope this helps!
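
For readers unfamiliar with marching squares, its core step can be sketched as follows. This is an illustration in current Julia syntax, not the code from the contour branch; `cell_case` and `edge_crossing` are hypothetical names. Each grid cell is classified by which of its corners lie above the contour level, and the crossing point on an edge is found by linear interpolation.

```julia
# Hypothetical helpers; not part of Gadfly or Contour.jl.

# Classify one grid cell by which corners lie above `level`.
# Corners are taken counter-clockwise from the lower-left: (ll, lr, ur, ul).
# The result is a 4-bit case index in 0:15.
function cell_case(ll, lr, ur, ul, level)
    idx = 0
    ll > level && (idx |= 1)
    lr > level && (idx |= 2)
    ur > level && (idx |= 4)
    ul > level && (idx |= 8)
    return idx
end

# Linearly interpolate where `level` crosses the edge between two corners
# with values va and vb; the result is a fraction measured from a towards b.
edge_crossing(va, vb, level) = (level - va) / (vb - va)

# A cell whose two right-hand corners are above the level 0.5:
case = cell_case(0.0, 1.0, 1.0, 0.0, 0.5)
# The level crosses the bottom edge (ll -> lr) at this fraction:
t = edge_crossing(0.0, 1.0, 0.5)
```

A full implementation would map each case index to the edges the isoline passes through and then stitch the segments of neighbouring cells into curves, which is where the ambiguous saddle cases need extra care.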

StefanKarpinski commented 10 years ago

Patch job or not, that looks lovely.

tomasaschan commented 10 years ago

@darwindarak Lovely indeed! Can I just checkout the contour branch of your fork to use this? What does the API look like?

dcjones commented 10 years ago

Great work @darwindarak! I'd love to help you integrate this into Gadfly.

The best way to do so would be to add your marching squares code as a new statistic type, say Stat.contour. Statistics in Gadfly take some number of bound aesthetics (e.g. x, y, color), apply some function, and output one or more aesthetics. So in this case, Stat.contour would take as input x, y, and func (an aesthetic which I recently added), and generate data for the x, y, and color aesthetics. Then, combined with Geom.line, it should just work, color key and all.

Let me know if I can help explain anything.

darwindarak commented 10 years ago

Thanks for the feedback guys!

@tlycken The current contour branch can be used like this. I've only tried the method on a couple of test functions. It would be great if you can help stress test it with more complicated contours. I'm still pretty new at using Julia, so I would really appreciate any comments on the code.

@dcjones One of the problems I ran into was the case when there are separate contour lines for the same level. If I specify the same color values for all the contours and plot them on a single layer, then it shows a color bar and looks like

[image: single_layer]

As opposed to

[image: multiple_layers]

Is it because Geom.line groups the lines based on the color value?

I'll try to add the new statistics type this weekend!

dcjones commented 10 years ago

Is it because Geom.line groups the lines based on the color value?

Yeah. Currently the only way to have multiple lines of the same color is with multiple layers. That's come up as an annoyance in the past, so I do need to add a separate mechanism for grouping points into lines. Probably a group aesthetic.
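
The effect of grouping by color versus grouping by a separate key can be illustrated without any plotting. The sketch below uses current Julia syntax; `polylines` is a hypothetical helper and does not reflect how Geom.line is implemented internally.

```julia
# Two separate isolines at the same level (hence the same color).
xs     = [0.0, 1.0, 2.0, 5.0, 6.0, 7.0]
ys     = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
colors = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]   # contour level of each point
groups = [1, 1, 1, 2, 2, 2]               # which curve each point belongs to

# Hypothetical helper: split the points into polylines, one per distinct label.
function polylines(xs, ys, labels)
    lines = Dict{eltype(labels), Vector{Tuple{Float64, Float64}}}()
    for (x, y, k) in zip(xs, ys, labels)
        push!(get!(lines, k, Tuple{Float64, Float64}[]), (x, y))
    end
    return lines
end

by_color = polylines(xs, ys, colors)   # a single 6-point line: the two curves get joined
by_group = polylines(xs, ys, groups)   # two 3-point lines, as intended
```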

tomasaschan commented 10 years ago

@darwindarak There's still a couple of things that are unclear to me. I figured I'd test it on the application I'm currently working on, but I'm still struggling to figure out exactly how to give the data to your algorithm; mainly, I'm finding it difficult to grasp how to arrange the matrix Z in terms of X and Y correctly.

I know it's not how MATLAB and Python do it, but I think the most intuitive way to do this is to give two 1-dimensional iterables x and y, and a two-dimensional iterable z such that z has length(x) rows and length(y) columns, so that if you'd plot z as an image, the lowest row of pixels would be represented by the first column of z. (Side note: CoordInterpGrid, an as yet undocumented feature of Grid.jl, does it this way.)

If you want to try it out on my data, you can take a look at this gist. For lack of a better way to upload the data, I put it in a script which defines variables Rs, Zs and psis such that I want to be able to draw the contour curves of psis(Rs,Zs) with Rs on the x axis and Zs on the y axis. Whatever I try now, I seem to either get a transposed plot, or errors about a key missing (which disappear if I transpose psis in my call to contour_layers). With correct output, the plot should be the transpose of this:

plot(Contour.contour_layers(Zs,Rs,psis,-85:5:100)...)

[image: transposed]

tomasaschan commented 10 years ago

Also, things I'd wish for if/when this is incorporated in Gadfly:

And just because I tried a couple of more things, I finally figured out my problem with the transpose: contour_layers(Rs,Zs,psis',levels) does it. I'd still like us to move away from the convention that requires a transpose on Z.

tomasaschan commented 10 years ago

And, finally (I'll stop spamming now): please take a look at darwindarak/Gadfly.jl#1

darwindarak commented 10 years ago

The two-dimensional z that I have been using has been arranged so that (row, column) corresponds to (y,x). I was actually hoping that having the direction of your data line up with physical directions might reduce the confusion. I can add an option to treat the data as if it were transposed. Which arrangement would you guys prefer as the default alignment?

  1. (r,c) -> (y,x)
  2. (r,c) -> (x,y)

tomasaschan commented 10 years ago

Since Julia is column major, having the first dimension along columns makes the most sense to me (i.e. (r,c) -> (x,y)). However, since both Python and Matlab do it differently, maybe I'm wrong =)

When discussing this for Grid, both @timholy and @simonbyrne provided good arguments for their stances.

simonbyrne commented 10 years ago

+1 for (r,c) -> (x,y).
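
To make the two candidate conventions concrete, here is a small sketch in current Julia syntax. The variable names are illustrative only; the point is that the two layouts differ by exactly one transpose, which is why passing psis' fixed the earlier example.

```julia
x = [0.0, 1.0, 2.0]          # 3 points along x
y = [10.0, 20.0]             # 2 points along y
f(a, b) = a + b

# Convention 2, (r,c) -> (x,y): z[i, j] == f(x[i], y[j]), size (length(x), length(y)).
z_xy = [f(xi, yj) for xi in x, yj in y]

# Convention 1, (r,c) -> (y,x): z[j, i] == f(x[i], y[j]), size (length(y), length(x)).
z_yx = [f(xi, yj) for yj in y, xi in x]

# The two layouts hold the same values and differ only by a transpose.
same_up_to_transpose = (z_yx == permutedims(z_xy))
```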

darwindarak commented 10 years ago

@dcjones The contour tracing package is coming together and should be ready in a couple of days. However, I'm still having some difficulty implementing Stat.contour without a group aesthetic. Do you have any suggestions on how I should proceed? And if you haven't started on it yet, do you mind if I take a shot at adding the group aesthetic?

dcjones commented 10 years ago

Great news! The right way to add this is definitely with a group aesthetic. Since this is blocking contour plots, I'll make it the next thing I work on.

darwindarak commented 10 years ago

This is great! Thanks a lot @dcjones!

Sorry for dragging this issue on, but is there currently a way to plot contours of a matrix instead of a function? For example, the volcano dataset from RDatasets.

dcjones commented 10 years ago

I don't have a lot of experience with contour plots, so by all means let me know if this can be improved on.

Case in point, I completely forgot to support plotting matrices. I'll fix that shortly.

1oly commented 10 years ago

I'm really psyched about this! Would you mind showing a short usage example? @tlycken @dcjones @darwindarak ? Others are requesting this too =) https://groups.google.com/forum/#!topic/julia-users/yVKubPEoUYE

dcjones commented 10 years ago

Sure, the simplest usage is like:

plot((x, y) -> x^3 - y^3, -10, 10, -10, 10)

[image: try3]

You can control the number of contours and the number of samples on each axis with Stat.contour.

plot((x, y) -> x^3 - y^3, -20, 20, -20, 20, Stat.contour(samples=50, n=100))

[image: try2]

As mentioned, I still need to add support for plotting matrices directly. The other thing is there's not a version that will plot multiple functions, as in plot([f, g, h], 0, 10, 0, 10). I'm not sure if there should be or not. Maybe someone has an opinion.

Reopening so I remember that there's more work to do.

1oly commented 10 years ago

Thanks Daniel! Looking good. Tried a different function plot((x, y) -> x*exp(-x^2-y^2), -3, 3, -3, 3) but got an error:

no method minvalmaxval(Int64, Float64, Int64, Nothing, Nothing)
in apply_statistic_typed at C:\Users\s082312\.julia\v0.3\Gadfly\src\statistics.jl:657
in apply_statistic_typed at C:\Users\s082312\.julia\v0.3\Gadfly\src\statistics.jl:659

Changing to floats helped: plot((x, y) -> x*exp(-x^2-y^2), -3.0, 3, -3.0, 3) runs, but now some artifacts arise.

[image: myplot]

Don't get me wrong, I really enjoy this new feature and I'm sure it will mature quickly. You're doing some excellent work here!
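
As a note on the integer-bounds error above: on a version where mixed or integer bounds still fail, one workaround is to convert all four bounds to a common floating-point type before calling plot. The sketch uses current Julia syntax; `float_bounds` is a hypothetical helper, not part of Gadfly.

```julia
# Hypothetical helper: promote the bounds to a common type, then make them floating point.
float_bounds(bounds...) = float.(promote(bounds...))

xmin, xmax, ymin, ymax = float_bounds(-3, 3, -3, 3)

# With Gadfly loaded, the call from this comment would then read:
# plot((x, y) -> x*exp(-x^2-y^2), xmin, xmax, ymin, ymax)
```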

timholy commented 10 years ago

The method error was my fault. Fixed in 7623c6e517534844288283bc59f1c103d88ba79b. For the rest, others will have to comment.

tomasaschan commented 10 years ago

How does one plot contours for a matrix (i.e. an already sampled function)?

tomasaschan commented 10 years ago

Also, I found another issue: when plotting more complicated functions, which have several contour lines for the same level, they are connected by a line that shouldn't be there. It's somewhat non-trivial to define a function that makes this obvious, but I did manage to find an example that illustrates the point:

julia> plot((x,y) -> x*exp(-(x-int(x))^2-y^2), -8., 8, -2., 2)

[image: esoteric]

I haven't looked into this in any detail - if this is somehow due to the output from Contour.jl, please let me know and I'll fix the problem from there.

darwindarak commented 10 years ago

Plotting the output from Contour.jl individually with the same sampling as Gadfly (15 levels, 150 sample points in both directions) seems to work without the stray lines.

[image: plot_order]

darwindarak commented 10 years ago

Found it! The contour lines were grouped by color so different lines ended up being joined. Using the group aesthetic now gives:

[image: grouped]

I'll add a PR.

dcjones commented 10 years ago

Darwin added support for plotting contour plots from matrices, using the z aesthetic:

plot(z=M, Geom.contour)

It looks like we have the essential functionality now, is there anything else that's lacking?

darwindarak commented 10 years ago

I'll add the documentation tonight!

tomasaschan commented 10 years ago

@dcjones and @darwindarak, you guys rock my socks! =)

I suppose that also means I can get the x and y axes scaled correctly by doing something like plot(x=myx, y=myy, z=myz, Geom.contour)? What types should myx and myy be?

johansigfrids commented 10 years ago

If you do plot((x, y) -> x^2 + y^2, -10,10,-10,10) there is still some weird line connecting going on.

darwindarak commented 10 years ago

@johansigfrids Which version of Gadfly are you using?

darwindarak commented 10 years ago

@tlycken myx and myy should be the same type as z, which for now only takes FloatingPoint, but that's a limitation from Contour.jl.
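
A minimal sketch of preparing inputs that satisfy this, in current Julia syntax (where FloatingPoint has since been renamed AbstractFloat). The sample values are made up, and the z layout of length(myx) rows by length(myy) columns is an assumption based on the convention discussed earlier in the thread.

```julia
myx = 1:4                                # integer-valued grid coordinates
myy = 1:3
myz = [i * j for i in myx, j in myy]     # Matrix{Int}, length(myx) × length(myy)

# Convert everything to Float64 before handing it to the plotting call.
fx = collect(Float64, myx)
fy = collect(Float64, myy)
fz = Float64.(myz)

# With Gadfly loaded, the call from the thread would then read:
# plot(x=fx, y=fy, z=fz, Geom.contour)
```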

johansigfrids commented 10 years ago

@darwindarak 0.3.3

dcjones commented 10 years ago

I haven't tagged a new version since the last few fixes. I'll do that soon, but if you want the changes now you can use Pkg.checkout("Gadfly") to track the latest commit in master.

johansigfrids commented 10 years ago

Ok, it works on the latest master. :)

tcovert commented 10 years ago

Is there a way to add labels directly to the contour lines? Something similar to the output of MATLAB's "contour", described here: http://www.mathworks.com/help/matlab/ref/contour.html

The color-coded legend is great for computer screens but might be challenging for print publication.

dcjones commented 10 years ago

There's not currently a way to label the contours directly. That would be very nice to have.

houshuang commented 9 years ago

This doesn't seem to work with layer? I am trying to plot a contour plot, and position a single point onto it.

This works

plot((x, y) -> sqerror(gen_function(x, y), df), -1, 2, -1, 2, Stat.contour(levels = 1000))

But this does not

plot(layer((x, y) -> sqerror(gen_function(x, y), df), -1, 2, -1, 2, Stat.contour(levels = 1000)),
     layer(x=[exp], y=[intercept]))

Complains that there is no method layer matching Function, Int, Int, Int, Int, ContourStatistics.

(just plot(x=[exp], y=[intercept]) works fine)

tomasaschan commented 9 years ago

That seems to be missing functionality - @darwindarak can probably fix it, but I happen to know he has a lot of other things on his hands at the moment so it may take a while.

In the meantime, try plotting a layer built from a matrix rather than from a function:

plot(layer(x=xgrid, y=ygrid, z=functionvalues, Geom.contour))

That works for me.
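
If you only have a function, the matrix for this workaround can be built by sampling it on a grid first. The sketch below uses current Julia syntax; `sample_grid` is a hypothetical helper, the test function stands in for the sqerror expression above, and the default of 150 samples mirrors the sampling mentioned earlier in the thread rather than a verified Gadfly default. It assumes z is laid out as length(xgrid) rows by length(ygrid) columns.

```julia
# Hypothetical helper: sample f on a rectangular grid so that
# z[i, j] == f(xgrid[i], ygrid[j]).
function sample_grid(f, xmin, xmax, ymin, ymax; samples=150)
    xgrid = collect(range(float(xmin), float(xmax), length=samples))
    ygrid = collect(range(float(ymin), float(ymax), length=samples))
    z = [f(x, y) for x in xgrid, y in ygrid]
    return xgrid, ygrid, z
end

xgrid, ygrid, functionvalues = sample_grid((x, y) -> x^2 + y^2, -1, 2, -1, 2; samples=50)

# With Gadfly loaded, the matrix-based layer can then be combined with a point layer
# (px and py are placeholder coordinates):
# plot(layer(x=xgrid, y=ygrid, z=functionvalues, Geom.contour),
#      layer(x=[px], y=[py], Geom.point))
```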

weymouth commented 9 years ago

You can get that to work for multiple contour plots? It doesn't work for me.

tomasaschan commented 9 years ago

@weymouth: No, that's true. I can only do it with a single contour layer at a time (and that bugs me too...).

tbreloff commented 8 years ago

Sorry to dredge up an old thread, but I wanted to see what's currently possible before I do too much work myself, in case there's been progress that I don't know about.

Is it currently possible to:

I plan on supporting all that with Plots... just don't want to duplicate any work if it's already working.

cc: @tlycken @darwindarak

tomasaschan commented 8 years ago

@tbreloff The underlying contour calculations are done by Contour.jl, which currently only supports (rectangularly) gridded data. I see no reason why it shouldn't work with sparse matrices, but it will probably be very inefficient - there are no optimizations in place for it, so the algorithm will index into each location - even the zero ones - at least once.

Regarding the other features, I don't think there's any work going on to support them at the moment, but @darwindarak probably knows better; I know I've seen requests for support of layering contours (earlier in this thread, for example) but I don't know if anything has actually happened.

Doing the actual plotting of these things is quite some way outside of my comfort zone, and thus I haven't invested any time whatsoever in it. I'm merely curating the Contour.jl package because packaging was a little outside of Darwin's at the time when Contour.jl was created :)

darwindarak commented 8 years ago

@tbreloff: I'm pretty sure most of the features you've listed are either not implemented yet or need a much better implementation.

tbreloff commented 8 years ago

Leave the plotting part to me ;)

In regards to sparse matrices, I was actually thinking that the missing values would be unknown (not zero), so the dense algorithm wouldn't work. I suppose the question is better phrased then as whether something like sparse NullableArrays would be supported. This is roughly the same problem as "contouring irregular data" though, so I'd probably solve that first and get the NullableArrays solution for free.

For filling the contours... I added a new feature yesterday to Plots for custom markers, and I was thinking I could just overlay a bunch of custom polygons with each contour section defining the custom shape of each marker. Then I'd be using Geom.shape (which is not yet part of Gadfly, but can be if requested) to plot the contours with or without fill.

I'm open to other ideas, and would love collaboration on a general algorithm to build contours from irregular data.