JuliaPlots / InspectDR.jl

Fast, interactive Julia/GTK+ plots (+Smith charts +Gtk widget +Cairo-only images)
MIT License

Comparisons #1

Open tbreloff opened 8 years ago

tbreloff commented 8 years ago

Hi @ma-laforge welcome to the plotting club. 😎 I'm curious what sort of work you focus on that something like GR didn't work for you. Have you done any benchmarks of speed/response? I'd love to hear your take on the current plotting selections.

I think it's great to solve real-time, time-based, interactive plotting. I'll keep an eye on the package, and maybe it might make sense at some point to think about adding it as a Plots backend. Best of luck!

ma-laforge commented 8 years ago

Hi @tbreloff,

Part of Plots: Cool :).

Targeted application As described on the README, I often need to look at simulation results, in order to understand circuit behaviour.

I find that being able to quickly/easily pan/zoom into your data is essential to figuring out what's going on during the design process. This helps in quickly isolating the condition where undesired behaviour is happening.

Did I do any benchmarks? Yes, but they were coarse. From my experiments, GR was definitely the fastest solution. Once the module is loaded in Julia, GR takes about 5 sec to plot around 16 plots of minimal/moderate complexity, where most others take on the order of 40 sec.

The reason for this slowness varies from tool-to-tool:

- PyPlot/PyCall: ~20s to load. Slow with moderate datasets (~200k points).

- Qwt/PyCall (Raybaut): ~10s to load. Very responsive with moderate datasets (~200k points)... But Qwt is very slow when plots consist of many datasets (1k datasets used for eye diagrams) - even if the total number of points is not very large (~40k points).

- Grace/xmgrace: ~5s to load (not bad). Sadly, sending data as text is not practical for large datasets (would have to figure out netcdf interface).

- Plots.jl/GR.jl: ~15s to load. Very fast once loaded, but only supports a single plot window at a time... and has pretty much zero interactivity (ability for the user to pan/zoom, etc.)

With my sample plots, I found InspectDR to have roughly the same performance as GR.jl... though a proper apples-to-apples experiment has not yet been set up.

Interactivity

- PyPlot is somewhat ok for interactivity. Pan/zoom is somewhat responsive, but I find it breaks my concentration to have to click on icons to toggle from pan to zoom, etc. These operations are also a bit more laggy on my system than what I can achieve with InspectDR.

- Qwt plots (wrapped by Pierre Raybaut) are much better for interactivity. Pan/zoom seems just as responsive as with InspectDR (middle button for pan / RMB for zoom)... and it does not require having to search for toolbar buttons. I have not yet figured out how to get nice-looking plots from Qwt (saved images look more like a screen grab). I also find the mouse bindings a bit awkward to use... and I miss having a way to zoom out by two, or zoom out to full, with a simple key press.

- Grace/xmgrace plots are only interactive when compared to GR... but it can generate nice publication-quality plots!

More on load times Load times are annoying while writing new analyses because I often have to restart Julia.

I find pre-compiled modules don't help much here. I believe it is because many of the functions are not of concrete types... and so don't precompile until actually used.

Even Gtk/Cairo seems a bit slow to load up with the functions I use in InspectDR. I found the best solution is to not depend on too many modules.

Issues with GR Besides non-existent interactivity & current inability to display multiple plots...

Writing to PNG is slow, for some reason (much faster to write to SVG). This is very uncharacteristic of GR.

At the end of the day, I decided that GR's current architecture makes it impractical for me to write a GUI wrapper around GR in an attempt to make it more interactive.

MA

tbreloff commented 8 years ago

This is really great. Thanks for writing it up! I might link to this from the "backends" page of the Plots docs... Would you be ok with that?


ma-laforge commented 8 years ago

I have no problem with that, but also feel free to write it in your own words.

I like your section on backends - especially the table "For the impatient". Choosing a plotting backend would have been more pleasant a few months ago if I had had this information.

MA

timholy commented 8 years ago

Just FYI, GtkUtilities.jl contains generic panning and zooming facilities, and Immerse.jl offers some Gadfly-based interactivity. Seems like you already have what you want here, but if not, just FYI.

ma-laforge commented 8 years ago

Hi @timholy

GtkUtilities Yes... I noticed that just recently (after I made my own rubber-band/box zoom using right-mouse-button drag).

I have not yet implemented a "minpixels" on my box zoom... so I am contemplating switching over to GtkUtilities when I improve it.

Reservations I am always a bit hesitant to add dependencies to my modules (unless they are really necessary). As I was saying to @tbreloff, it appears to me that dependencies slow down load-up times significantly - even when they are precompiled... If I am not mistaken, this is because Julia developers are encouraged to write generic code, and only code with concrete types gets pre-compiled (unless someone takes the time to list which specializations are to be pre-compiled).

Gadfly I have been shying away from Gadfly in the past year or so due to speed issues. If I remember correctly, Gadfly was noticeably slower to plot than PyPlot - even once loaded.

After re-testing with a recent version of Immerse, I am not sure it is as bad as I remember it - at least not any more.

Immerse+InspectDR I suppose I could try to make InspectDR into a new backend for Immerse. The code for InspectDR is layered in such a way to make this plausible.

At the end of the day, I wanted to get InspectDR built up as quickly as possible, so I opted for a solution where I did not have to figure out how to integrate with Immerse.

Immerse+Gadfly I am a bit surprised. This solution appears faster than what I last remember. For small datasets, the main issues that remain are now compile time & load up time.

I like that you can double-click instead of scrambling for the "1" button every time you need to zoom out to full.

However, after using 3D tools like HFSS/EMPro/Sketchup/..., I don't understand why we can't use efficient/"standard" pan/zoom methodologies with 2D plots.

Immerse Bug? FYI: I was not able to get my mousewheel to trigger any panning on my system - as described on the GtkUtilities page.

Immerse compile time Although really a 2nd order issue: It is somewhat annoying how long you have to wait to use Immerse after certain package updates.

Immerse load up time On my system, I get the following load times when I first include("test/gadfly_panzoom.jl"):

Comparing user experience Sadly though, when I bump up the number of points in gadfly_panzoom from 10^4 to 200_000, the user experience starts to degrade. The lag starts to be very noticeable when I zoom into the data. I can only imagine what will happen when datasets start to actually be large.

My (admittedly coarse) benchmarks indicate that InspectDR has comparable performance to GR.jl. I don't know if you have experience with GR - but I think @tbreloff can confirm that it is quite atypical! For a simple apples-to-apples comparison with InspectDR, you can use the following code:

```julia
using InspectDR

y = rand(200_000)
p = InspectDR.Plot2D()
add(p, collect(1:length(y)), y)

display(InspectDR.GtkDisplay(), p)
```

Regards,

MA

timholy commented 8 years ago

dependencies slow down load up times significantly - even when they are precompiled

Interesting. If one can actually put numbers on that, it could be quite an interesting discovery. It's a bit of a subtle question because there are really two issues: one is precompiling for specific types, and the other is caching the lowered code. For the latter, I can't think of a reason why splitting things into two packages should make it slower to load (unless there is a bug, which perhaps there is). For the former, there is indeed an issue: julia won't save machine-compiled code that spans modules. However, this should not affect load time, only latency to first execution.

I suppose I could try to make InspectDR into a new backend for Immerse...At the end of the day, I wanted to get InspectDR built up as quickly as possible, so I opted for a solution where I did not have to figure out how to integrate with Immerse.

I don't recommend trying to integrate with Immerse; Immerse is tightly interwoven with Gadfly, but current Gadfly is going through some growing pains due to apparent loss of maintainer. Unless the Gadfly situation changes, Immerse may have to change course someday or just rust away. So you may have been very wise to strike out on your own.

FYI: I was not able to go get my mousewheel to trigger any panning on my system - as described in the GtkUtilities page.

Hmm, that's strange. You zoomed in first, right?

Although really a 2nd order issue: It is somewhat annoying how long you have to wait to use Immerse after certain package updates. & slow performance on plotting

Yep. These are basically problems with Gadfly---Immerse is just an interactivity wrapper around Gadfly.

tbreloff commented 8 years ago

Tim: Can you expand on this comment (or point to existing docs/issues):

For the former, there is indeed an issue: julia won't save machine-compiled code that spans modules.


ma-laforge commented 8 years ago

there are really two issues: one is precompiling for specific types, and the other is caching the lowered code. For the latter, I can't think of a reason why splitting things into two packages should make it slower to load (unless there is a bug, which perhaps there is). For the former, there is indeed an issue: julia won't save machine-compiled code that spans modules.

Unfortunately, this is beyond my understanding of Julia's internals at this time. I do not think I can add much here.

However, this should not affect load time, only latency to first execution.

Agreed. Poor choice of words.

Hmm, that's strange. You zoomed in first, right?

Yes. Using Linux / Julia v0.4.2 / Immerse 0.0.11

Yep. These are basically problems with Gadfly---Immerse is just an interactivity wrapper around Gadfly.

I figured that might be the case.

timholy commented 8 years ago

@tbreloff, to clarify, I'm only talking about what happens if you have explicit precompile statements as part of your module code, or if you have defined an __init__ function that "exercises" some of the code in the module. If you have neither, then basically the cached module will not contain any machine code. It will still have lowered code (effectively a reduced form of the original source code), and that greatly speeds loading. But execution requires generating machine code. This is why Gadfly is relatively fast to load (compared to where it started before package precompilation) but still very slow to produce the first plot---it's that "generate machine code" (JITing) part that's slow.

Now, Gadfly happens to contain a file that issues a really large number of precompile statements (added by yours truly, via SnoopCompile.jl). This helps reduce the time to produce the first plot, by quite a lot (EDIT: as long as you're calling functions with the same types that were precompiled). The limitation is that, despite the large number of these statements, they are far from complete. Gadfly uses a lot of functions that are defined in Base (e.g., push!). Out of concerns about load time, precompiled files currently don't store machine code for functions defined in one module that are used by another. In other words, if there's a function gadfly_method that's specific to Gadfly, you can (with a suitable precompile statement) get the precompiled package to contain machine code for this function (for the particular input types you specify). But Gadfly.ji won't contain push! methods specialized for types that are defined in Gadfly, because those span modules and thus won't be cached.

Hope that helps clarify the situation. Of course, the joy of open source means that things could be changed, but this is how things work now.
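For illustration, here is a minimal sketch of that distinction (MyPkg and Point are hypothetical names; the caching behavior noted in the comments is as described above for the Julia versions of this era):

```julia
module MyPkg

# A type owned by this module.
struct Point
    x::Float64
    y::Float64
end

# `dist` is defined here, so a precompile statement can cache its
# machine code in MyPkg.ji for this signature.
dist(p::Point) = sqrt(p.x^2 + p.y^2)
precompile(dist, (Point,))

# `push!` is defined in Base: this specialization spans modules, so
# (per the behavior described above) it would not be cached in MyPkg.ji.
precompile(push!, (Vector{Point}, Point))

end
```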

tbreloff commented 8 years ago

Thanks for writing that out Tim... I certainly learned something today!


ma-laforge commented 8 years ago

Thanks for the crash course. I will try to comment now.

If you have neither [precompile statements or __init__ function], then basically the cached module will not contain any machine code. It will still have lowered code (effectively a reduced form of the original source code), and that greatly speeds loading.

Really? Not even when the types are concrete? That might affect how I code in the future.

[...] SnoopCompile.jl

I will have to learn how to use this soon.

But Gadfly.ji won't contain push! methods specialized for types that are defined in Gadfly, because those span modules and thus won't be cached.

Could this issue not be circumvented by defining a local function _push!, then creating an alias:

```julia
function _push!([ARGLIST_i])
    #Demanding algorithm would benefit from precompilation.
end

Base.push!([ARGLIST_i]) = _push!([ARGLIST_i])
```

This potentially adds a level of indirection in Julia, but should be acceptable for a large class of applications.
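To make the schematic concrete, here is a runnable version of the idea (MyList and its field are hypothetical stand-ins for a type owned by a package such as Gadfly; this only illustrates the indirection, not a measured benefit):

```julia
# Hypothetical container type, standing in for a type defined in our
# own package (the way Gadfly defines its own types).
struct MyList
    items::Vector{Float64}
end

# Local function owned by this module. Its specialization could be
# cached via `precompile(_push!, (MyList, Float64))`, since `_push!`
# does not span modules.
function _push!(l::MyList, x::Float64)
    push!(l.items, x)  # the "demanding algorithm" would live here
    return l
end

# Thin forwarding method extending Base.push!; trivial to JIT at first use.
Base.push!(l::MyList, x::Float64) = _push!(l, x)
```

(Note: `struct` is Julia 0.6+ syntax; at the time of this thread one would have written `immutable`.)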

As for the original discussion

For the latter [caching the lowered code], I can't think of a reason why splitting things into two packages should make it slower to load

Agreed (at least not a direct reason). Rather, I believe what happens is that package builders intrinsically try to provide generic functions (which is good) - and might not want to precompile too many specializations, so as to not hinder load times.

This leaves the module user (client module) without machine code for its required specializations... unless there is a way for the client module to induce precompilation of the functions from the "service module".

Ok, but you still have to compile that code if it is in your own module...

Sure, but you often build much simpler structures, and Julia has to pick from much fewer function signatures... not to mention it no longer has to resolve all the indirect function calls (which can only be determined once the types are known).

Another observation As far as I can see, the precompilation process is significantly slower than Julia's default "on-the-fly" compilation.

That means that when I am building "complex" software composed of multiple modules, precompilation significantly slows down the development process: editing one of the base (i.e. "service") modules triggers a very slow chain of pre-compilation. Testing moves a lot faster when precompilation is disabled on all modules under development... but then I forget to re-enable precompilation.

timholy commented 8 years ago

Really? Not even when the types are concrete?

Right. It won't generate native code unless you make it. You can check this with

```julia
__precompile__()

module CC

bar(x::Float64) = 2x

precompile(bar, (Float64,))

end
```

and then look at the *.ji file with a text dump (e.g., less). You'll only see machine code (look for mul float) if you have that precompile statement in it---comment that line out and you'll see the machine code disappear. Declaring the type of x in bar doesn't change that at all; there is no benefit whatsoever to making your code less generic.

Demanding algorithm would benefit from precompilation.

What do you mean by "demanding"? The algorithm won't run any faster; the only issue is whether you have to wait for it to compile the first time you use it. The second time you use it, it will be the same whether it was precompiled or not.
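This compile-once behavior is easy to observe in a fresh Julia session (a minimal sketch; `heavy` is just a hypothetical stand-in for an expensive-to-compile function):

```julia
# First call pays the JIT-compilation cost; subsequent calls do not.
heavy(x) = sum(abs2, x)  # sum of squares

data = rand(10^6)

t_first  = @elapsed heavy(data)  # includes compile time
t_second = @elapsed heavy(data)  # pure runtime
# In a fresh session, t_first is typically much larger than t_second,
# even though the generated machine code is identical.
```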

If your local _push! just calls push!, there won't be much benefit to defining it.

As far as I can see, the precompilation process is significantly slower than Julia's default "on-the-fly" compilation.

Yes. On julia-0.5, you can run it as julia --compilecache=no to prevent it from compiling things while you're developing.

ma-laforge commented 8 years ago

Right. It won't generate native code unless you make it. [...]

Excellent. Thanks for showing me this.

I wonder: why doesn't Julia automatically precompile functions with concrete signatures? Is the idea to anticipate people leaving dead code in their modules?

What do you mean by "demanding"? [...]

I meant any function that will take noticeable time to compile down to machine code... as opposed to the trivial compilation of push!([ARGLIST_i]) = _push!([ARGLIST_i]), which could quickly be performed @ first execution. (Hopefully Julia can even optimize away the addition to the call stack imposed by this indirection).

I must admit, though: many implementations of push! might not require much more machine code than required to call push! on a subfield. That being said, doesn't Julia's unwillingness to precompile push! inhibit the pre-compilation of functions that use this push!? Hmmm... I might have to do more tests to learn more...

Yes. On julia-0.5, you can run it as julia --compilecache=no to prevent it from compiling things while you're developing.

Excellent. That is good to know.

skanskan commented 7 years ago

How does it compare to GLVisualize?

ma-laforge commented 7 years ago

When I first developed InspectDR, I might have been unaware that GLVisualize existed (but I believe I was unable to get it running at that time).

Here are the differences as far as I can tell right now:

Caveat: I should probably give GLVisualize another try some time soon. I am not fully aware of all of its capabilities.