JuliaPlots / Plots.jl

Powerful convenience for Julia visualizations and data analysis
https://docs.juliaplots.org

Reorganizing optional package loading #918

Closed (mkborregaard closed this issue 4 years ago)

mkborregaard commented 7 years ago

The two main issues plaguing Plots are, and have been for a long time, 1) slow loading time and time-to-first-plot, and 2) instabilities (world age, precompilation, the "Media" issue) due to conditional package loading. (There is a third, lesser issue about the big git history that makes this incompatible with e.g. JuliaPro at the moment, but we hope to resolve that with Pkg3.) Addressing these two must be on our roadmap.

This issue is to discuss how to deal with the second, but the solution could also markedly improve the first. In a recent discourse thread (https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294) involving some of the people who know a lot about Julia's package system, there were a number of different suggestions that nicely sum up the various ideas that have been on the table.

  1. Use Requires.jl to conditionally load packages, instead of the current @eval structure. Doing this for Juno seems to work for the Media issue https://github.com/JuliaPlots/Plots.jl/pull/916 (but it doesn't work at the moment because of the world-age issue). @pfitzseb is looking into the scope for extending this to backend loading.
  2. Reorganise so all backend-specific code is in its own module that depends on the backend. @tbreloff initiated a reorganization of Plots along these lines (found on the reorg branch) and there exists a PlotsGR package to test this out. Something like this was also suggested here (e.g. https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294/20 )

There were also a number of other ideas, e.g. https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294/9 and https://discourse.julialang.org/t/optional-dependencies-requires-jl/3294/24 , though I'm not completely sure how those would solve the issue.

Finally, @vchuravy has made a follow-up PR on the discourse discussion https://github.com/JuliaLang/julia/pull/21743 which might deliver the infrastructure we need in 0.7 or 1.0.

I think resolving this should be a high priority.

cc @tbreloff @pkofod @jheinen @daschw @Evizero @ChrisRackauckas @pfitzseb

ChrisRackauckas commented 7 years ago

I think this old discussion with @tbreloff about lazy loading is worth bringing up:

https://github.com/JuliaLang/Juleps/issues/2#issuecomment-257608059

Requires is a great solution that I think might be able to handle the lazy loading too. If backend initialization code (https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends/gr.jl#L53) is in an @require block, it shouldn't be loaded until the user does using X. I think you'd just need to change

https://github.com/JuliaPlots/Plots.jl/blob/master/src/backends.jl#L20

to

# package_name(:gr) == :GR
$sym(; kw...) = begin
    default(; kw...)
    @eval import $(package_name(sym))
    backend(Symbol($str))
end

so that way it's essentially doing

using Plots
import GR

plus the backend setup. That should make most of Plots.jl precompile just fine and it should be pretty fast? I think you'll need to make a dummy library for Plots.jl's internal Plotly though, since it's not the same as Plotly.jl. Maybe Requires can act on the importing of submodules (@MikeInnes ?)?
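For concreteness, the @require pattern mentioned above would look roughly like this (a sketch only, using the @require Module begin ... end syntax of the Requires.jl of that era; the included file path and the _initialize_backend call are placeholders, not actual Plots code):

using Requires

@require GR begin
    # This block runs only once the user does `using GR` / `import GR`,
    # so the backend initialization is deferred until it is needed.
    include("backends/gr.jl")
    _initialize_backend(GRBackend())
end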

vchuravy commented 7 years ago

The Requires.jl approach is not quite compatible with precompilation since it is dynamic and happens at load time. Requires.jl can only act on top-level modules as of now, and that is a limitation of how the loading hooks are implemented in Base.

I hope that I can finish the conditional modules infrastructure for 0.7, but I have no time to polish that due to JuliaCon.

MikeInnes commented 7 years ago

@vchuravy is 100% right, but it's also worth pointing out that a lot of precompilation is parsing. For dynamic code, type inference won't buy you much, so Requires isn't likely to add a huge overhead vs. full precompilation.

OTOH, I haven't generally used Requires for more backend-like integrations (like Flux/TensorFlow), because if the user explicitly sets the active backend you can just load the required code then. If you need to do using and then set the backend, that's just an extra step.

mkborregaard commented 7 years ago

I just saw @StefanKarpinski's talk on Pkg3, and it seems he said that conditional modules will not be coming, in the sense we need them here. This would seem to point to the reorganisation numbered "2" above. Any thoughts on this?

mpastell commented 7 years ago

I vote for option 2. I think it would solve a lot of problems and wouldn't affect usability, at least for me. It would also possibly make it clearer that some of the backends are less mature (e.g. maybe not tagged at all).

piever commented 7 years ago

Personally I'm in favor of option 2. I think the most straightforward solution is to have a package PlotsGR which extends some methods of Plots in order to display a plot that was created with Plots.

The key question, in my view, is understanding how to split duties between the Plots and PlotsGR packages. As of now, I think the situation is not ideal. IIUC, what happens is that Plots creates a plot object (which is basically an Array of Dicts with all the different attributes, one Dict per series), transforming a high-level command from the user into very specific, low-level details of how the plot should be drawn (ticks, margins, colors, line widths and so on...). The backend code then has a monolithic display function that implements all of these attributes and outputs a plot. For example, gr_display implements, in about 600 lines, how to display a subplot.

This approach has the following issues:

What I believe is an extra source of confusion is the fact that not all display functions are implemented in the same way. For example, plots could either be supported natively or via a recipe, depending on the backend. However, if the plot is not completely low level (say, drawing a line or a shape with given coordinates), backend implementations may differ. Bar plots are probably a good example of this problem, as there is a recipe to draw a bar plot as a shape plot, but PlotlyJS just uses native bar plots (leading to this kind of issue).

What could potentially help would be the following:

We could choose one example backend, for example GR, and decompose the gr_display function into a large number of small functions which, little by little, build up the plot being created. We should rely as little as possible on native series support; for example, let's not assume that something is supported if it can be decomposed further (e.g. a bar plot). In that respect I believe GR is pretty minimalistic and we should go in this direction; IIUC it only supports natively:

    :path, :scatter,
    :heatmap, :pie, :image,
    :contour, :path3d, :scatter3d, :surface, :wireframe,
    :shape

All of this logic would then live into Plots.jl, where one would have something like:

function display(sp::Subplot, w, h, viewport_canvas)
    set_font!(sp, ...)
    draw_grid!(sp, ...)
    ...
end

These small functions (in the example way of coding display, set_font! and draw_grid!) would then have specialized methods defined in PlotsGR:

function set_font!(sp::Subplot{GRBackend}, ...)
    ...
end

and would represent the "backend interface". Then, to update an existing plot, one could call one of these methods rather than displaying the whole thing. To develop a new backend, one would only need to implement all of these small functions.

The drawback of this plan is that it is a lot of work, so maybe we could start some PlotsBase + PlotsGR repos (while leaving Plots alone) and see how far that gets.

SimonDanisch commented 7 years ago

@piever all you say sums up a lot of the problems I had with Plots.jl.

The drawback of this plan is that it is a lot of work, so maybe we could start some PlotsBase + PlotsGR repos (while leaving Plots alone) and see how far that gets.

Well, I've done quite a bit of that work in MakiE already and would love to have more help and use this as a prototype to see how we can structure Plots.jl in the future!

Let me know how I can get people started! Now that quite a lot of the basic functionality is working, I think MakiE is ripe for another thorough documentation step (especially for the internals)!

piever commented 7 years ago

I'm also happy to try out (and contribute to) this reorg idea in MakiE.jl. Things that would help me the most are:

The last point is basically what I was mentioning above. I believe it's crucial to have a backend independent representation of a plot object (in Plots.jl it is an array of dictionaries, I believe we'd want a richer structure if some attributes have to be linked to others): figuring out this representation would in my view be a major step forward.

The next step would be figuring out how to structure a display function that goes hand in hand with the plot representation (i.e. where the "isolated components" of the plot have corresponding small update functions) in an abstract way, independent of GLVisualize. Finally, one would add the GLVisualize implementation of the "interface" functions used to display the plot.

With this refactor I think MakiE would get much closer to the conceptual organization of Plots and it'd be much easier for Plots contributors to also contribute to MakiE.

ChrisRackauckas commented 7 years ago

What I've been recommending is that:

  1. each backend gets a type for which there is a Backend supertype in Plots
  2. any function that is currently overwritten instead gets a dispatch on this backend type
  3. On the top-level functions, this should be an optional argument at the end of every interface function, defaulting to a global CURRENT_BACKEND.
  4. The functions which have this dispatching should be documented as the formal interface.
  5. As @piever suggests, internally used types should just be prototypes, so that there's a SubPlot supertype that's used in dispatch, but this lets GR define its own GRSubplot that is a backend-specific (and type-stable) store of its data.

This will all compile well and lets things run much smoother than dictionaries, yet it should make it easy for Plots.jl to specify logic at a high level and let the packages handle the specifics.
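A minimal sketch of that scheme, with purely illustrative names (Plots does not currently define this interface):

abstract type AbstractBackend end          # the supertype living in Plots (point 1)
struct GRBackend <: AbstractBackend end    # each backend gets its own type

const CURRENT_BACKEND = Ref{AbstractBackend}(GRBackend())

# Interface functions dispatch on the backend type (points 2 and 4).
render(plt, ::AbstractBackend) = error("backend does not implement render")
render(plt, ::GRBackend)       = println("rendering with GR: ", plt)

# Top-level calls default to the global backend (point 3).
render(plt) = render(plt, CURRENT_BACKEND[])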

SimonDanisch commented 7 years ago

This will all compile well and lets things run much smoother than dictionaries, yet it should make it easy for Plots.jl to specify logic at a high level and let the packages handle the specifics.

I'd like to differ... I prototyped a completely typed pipeline in Visualize.jl and I got terrible compile times and quite a bit of implementation complexity - all with almost no performance improvements. Performance-critical functionality happens somewhere else anyway, so I'd like to wait for Julia's static compilation and JIT performance to mature...

What needs to happen though is a proper definition of what attributes are expected, how they get transformed, where default values come from and a definition of invalid attributes - which is, as I remember it, a mess in Plots.jl right now.

I will write more in the documentations about how I plan to structure things and the integration with Plots.jl! I have quite a few ideas about this and quite a bit is already implemented :)

jheinen commented 7 years ago

The drawback of this plan is that it is a lot of work, so maybe we could start some PlotsBase + PlotsGR repos (while leaving Plots alone) and see how far that gets.

Platform-, language- and device independent graphics is always "a lot of work". We will contribute where we can, but we are still in the process of improving the deployment of GR for both Julia and Python, which IMO is the hardest job.

ChrisRackauckas commented 7 years ago

@simon if you know the common type signatures you can force them to precompile though.

SimonDanisch commented 7 years ago

@ChrisRackauckas yeah, kind of, but the other points are still true as well. And precompilation doesn't actually do much, so unless we cache the binary representation we will still have huge JIT overheads. Also, it's pretty hard not to start off type unstable: the user interface relies on kw_args, and if it doesn't, you get a lot of type permutations (e.g. with plot(x, (:attribute => a, :attribute2 => c)); note that this won't actually be type stable right now, since the getindex return type can't be inferred from the key type). And if we actually use concrete types, you will still need kw_args. Have a look at FieldTraits for a fully type stable version of this.

Because of that I don't see this as a high priority anymore...
It seems perfectly fine to rely on Dicts for construction, it just seems important to use a type barrier for processing operations and the backend should use fully typed functions in the rendering code. Which is the case in the design of MakiE!
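As a rough illustration of that "type barrier" idea (all names made up, not MakiE or Plots code): attributes can stay in an untyped Dict during construction, as long as the real work happens behind a function call whose arguments are concretely typed:

attrs = Dict{Symbol,Any}(:linewidth => 2.0, :npoints => 100)

# The barrier: once values are pulled out of the untyped Dict and passed as
# arguments, this function compiles specialized, type-stable code.
function draw_series(x::Vector{Float64}, linewidth::Float64)
    # fully typed rendering work would happen here
    return sum(x) * linewidth
end

draw_series(rand(attrs[:npoints]), attrs[:linewidth])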

ChrisRackauckas commented 7 years ago

We could call a test suite of sorts in the top level of the module to force it to precompile a lot more.

Of course it won't be type stable at the top-level call either. I even recommended using a default global there. That doesn't matter; what matters is the repeated calls further down. That's what I was advocating, and it seems we agree except for types vs. dicts in the construction. I find the code in Plots that builds a big dict harder to read anyway, since there is no layout describing what should be in there. The fields could be left untyped or only typed when obvious, but I think structuring it into types has this readability advantage as well.

SimonDanisch commented 7 years ago

since there is no layout describing what should be in there

That's exactly what I agree with! And I think I found a relatively nice way to work around that!

SimonDanisch commented 7 years ago

I basically created a macro that defines, for a specific plot, what an attribute should get converted to and which attributes are allowed, and it does proper error handling for invalid attributes. It also documents all attributes semi-automatically!
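To make the idea concrete, here is a rough, non-macro sketch of that kind of attribute table (purely illustrative; not the actual MakiE macro, which additionally handles documentation):

# Allowed attributes for a hypothetical scatter plot, each mapping to a
# conversion function; anything else is rejected with a descriptive error.
const SCATTER_ATTRIBUTES = Dict{Symbol,Function}(
    :markersize => x -> Float64(x),
    :color      => x -> string(x),
)

function convert_attributes(table, attrs)
    out = Dict{Symbol,Any}()
    for (k, v) in attrs
        haskey(table, k) || error("invalid attribute $k; allowed: $(collect(keys(table)))")
        out[k] = table[k](v)   # apply the declared conversion
    end
    return out
end

convert_attributes(SCATTER_ATTRIBUTES, Dict(:markersize => 3, :color => :red))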

timholy commented 7 years ago

We could call a test suite of sorts in the top level of the module to force it to precompile a lot more.

You can automatically generate precompile statements with SnoopCompile. But the first sentence of https://github.com/timholy/SnoopCompile.jl#userimgjl is quite important: what gets saved to *.ji files is a pretty small fraction of the compilation work: it doesn't store the machine code (i.e., the output of LLVM), only the result of inference. Moreover, IIUC a signature like push!(::Vector{RGB{N0f8}}, ::RGB{N0f8}) would not be stored in precompiled form in any *.ji file (even if you explicitly precompile it) because it involves 3 different modules (FixedPointNumbers, ColorTypes, and Base). This is not a SnoopCompile limitation: this is just how precompilation currently works.

mkborregaard commented 7 years ago

This is an interesting discussion, and it will be exciting to see the improvements @SimonDanisch is making and whether they could be applied in Plots. But I'll say I like @ChrisRackauckas's suggestion a lot.

rafaqz commented 7 years ago

To add some feedback to this discussion: I gave a talk on julia to my department (biology/ecology) yesterday - who largely use R and C. The startup time for Plots and its backends (and probably the general perception that gave of julia) was singled out as the major drawback of switching to julia for R converts. Using RCall's plot() from julia in the same demo was incomparably faster and really rubbed it in. I think fixing this is critical for general adoption of julia by R users.

SimonDanisch commented 7 years ago

I can totally understand that! If I wasn't already invested in Julia this would also be a deal breaker to me :P After all, I'm the kind of person that once switched from firefox to chrome because it started 0.5s faster :D

...which is why I'm trying to get static-julia working with MakiE every other day.

timholy commented 7 years ago

One solution that is pretty close to working today is making Revise as good as it can be and keeping your Julia session open a long time (say, a week). At that point startup and JIT-compilation become basically irrelevant.

Revise can't really work properly on 0.6, but as of yesterday 0.7 seems to be mostly "Revise-ready." Need to merge https://github.com/timholy/Revise.jl/pull/49 and then it should work for basically any package (if it doesn't, it's a bug). Method deletion is also very, very close to working. That will leave changes to type definitions as the one reason you truly have to restart a Julia session. (Macros and generated functions might require some manual revise(mymodule) calls.) Since I find that I typically change my type definitions rarely, I think this is actually a pretty viable strategy. Revise even tracks Pkg.update() changes in a running session (we might have to get rid of that warning...).
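For context, the workflow being described is roughly this (a sketch; with the Revise of that era, Revise had to be loaded before the packages you want to track):

using Revise        # load first so subsequently loaded packages are tracked
using Plots         # edits to the Plots source tree are now picked up

plot(rand(10))      # pays the JIT cost once
# ... edit a method in Plots, or run Pkg.update() ...
plot(rand(10))      # runs the revised code in the same session, no restart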

SimonDanisch commented 7 years ago

I already work like that with Atom. And while it's nice, I still need to restart Julia more often than I'd like to (for whatever reason) - and every time really hurts! But even with just one start, on my weak laptop it's still insane, with compilation times to the first image ranging from 5-10 minutes!

timholy commented 7 years ago

Atom/Juno is an amazing solution, but AFAICT it isn't as low-level as Revise. Just try git stash and running your tests again, with debugging statements inserted, before you made that change that broke something. Or do a Pkg.update() to see if that change someone pushed last night to one of your dependencies fixes your problem. I'm quite serious when I say I'm shooting for keeping Julia sessions open for an entire week, and for that to happen I think you need to be able to track changes that come from any source, not just those that you typed into an IDE.

Overall I've started thinking about it this way: in C, your development cost is O(N) where N is the number of lines of code:

In julia, the numerical factor is much smaller (Julia is a much more efficient language to write code in than C), but unfortunately it may be O(N^2) in terms of its scaling: if you don't use something like Atom/Juno or Revise, then every line of code you change forces you to restart and test again, and each one of those cycles is O(N) because all of the code needs to be JIT-compiled. static-julia doesn't fix that (in fact it makes it worse, because the build time is longer), but Revise makes julia development truly O(N).

If you haven't tried it yet (perhaps because of current flaws in Revise), for big projects like GLVisualize/MakiE I can't even begin to describe how much nicer it is. For my own use I am just about to throw 0.6 under the bus because Revise should be so much more robust on 0.7. Of course, there are then a lot of packages I will need to fix before I can get real work done.

SimonDanisch commented 7 years ago

Well, static-julia is for shipping a package that you don't touch anymore - and I also want to get it going for offering Julia packages to other languages without having them wait for the JIT.

Maybe it's my fault, but I sadly have a pretty high fatal-error rate (especially with OpenCL/CUDA/OpenGL; stack overflows also sometimes terminate julia), and I change my types relatively often when prototyping stuff... OpenGL also forces me to restart because of some messed-up state of its horrible state machine.

Don't get me wrong, I think Revise is bliss for developers and I'm pretty sure I will start using it!

But we will also need to target another audience at some point: people who will never touch package code and want to run Pkg.update as infrequently as possible.

They just want to write their self contained scripts calling package code and restart Julia as they please ;)

timholy commented 7 years ago

Oh yes, for users more static compilation is the answer. Definitely :+1: :+1: for that.

pfitzseb commented 7 years ago

Atom/Juno is an amazing solution, but AFAICT it isn't as low-level as Revise. Just try git stash and running your tests again, with debugging statements inserted, before you made that change that broke something. Or do a Pkg.update() to see if that change someone pushed last night to one of your dependencies fixes your problem. I'm quite serious when I say I'm shooting for keeping Julia sessions open for an entire week, and for that to happen I think you need to be able to track changes that come from any source, not just those that you typed into an IDE.

In principle there's no difference in what you can do in Juno compared to Revise, afaict, but in Juno it's much more annoying to keep track of unlocalized changes like those from Pkg.update(). I certainly intend to look into making Revise.jl play nice with the Juno workflow sometime in the future :)

timholy commented 7 years ago

Right, once you decide you want to track changes that aren't entered into an editor, it just becomes a "library" operation that you trigger based on file modifications. No reason that logic can't live in Juno, but given that it no longer needs to be coupled to a particular editor, I think it makes sense to split it out.

Revise does check for Juno, and when I checked it seemed to work, but I haven't tested recently. (Once Gallium is working again I'm going to become a Juno user. I really want it :smile:.) Some day perhaps it would make sense to have Revise handle much of the state-tracking, but on 0.6 I wouldn't recommend making that kind of change, because Revise is far too reliant on dirty tricks. Hopefully this might be more sensible on 0.7.

MikeInnes commented 7 years ago

For large-scale changes across a module you can just C-shift-enter in the module's main file to get the same effect. Not automatic, but very robust and has been working since 0.3.

I don't think I've ever wanted to have a Pkg.update() take effect in a Julia session, but if it turns out to not be horribly brittle we could consider including that in Juno.

timholy commented 7 years ago

Somehow in my experiments with Juno I never discovered that. :+1: But if it worked in 0.3, does that mean it did/does it by module replacement? If so, how does that handle a dependency chain ImageSegmentation->Images->ImageFiltering->ImageCore->Colors->ColorTypes->FixedPointNumbers when I use the C-shift-enter trick on FixedPointNumbers?

Revise's strategy is admittedly more risky because it keeps track of quite a lot of state. There are benefits, though. My favorite is that hopefully by tomorrow you'll be able (on a branch, 0.7-only) to handle method deletion. (oops, didn't want to dispatch specially on Int, can I get rid of that method now?) Will be interesting to see whether the benefits are worth the risks. Fragility at the level of editing code is not something you want.

MikeInnes commented 7 years ago

Right; normal eval will work in that case, but if you do a git-stash and you want to see how that affects packages that depend on it, you'd have to re-eval those dependent packages as well. Needing this with such a deep dependency tree seems like it will be unusual for most folks, but I can certainly see how it would come in handy.

Revise definitely has the potential to be more general in that sense; method deletion is awesome, and working across your whole package setup could be extremely powerful. As you've found, this is tricky to do well, and I'm a bit wary of the many traps involved when automatically executing code. But if Revise can push the power/robustness Pareto front forward, that could become something very cool.

timholy commented 7 years ago

Love how random discussions in the Julia community bring up Pareto fronts.

mkborregaard commented 7 years ago

Hi friends, thanks for all the comments in this thread. While you're here, could you also comment on how we get Plots refactored as well as possible to avoid long precompile times (the topic of this issue)? I haven't been doing much on it as I'm too busy this week.

mkborregaard commented 7 years ago

It's THE most important issue for Plots. @SimonDanisch can you say - are you going to bring in some changes from MakiE that will address this, or should we think of the projects as independent and move ahead?

mkborregaard commented 7 years ago

WRT Revise, I have a branch that I keep rebased on master, which ensures that files that are @evaled also get registered with Revise. Maybe we should just merge that. I just keep julia open and use Revise, and it works great IMHO.

mkborregaard commented 7 years ago

(I didn't mean my remark about staying on topic to sound grumpy - just that I think the topic of this issue is really important)

mkborregaard commented 7 years ago

OK it did sound grumpy (apologies @pfitzseb @MikeInnes @timholy @SimonDanisch ) - here's the rub:

I agree with the need to refactor Plots to work better with precompilation. But to be honest I don't know the compilation/module/method-table mechanics of Julia well enough to see the best solution.

The question is - who should do it? We can talk about the design, but in the end, if no one volunteers to do this, it's going to land on me. So - I've been secretly hoping that enough information would turn up in this thread to allow me to implement the design. It's not there yet :-)

An even more secret hope is that someone else would take the responsibility for doing it. In my ideal world, someone like @SimonDanisch would take the really exciting work he's done for MakiE and use it to restructure Plots. But, and apologies for this, Simon, it's not clear to me whether you intend MakiE to empower/supplement Plots, or to compete with it? I can understand if you'd rather look to the future than deal with a big existing codebase. I do think, though, that the recipes, engine and backend compatibility of Plots, with some novelty that allows for better precompilation (ideally static compilation) and interactivity, has the potential to become super-awesome (it already is awesome IMHO - but even awesomer). My only obligation here is to all the contributors to Plots over the last years: that their contributions remain active and viable.

So in conclusion - I hope someone else will take responsibility for the reorg – if I have to do it everyone should know it is in fact over my head :-)

ChrisRackauckas commented 7 years ago

I want to help since I know dispatching and how to do conditional import stuff like this well, but I have no idea what the interface is for the backends. Can we get that specified?

piever commented 7 years ago

@ChrisRackauckas To be entirely honest there is not much of an interface as of now. Unless I'm missing something, if you look at, for example, the GR backend code, you'll see that a very small number of functions from Plots are overloaded there:

So basically Plots builds a Plots.Plot object and feeds it to the backend and the backend displays it. One solid initial step would be (in my view) to create a glue package with this interface which I'm assuming would help with the precompilation problem (well, we'd probably need to think about Juno and IJulia as there is some sort of conditional dependency there as well). Then one can think about formalizing the interface / allowing faster and more efficient plot updates.
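A skeleton of such a glue package might look like this (a sketch that assumes Plots defines a generic stub to extend; render_subplot! is a placeholder name, not an existing Plots function):

module PlotsGR

import Plots
import GR

# Extend the rendering hook Plots would call for this backend.
function Plots.render_subplot!(sp::Plots.Subplot{Plots.GRBackend})
    # translate the attribute dicts stored in `sp` into GR drawing calls
end

end # module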

I understand @mkborregaard's concerns as:

So my view is that it'd be ideal to actually exploit @SimonDanisch's excellent work on MakiE and use it as a new starting point (especially if it will have backward compatibility for recipes), but if that's not possible it'd be good to start moving the code around in a way that solves, if not all the interactivity issues that MakiE solves brilliantly, at least the precompilation problem.

MikeInnes commented 7 years ago

Requires won't be a good solution for this problem; indeed there's no canned solution for this, since it has come up relatively rarely in the Julia package world. I don't think there's a straightforward way to get all of full precompilation, safety against Pkg changes and low user burden at once.

My suggestion is that you look for and register available backends during Pkg.build(). Then during precompilation you can behave as if the available backends were hard dependencies and bake that code into cache. (You don't need lots of submodules for this.)
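A minimal sketch of that idea, assuming hypothetical file names and the 0.6-era Pkg API (this is not an existing Plots build script):

# deps/build.jl
candidates = ["GR", "PyPlot", "PlotlyJS", "PGFPlots"]
available  = filter(pkg -> Pkg.installed(pkg) !== nothing, candidates)  # assumes registered packages

# Write a file that Plots includes at the top level, so the imports of the
# locally available backends get baked into the precompile cache.
open(joinpath(@__DIR__, "available_backends.jl"), "w") do io
    for pkg in available
        println(io, "import ", pkg)
    end
end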

Main downside is that Pkg changes will require a Pkg.build("Plots"). If you Pkg.rm() a backend then it will fail (you can probably wrap the errors, and for naive users this is less common anyway). Likewise, if you Pkg.add() a backend Plots won't know about it, but again you can wrap the errors and give appropriate feedback on how to resolve this.

Just one idea, but hopefully that helps and makes up for the derailing :)

ChrisRackauckas commented 7 years ago

My suggestion is that you look for and register available backends during Pkg.build(). Then during precompilation you can behave as if the available backends were hard dependencies and bake that code into cache. (You don't need lots of submodules for this.)

This then gets rid of the lazy loading, but has the downside that you can have incompatibilities between binaries which is what @tbreloff mentions was the original reason for doing it:

https://github.com/JuliaLang/Juleps/issues/2

But separating into a different package and using an overload + optional-arg global backend choice will compile correctly and lazy load (err... load on using PlotsGR), with only a little bit of user burden. I don't think it's bad at all?

timholy commented 7 years ago

(I didn't mean my remark about staying on topic to sound grumpy - just that I think the topic of this issue is really important)

Didn't sound grumpy to me. It's just that I'm not sure there's a good solution to the compile-time problem right now. I brought up the whole Revise thing because I think that keeping the same session running is IMO the best answer we have now, and likely to stay that way until at least Julia 1.1.

To provide a little more detail: I suspect that the best hope for reducing time-to-first-plot would be through a well-chosen set of precompile statements. But these have pretty serious limitations: for it to make a difference, IIUC you have to "own" all the types and functions that are arguments to that call in your package. So saying precompile(push!, Tuple{Vector{Foo}, Foo}) won't help even if your package defines Foo, because Base owns both push! and Vector. I could be wrong about this, but that's my current understanding. So precompilation is great for decreasing load time through reducing the amount of parsing that's needed, but it generally doesn't help with JIT-compilation.
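As a concrete illustration of that ownership rule (Foo, process and MyPlots are made-up names):

module MyPlots

struct Foo
    x::Float64
end

process(v::Vector{Foo}) = sum(f.x for f in v)

# Helps: this package owns both `process` and `Foo`, so the inference result
# can be cached in this package's *.ji file.
precompile(process, (Vector{Foo},))

# Generally does not help, per the limitation described above: Base owns
# both `push!` and `Vector`, so this specialization is not stored here.
precompile(push!, (Vector{Foo}, Foo))

end # module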

With regards to optional dependencies, other people here have many more insights than I do, so I won't even venture a comment. Looking back again at the OP, I see that's your main concern in this issue, so apologies for derailing it.

timholy commented 7 years ago

Oh, one thought (sorry to still be at "reduce time-to-first-plot"): https://discourse.julialang.org/t/does-compile-time-depend-on-type-stability/6548/5?u=tim.holy

MikeInnes commented 7 years ago

you can have incompatibilities between binaries which is what @tbreloff mentions was the original reason for doing it

Honestly that seems like it counts as a bug in those packages to me. If loading a certain combination of packages breaks things then issues are inevitable whether or not Plots works around it now.

ChrisRackauckas commented 6 years ago

The issue seems to run pretty deep. For me, normal is:

julia> tic(); using Plots; toc()
elapsed time: 5.966643648 seconds
5.966643648

julia> tic(); p = plot(rand(10,10)); toc()
elapsed time: 9.649117342 seconds
9.649117342

julia> tic(); display(p); toc()
elapsed time: 5.783277532 seconds
5.783277532

adding gr() to the bottom of the module does:

julia> tic(); using Plots; toc()
WARNING: using Plots.GR in module Main conflicts with an existing identifier.
elapsed time: 9.443139341 seconds
9.443139341

julia> tic(); p = plot(rand(10,10)); toc()
elapsed time: 9.0992447 seconds
9.0992447

julia> tic(); display(p); toc()
elapsed time: 4.139666058 seconds
4.139666058

adding gr(); plot(rand(10,10)) increases the using time a lot but still doesn't really affect the plot time. It seems that what @timholy is saying is causing a huge problem here. Unless precompilation drastically changes, I am not sure this'll do well without a full PackageCompiler.jl approach, but that seems fundamentally incompatible with how Plots.jl works, since it requires things to be inferred and not have globals. Plots.jl's uninferred dictionary approach just won't work with that.

tl;dr: I don't think a re-org would give the change that I hoped for, and a full-blown change (Makie.jl) is required unless there's a big upstream change to the way precompilation works.

jandehaan commented 6 years ago

I found a way to lazily load a submodule on demand

In file /MyPackage/src/MyPackage.jl

__precompile__(true)
module MyPackage
...
# Allow Julia to find modules in the extensions/src directory  (or any other directory) 
function __init__()
    local path  = joinpath(Pkg.dir(), "MyPackage", "extensions", "src")
    path in LOAD_PATH || push!(LOAD_PATH, path)
end
...
# No reference to MyPackageSubA whatsoever
...
end #module

In file /MyPackage/extensions/src/MyPackageSubA.jl

__precompile__(true)
module MyPackageSubA
using MyPackage
...
export foo
foo() = println("Module MyPackageSubA loaded successfully.")
...
end #module

julia> using MyPackage
julia> foo()
UndefVarError: foo not defined

julia> using MyPackageSubA
julia> foo()
Module MyPackageSubA loaded successfully.

Julia currently cannot handle using MyPackage.SubA. It doesn't like the dot. It would be nice if Julia would do the following when it encounters using MyPackage.SubA

if !isdefined(Main, :MyPackage)
    using MyPackage
end

let
    local path  = joinpath(Pkg.dir(), "MyPackage", "extensions", "src")
    path in LOAD_PATH || push!(LOAD_PATH, path)
end

using SubA
# using MyPackage.SubA would be nicer, if this means that I can 
# refer to unexported variables in SubA as follows
MyPackage.SubA.bar

So far I haven't encountered any gotchas.

ChrisRackauckas commented 6 years ago

Does that reduce the startup time? I'm not sure that can be AOT compiled if it has to change global load paths.

jandehaan commented 6 years ago

@ChrisRackauckas I made the following changes to my code above (changed the let statement to function __init__() and added __precompile__(true) to both modules). Both modules precompile fine and the function foo works as expected. It may be worth trying this approach with more realistic code.

jheinen commented 6 years ago

I don't really understand the time to first plot results:

@time using Plots
@time pyplot(show=true)
@time plot(rand(10,10))
  6.816656 seconds (6.19 M allocations: 349.942 MiB, 3.78% gc time)
  6.892494 seconds (4.57 M allocations: 245.328 MiB, 1.28% gc time)
 14.941858 seconds (9.96 M allocations: 526.248 MiB, 1.28% gc time)

vs.

@time using Plots
@time gr(show=true)
@time plot(rand(10,10))
  6.827028 seconds (6.19 M allocations: 349.926 MiB, 3.87% gc time)
  1.760695 seconds (1.76 M allocations: 93.075 MiB, 1.39% gc time)
 12.434105 seconds (10.41 M allocations: 537.749 MiB, 1.55% gc time)

Here are the "pure" GR results:

@time using GR
@time plot(rand(10,10))
  0.073958 seconds (34.38 k allocations: 2.758 MiB)
  2.986878 seconds (2.32 M allocations: 123.543 MiB, 7.26% gc time)

ChrisRackauckas commented 6 years ago

That's what I'm saying: the pipeline is probably the culprit, since that's really the one thing introduced there. Even when it's not doing much it still has to send dictionaries through it. We would want to AOT compile that for most types, which isn't possible with something like PackageCompiler.jl given the things Simon says in the README about globals and uninferred code.

pkofod commented 6 years ago

So really, the focus should be on moving to a more type-oriented design for plot attributes (are named tuples inferrable here?)? I appreciate the other efforts being made in the ecosystem, but it would be sad to have all of Plots be for nothing, all due to the time it takes to get to the first plot.