JuliaStats / Clustering.jl

A Julia package for data clustering

Plot recipe PR? #90

Closed kescobo closed 6 years ago

kescobo commented 7 years ago

I'm in the process of writing a user recipe for Plots.jl to enable plotting of Hclust (see here). Generally, it makes sense to have plotting recipes live in the package that generates the object, but that would require accepting RecipesBase.jl as a dependency.

Before I get too far into the development I was wondering if this would be a PR that you would be willing to take on.

ararslan commented 7 years ago

This is an opinion I've voiced in another package, but I don't think we should add dependencies on any particular plotting stack to any of the stats packages. While I can empathize with wanting to have a built-in way to plot objects from various packages, not everyone uses Plots for plotting, and I don't think we should be opinionated about which plotting stack our users should use. If/when optional dependencies become a thing, we can absolutely add conditional logic for having Plots or Gadfly or whatever installed, but in the meantime I think plotting code should live in a separate package.

kescobo commented 7 years ago

@ararslan That makes sense. Is there any reason to leave the issue open for more discussion, or should I just close it?

daschw commented 7 years ago

RecipesBase is a really lightweight dependency, though, and no user would be forced or urged to use or install Plots. But users who have Plots and want to use it would get nice built-in recipes.

mkborregaard commented 7 years ago

@ararslan I agree with your view on not attaching to any particular plotting package. In fact that's exactly what "recipes" are designed to address.

Because Julia encourages package designers to create new types, and because of multiple dispatch, it is very much in the spirit of Julia to be able to define plotting methods that dispatch on user types. However, the atomised plotting package ecosystem makes this difficult. If any package chooses to import a plotting package, such as Gadfly or Plots, this will create problems for any downstream packages, e.g. potential conflicts over the name plot.

The solution is for packages to not define plotting methods and not depend on plotting packages, but instead to define a set of "rules" or signals for how to plot a type (this is a line, these are points, etc.) that can then, in principle, be used by any plotting package to plot the type.

This is what a recipe is. Recipes are designed explicitly to not require plotting packages and to not generate conflicts that force a particular plotting package on the user. Plots' real power is actually that it is software that translates those signals into commands that can be passed to the plotting package preferred by the user - we have translation rules (called backends) for essentially all plotting packages that want to have this.
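As a rough illustration of the idea described above (not code from this thread), a recipe for a made-up type might look like the sketch below. It assumes RecipesBase is installed; ToyTree and its fields are hypothetical names invented for the example and are not the Hclust type from Clustering.jl.

```julia
using RecipesBase

# Hypothetical stand-in for a clustering result; this is NOT the real Hclust type.
struct ToyTree
    xs::Vector{Float64}   # x-coordinates of the branch path
    ys::Vector{Float64}   # y-coordinates (e.g. merge heights)
end

# The recipe only describes how the type maps to plottable data and defaults.
# It never calls a plotting function itself.
@recipe function f(t::ToyTree)
    seriestype --> :path    # "this is a line"; a default the caller can override
    legend --> false        # another overridable default
    t.xs, t.ys              # the data handed to whatever consumes the recipe
end

tree = ToyTree([1.0, 1.0, 2.0, 2.0], [0.0, 0.5, 0.5, 0.0])
```

The defining package depends only on RecipesBase here. A user who has also loaded Plots should then be able to call plot(tree) and have the recipe applied, while a user who never loads Plots is unaffected.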

The current idea - to have interface packages such as GadflyClustering, PyPlotClustering, GRClustering, PlotlyJSClustering, GLVisualizeClustering, PGFPlotsClustering, WinstonClustering, BokehClustering, and then GadflyConvexHull, PyPlotConvexHull, etc. - runs a real risk of atomizing the plotting ecosystem completely. Many of these packages are likely never to exist, leaving users with the problem of getting GadflyClustering and PlotlyJSConvexHull to play together if they need this combined functionality. I don't think the consequence is very likely to be increased user freedom when it comes to plotting packages.

RecipesBase is designed to be stable, extremely lightweight in terms of code, dependencies and namespace, and to never introduce a conflict in a package that limits a user's or a three-dependencies-down package developer's free choice of plotting package. In fact, it defines a uniform interface for people to provide general plotting methods for user-defined types. To me, this is the core philosophy of Julian design.

ararslan commented 7 years ago

I understand the motivation behind Recipes; my reservation is that Plots is the only package that uses it, so it is effectively Plots-specific even if it doesn't intend to be.

mkborregaard commented 7 years ago

I'd be very interested to hear an alternative solution to this issue that would not involve having some package that could translate plotting instructions into plot package calls.

I definitely do see your point, of course. I've just been thinking really hard about this specific problem for a long time, and I think this is an elegant solution to a hard problem.

mkborregaard commented 7 years ago

I'm guessing this can be closed? @ararslan I hope you see my intentions are good here - I think this is an important concern for the Julia ecosystem. It's just important to me that key people like you and the rest of JuliaStats are aware (which you seem to be) that the RecipesBase system takes a lot of care to not conflict with other plotting packages.

Possibly a solution to the overall issue would be to have a package like StatPlots define recipes for every package in the JuliaStats organisation at once (though this will mean the user will have to load all these packages to do a plot, which seems like something that may cause problems).

And I should say I'm very grateful that you've decided not to associate the stats packages with a plotting package that causes conflicts. I definitely agree open choice for the user is the way to go.

kescobo commented 7 years ago

I'm ok with closing this issue - should I be the one to do it?

nalimilan commented 7 years ago

Too bad that we don't have optional dependencies yet. Let's hope they will be there in time for Julia 1.0.

ChrisRackauckas commented 6 years ago

Is there a glue package with the recipe somewhere? The package is oddly incomplete without it.

kescobo commented 6 years ago

@ChrisRackauckas I hacked something together in Microbiome.jl; you can find it here.