kdpsingh closed this issue 1 year ago
Hi, Prof. Singh, I'm a new user of Tidier.jl and a former user of tidyverse. I appreciate your work, which makes the "R -> Julia" migration much more convenient. I tried the main visualization tools in the Julia ecosystem, and I believe you did a much deeper exploration.
However, from a user's perspective, I think AlgebraOfGraphics.jl is more suitable to be integrated into Tidier.jl, the reasons are
Thank you @ivaquero. I appreciate your comment. I'll add a few things, having looked closely at both APIs. I'm not 100% sure we will use Gadfly, but leaning towards Gadfly for a few reasons as a starting point.
I do agree that AoG has a user-facing API out of the box that is closer to ggplot. However, if we use Gadfly as a backend, Tidier would still be exposing a ggplot-like interface to the end-user. In other words, you could think of Tidier's relationship to Gadfly as similar to the relationship between AoG and Makie (although AoG and Makie are run by the same org).
Things in favor of Gadfly:
Gadfly's plotting options seem to be a closer fit to ggplot from a standpoint of wrapping its functionality (including handling situations like faceting)
Gadfly appears to me to be a much lighter dependency than AoG
I agree that Gadfly has a lot of open issues and that its last release was a year ago, which is suggestive of less active maintenance. That said, there have been frequent commits to GitHub, and almost every issue has a reply from the maintainers. The other thing is that Gadfly doesn't yet support SnoopPrecompile to reduce time to first plot (TTFP) in Julia 1.9. I'm hoping that is ultimately addressed and plan to file an issue.
In summary, I agree with you on the advantages of AoG. But I think Gadfly also has advantages from a standpoint of wrapping its functionality to expose a ggplot-like API, so we are going to start there.
If we run into barriers or issues, we can always switch backends later.
I'd still love to hear from you and others. I have experienced occasional weird plot aspect ratios when using Gadfly, which is something I can't directly fix. I haven't played enough with all of the plotting packages to see how much this affects different packages.
What do you and other folks think with regard to the quality of resulting plots for Gadfly vs AoG/Makie?
I agree that AoG is a heavier dependency. My comment comes out of my worries about Julia packages.
I started following Julia 3 years ago. At that time, the Julia world was thriving, and development of almost every package was pretty active. But since last year, some famous packages like Turing.jl have become very quiet. Many former contributors have graduated or are busy with more important things.
Whether a bug can get fixed in time often matters more to a user.
I really appreciate and share this concern. I agree that with MakieCon happening, and with other developments, the Makie ecosystem appears to be quite active right now in comparison to others.
Wrapping AoG is also doable but considerably more complicated. Will plan to start with a minimal implementation of Gadfly as a backend, and once we get some user feedback, will decide if we need to switch.
The difficulty is that our goal isn't to put together an implementation that is merely inspired by ggplot. Rather, the goal is to more or less re-implement ggplot and retain its look and feel.
Looking again at AoG, it may be doable. Going to continue to study this issue. Stay tuned.
I think it depends on the long-term goals of the ggplot implementation with respect to the user experience. Do we see this as a "stepping stone" package that is sufficient to get people making basic plots in Julia while they get familiar with the language, after which they move on to a native plotting package that suits their needs? In this case, the right path is to wrap the package that will be easiest (Gadfly, it sounds like).
On the other hand, if the goal is long-term use with the same level of functionality as "real" ggplot, we may need to put in the extra time to wrap AoG or - ideally - Makie itself, bypassing the AoG dependency altogether, since that seems like the package that will get the most support going forward?
Just another vote for the Makie backend. As an R user who has been trying Julia for the last year or so, on and off, I ended up switching to Makie as Gadfly had too many issues when it came to anything more than standard plots, e.g. trying to set alpha values for a Geom with multiple levels.
In my view, one of the major advantages of ggplot2 is that it not only provides a great user experience for quick and simple plots, but is also hugely flexible and extensible, with its ggproto OO system allowing package developers to create extensions that integrate into the ggplot2 workflow pretty seamlessly. Julia obviously has structs and multiple dispatch built into it, but I believe Makie will allow a great number of packages to be built to extend ggplot2.jl (or whatever you call the package), as Makie itself allows you to do more things than Gadfly.
Thanks everyone. With that feedback in mind, I'm leaning towards trying to wrap AlgebraOfGraphics.jl, which uses Makie under the hood and plays nicely with the Makie ecosystem (since they wrote AoG).
Would that suffice as a starting point?
I think that's a great starting point.
I would offer to try and add some commits, but I'm afraid I don't know anything about metaprogramming in Julia (though I do with tidy-eval, and it's on my list of to-dos), and I'm still relatively new at actually programming in Julia rather than just data analysis, so it would involve a lot of hand-holding and likely wouldn't be that helpful in the development process.
That being said, please let me know if you think I could be useful in any way, as this is an exciting development to help R users transition to Julia!
Ok, just to be respectful, let me open an issue on AoG's GitHub page to make sure their team is aware and cool with this effort.
@arnold-c Regardless of the backend, my plan in the medium-term is to keep this functionality all within Tidier.jl rather than break it up into a separate package. My rationale for this:
I view Tidier.jl as a meta-package (since it wraps other packages), so it's more similar in its aspiration to the tidyverse meta-package than to any individual package (e.g. dplyr) within tidyverse.
A lot of the R feel is driven by parsing functions that work well together. If breaking into multiple packages, much of that functionality would need to be duplicated and updated together (or placed into yet one more package).
That said, if other packages start to depend on Tidier.jl (which hasn't really happened yet), then this could become a problem. I think we can revisit this at a later time.
We have already split off this functionality into a separate package (TidierPlots.jl) so closing this issue.
The plan is to wrap Gadfly.jl with some syntactic sugar.
We will implement this using macros. We will start by supporting a pipe-based approach to sequentially building a plot. May extend to support + as a future step. I already reached out to the Gadfly.jl maintainers, and they are okay with this: https://github.com/GiovineItalia/Gadfly.jl/issues/1611