Open timelyportfolio opened 6 years ago
Hi there! I'd be keen to contribute to this project with you. G2 looks really impressive, so I agree with you that having an R API would be useful. I'll have to think about some interesting ways to do this as well, so hopefully I can come back with something to share with you later in the day. Cheers
@mdequeljoe, great to hear. I just pushed quite a few changes, so now this is a package. I'll try to wrap as an htmlwidget over the next couple of days to really deeply explore some of our API options.
OK, I had a look at trying out R6 as an interface to G2 here: https://github.com/mdequeljoe/rg2 via a very minimal package (it binds only a couple of geoms/attributes from G2). The basic idea was to have some friendly way of adding to a fixed list structure like in your sketches and then pass any inclusion forward. Maybe you already had another approach in mind, but I would be glad in any case to hear your thoughts on this possibility!
@mdequeljoe, thanks so much for this poc. I would like to work through all the various API options/approaches in a manner similar to this, so this is extremely helpful. Not sure how best to communicate through this. Maybe I should set up an organization, so we can house all the proofs-of-concept in one place. Would you be ok with that?
I love R6, but I would say in general most htmlwidgets do not use this pattern. That does not make it a no-go, but is certainly something we should consider. I would say that + or %>% is the dominant approach. Our old rCharts used R5, so we have a pretty good reference point for additional analysis of cost/benefits of the object approach.
Thanks for your response - that sounds good for the organisation. That's true, I haven't seen another htmlwidgets package that uses R6, but my initial reaction was to try this way since I thought it may serve as a convenient way of adding to a chart object, similar to how one would do that in JavaScript. I had in mind something like:
g <- g2$new()
g$config()
g$add_source(iris)
g$point()$position("Sepal.Width", "Sepal.Length")$color('Species')
g$render()
Also worth considering whether it would be helpful in handling attributes that would be applied to all the appended geoms (or in dealing with facets):
g <- g2$new()
g$config()
g$add_source(iris)
#position and colour applied to both point and line
g$position("Sepal.Width", "Sepal.Length")$color('Species')
g$point()
g$line()
g$render()
That being said, this was just my first take of the situation and I also like the idea of using pipes in general.
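For illustration, here is a minimal sketch of how the chart-level attributes described above might be stored in an R6 object. The class name `G2Sketch`, the `spec` field, and every method here are hypothetical and are not taken from rg2 or G2; the object only accumulates a plain list that a widget could later serialise. It assumes the R6 package is installed.

```r
library(R6)

# Hypothetical class: it only collects options in a list and never calls G2.
G2Sketch <- R6Class("G2Sketch",
  public = list(
    spec = NULL,
    initialize = function(data = NULL) {
      self$spec <- list(data = data, geoms = list())
      invisible(self)
    },
    # Chart-level attributes, shared by every geom added to the chart.
    position = function(x, y) {
      self$spec$position <- c(x, y)
      invisible(self)
    },
    color = function(field) {
      self$spec$color <- field
      invisible(self)
    },
    point = function() private$add_geom("point"),
    line = function() private$add_geom("line")
  ),
  private = list(
    add_geom = function(type) {
      n <- length(self$spec$geoms)
      self$spec$geoms[[n + 1]] <- list(type = type)
      invisible(self)  # returning self is what allows $-chaining
    }
  )
)

g <- G2Sketch$new(iris)
g$position("Sepal.Width", "Sepal.Length")$color("Species")
g$point()
g$line()
str(g$spec[c("position", "color", "geoms")])
```

Because position and colour live on the chart rather than on a geom, a renderer could apply them to both the point and the line layer.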
@timelyportfolio thinking about R6 vs pipes - one argument in favour of R6 would be that one wouldn't have to worry about any namespace collisions, which would allow for matching the G2 JavaScript API since a lot of the methods have common names (e.g. line, source), if this is in fact the desirable direction to go in. Although, as I said before, pipes work too if that is the direction you're leaning towards.
For instance - based on your recent sketches (and comments about ggplot2) I guess these are more or less the same:
g2_chart(data = mtcars, x = hp, y = mpg) %>%
  g2_point() %>%
  g2_path()

G2$new(source = mtcars, x = hp, y = mpg)$
  point()$
  line()$
  render() ## not really necessary - could be set as the G2 print method instead
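As a rough sketch of what the pipe-style functions could look like underneath: each one takes a chart spec, returns a modified copy, and leaves its input alone. The function bodies and the `g2_spec` class are assumptions made for illustration, column names are passed as strings to sidestep non-standard evaluation, and the base pipe `|>` (R 4.1 or later) stands in for `%>%`, which would read the same way here.

```r
# Hypothetical helpers: they build a plain list and do not call G2.
g2_chart <- function(data, x, y) {
  structure(
    list(data = data, x = x, y = y, geoms = character()),
    class = "g2_spec"
  )
}

# Shared helper: append one geom name and return the modified copy.
g2_geom <- function(chart, type) {
  chart$geoms <- c(chart$geoms, type)
  chart
}

g2_point <- function(chart) g2_geom(chart, "point")
g2_path  <- function(chart) g2_geom(chart, "path")

base <- g2_chart(mtcars, x = "hp", y = "mpg")
p <- base |>
  g2_point() |>
  g2_path()

p$geoms             # "point" "path"
length(base$geoms)  # 0: `base` itself is unchanged
```

The copy semantics are the main practical difference from the R6 version, where the same calls would modify the object in place.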
@timelyportfolio to quickly come back to this again - thinking about the option to align with the ggplot2 syntax style - using + in place of %>% could be a good idea, no?
I am really struggling with this one. I see the benefits of R6 and personally like the approach, but I hesitate due to my opinion that most R users are not familiar with construction using $. It seems like we shouldn't fight convention from other packages and htmlwidgets, but I'm not sure I have enough context to say what truly is convention. On the htmlwidgets side, I can say that %>% is the dominant approach.
%>% vs +
@mdequeljoe, normally I would say + makes sense, but I was guided by the ggvis decision to use %>% and comments by Hadley suggesting that if the pipe had existed he would have used it instead for ggplot2. Beyond this consideration, in my workflows on a daily basis, I find myself trying to use %>% instead of + in a pipeline with a plot, and that often results in an error on first run. Benefits of + are that it is easier to type and consistent with ggplot2.
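To make the trade-off concrete, here is a small sketch showing that `+` and a pipe-friendly function can sit on top of the same code, so the choice is mostly about syntax. The classes `g2_spec` and `g2_layer` and the helpers are invented for this example; `+` is supported through an ordinary S3 method, and the base pipe `|>` (R 4.1 or later) stands in for `%>%`.

```r
# Hypothetical constructors for a chart spec and a layer.
new_spec <- function(data) {
  structure(list(data = data, layers = list()), class = "g2_spec")
}
layer <- function(type) {
  structure(list(type = type), class = "g2_layer")
}

# S3 method for `+`: append the right-hand layer to the left-hand spec.
`+.g2_spec` <- function(e1, e2) {
  stopifnot(inherits(e2, "g2_layer"))
  e1$layers <- c(e1$layers, list(unclass(e2)))
  e1
}

# Pipe-friendly wrapper that reuses the same method.
add_layer <- function(spec, type) spec + layer(type)

a <- new_spec(mtcars) + layer("point") + layer("line")
b <- new_spec(mtcars) |> add_layer("point") |> add_layer("line")

identical(a, b)  # TRUE
```

One design consequence: with `+`, layers are objects built independently of the chart, whereas piped functions always need the chart as their first argument.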
I really like the idea of R6, which could provide a fresh approach to doing this type of visualization idiom in R. We'd get a single, self-contained environment enabling 0 spillover into globalenv, with both data & methods coming along for the ride. R6 also means no magrittr dependency. I'm somewhat biased as I use obj$method a lot (privately) due to Rcpp modules and rJava that I have to use to interface to libraries for work-work.
Despite my 💙 for %>% and DSLs, there's quite a bit to be said for:
vis <- Vis$new( … params … )
vis$add_a_param( … params … )
vis$more_params( … params … )
vis$render()
js <- vis$to_js()
vis$serialize("myfile.vis")
where you don't have to follow a block of chained %>%, each vis$… is standalone (and more debug-able). The ones that "change" things mutate in-place, and, depending on how well this is eventually designed, would support folks making superclasses to extend functionality, i.e. we could get ggproto-like functionality "for free" if designed well.
Some of those features can be partially mimicked in %>%-land, but they're forced in R6 land.
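A short sketch of the two R6 properties mentioned above, in-place mutation and extension through inheritance. The `Vis` class below is a stand-in written for this example (it is not the API sketched in the comment), and `ThemedVis`, `set()`, and `to_list()` are hypothetical names. It assumes the R6 package is installed.

```r
library(R6)

# Hypothetical base class holding a list of parameters.
Vis <- R6Class("Vis",
  public = list(
    params = NULL,
    initialize = function(...) {
      self$params <- list(...)
      invisible(self)
    },
    set = function(name, value) {
      self$params[[name]] <- value
      invisible(self)
    },
    to_list = function() self$params
  )
)

# An extension can override a method without touching the base class.
ThemedVis <- R6Class("ThemedVis",
  inherit = Vis,
  public = list(
    to_list = function() {
      out <- super$to_list()
      if (is.null(out$theme)) out$theme <- "dark"  # default added by the subclass
      out
    }
  )
)

vis <- ThemedVis$new(type = "point")
alias <- vis          # a second name for the same object, not a copy
alias$set("size", 3)  # mutates in place, so `vis` sees the change too
str(vis$to_list())
```

The reference semantics cut both ways: each call can be run and inspected on its own, but a copy has to be requested explicitly with `vis$clone()`.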
While in some ways an obj$method API would be simpler and not unfamiliar to users of more typically object-oriented languages, the one objection to implementations of it I've seen thus far in R is that the documentation of it can be nigh unusable. At best, they end up with something like this and giant vignettes, which are both slow to navigate and not well indexed.
I suspect there's a better way to document object methods, but I'm not sure how feasible it is within the current R package spec. I'd be very happy to be proved wrong, though.
@alistaire47 and @hrbrmstr thanks so much for your comments. I am sure it would be a maintenance nightmare, but I am wondering if we could structure with R6 and then provide accessible functions for editing and manipulation for users who favor that approach. I worry, though, beyond maintenance, about how much confusion the dueling parallel approaches might cause.
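One way the "R6 core plus accessible functions" idea might be kept maintainable is to generate the function layer from the method names instead of writing each wrapper by hand. This is only a sketch: `Chart`, `wrap_method()`, and the `g2_*` names are hypothetical, each wrapper works on a clone so that piped code does not modify its input, and the base pipe `|>` (R 4.1 or later) stands in for `%>%`. It assumes the R6 package is installed.

```r
library(R6)

# Hypothetical R6 core with two geom methods.
Chart <- R6Class("Chart",
  public = list(
    geoms = character(),
    point = function() private$add("point"),
    line  = function() private$add("line")
  ),
  private = list(
    add = function(type) {
      self$geoms <- c(self$geoms, type)
      invisible(self)
    }
  )
)

# Build a pipe-friendly function from a method name.
# The wrapper clones first, so the chart passed in is left untouched.
wrap_method <- function(method) {
  force(method)
  function(chart, ...) chart$clone()[[method]](...)
}

g2_point <- wrap_method("point")
g2_line  <- wrap_method("line")

base  <- Chart$new()
piped <- base |> g2_point() |> g2_line()

piped$geoms         # "point" "line"
length(base$geoms)  # 0
```

Generating the wrappers would not remove the documentation burden of two parallel interfaces, but it would keep their behaviour from drifting apart.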
From a user point of view, I would prefer a %>% approach as it tends to provide a clearer and more uniform syntax. One drawback, as already mentioned, is that functions should have verb-like semantics (add_points, etc.) like in leaflet for example, and there can be namespace collision problems. g2_add_points may be a bit long for a function name.
Please note that I have no experience with R6 so I'm unable to see what advantages an object approach would yield in terms of development.
While the tidyverse has adopted verb-like semantics, all of R's graphics systems predate this and do not fit the approach. I would prefer shorter consistent functions rather than trying to squeeze in a verb, especially for the geoms, such as point, line, etc. In my mind + implies add and %>% implies add/modify, so unless we are attempting a different verb, I am ok without the verb.
Just an opinion though, please keep the conversation and comments flowing. Thanks!
If the %>% and tidyverse promote the use of verb-like functions, I wonder what the current g2 implementation is like? I've looked at the website linked in the project, but I'm not so hot in (what I believe to be) Chinese. Is there someone with better linguistic skills that might be able to shed light on how the g2 package is natively designed? Would that be useful to inform this decision?
@alistaire47 fair point about the R6 documentation. This would probably be something to look into further, although one would hope that the official G2 reference docs would be of some use, even to the R user. This may in part depend on how closely an eventual R API would resemble the JS one.
@hrbrmstr I like this idea of allowing for extensions and it looks like this is provided for certain aspects such as in G2.shape (https://antv.alipay.com/zh-cn/g2/3.x/api/shape.html) although for other aspects such as facet extensions this would seem to make more sense to do this on the JS side.
@timelyportfolio I had a recent thought that in any case it could probably make sense to allow for a convenient way to use the JS api directly given that this is relatively easy to learn and use. Maybe something like?
g <- g2$new(animate = FALSE)$source(iris)
g$script(
"
//optional to initiate a new chart and source data
chart
.point()
.position('Sepal.Width*Sepal.Length')
.color('Species');
"
)
g$render()
I'd like to echo @juba's comment. From a user perspective, I prefer %>% over + for consistency with the tidyverse (plus, I never liked switching to + in a chained operation that included both data manipulation and a ggplot2 call, which happens very regularly).
I don't think verb-like functions (in conjunction with %>%) are mandatory. The keras package also uses nouns, and I think it works quite well (side note: I also like how the pipe successively updates the main object, maybe that would be some in-between solution?).
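The in-between option mentioned in the side note, piped functions that update one object in place, can be sketched with a plain environment. None of these names come from keras or G2; they are placeholders, and the base pipe `|>` (R 4.1 or later) stands in for `%>%`.

```r
# Hypothetical chart object backed by an environment (reference semantics).
new_chart <- function() {
  chart <- new.env()
  chart$geoms <- character()
  chart
}

# Each function modifies the environment and returns it invisibly,
# so calls can be piped without reassigning the result.
add_geom <- function(chart, type) {
  chart$geoms <- c(chart$geoms, type)
  invisible(chart)
}
add_point <- function(chart) add_geom(chart, "point")
add_line  <- function(chart) add_geom(chart, "line")

chart <- new_chart()
chart |> add_point() |> add_line()

chart$geoms  # "point" "line", although `chart` was never reassigned
```

This keeps the pipe syntax while behaving like the R6 version, at the cost of surprising anyone who expects piped functions to leave their input unchanged.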
Some code to start working through the integration. This is ugly and very hacky. Of course the ultimate objective is to write a full-featured htmlwidget that is usable for an R user with no knowledge of JavaScript. See sketch