Open KronosTheLate opened 2 years ago
I was convinced there was already an issue for this but could not find it. It could make sense to also rename data
as part of the breaking changes discussed at #345. Maybe with(df)
could make sense? It is analogous to DataFramesMeta.@with
, and with
is probably less likely to be a variable.
Agreed. I actually think that with is a pretty great alternative.
I can also think of the very explicit, but a little long, 'with_data' or 'withdata'. I am not sure it is better than 'with', but wanted to throw it out there.
Yes this was reported before in https://github.com/JuliaPlots/AlgebraOfGraphics.jl/issues/207
I think table(...)
would fit well, but it's more likely than with
to be a variable. On the other hand I would not be surprised if a with
function is defined by packages to be used with the do
block syntax.
Another option: since data
is rather redundant (df
is of course the data) it could be left out entirely:
plt = df * mapping(:a, :b) * visual(Lines)
So AoG would define Base.:*(df, l::Layer) = data(df) * l
, etc. and data
would be unexported.
I confess I would be a little hesitant to overload that method, but I was wondering whether another option could be to make Layer
callable, so that one would do eg
df |> mapping(:a, :b) * lines |> draw
I confess I would be a little hesitant to overload that method, but I was wondering whether another option could be to make
Layer
callable, so that one would do egdf |> mapping(:a, :b) * lines |> draw
This is powerful, expressive and concise! I also prefer this API.
Would this work in every DataFrame
or just the ones named exclusively df
?
I still have to read through https://github.com/JuliaPlots/AlgebraOfGraphics.jl/discussions/345 but to me, it's confusing that you can draw a plot without data
(pregrouped) by supplying the data directly to the mapping
.
Therefore, I'd think either that functionality should move out of mapping
for example by doing data(pregrouped_vectors) * mapping(1, 2)
for specifying the vectors via indices. Or the data functionality could move inside of mapping
because a mapping only really works by referencing a data source right? So to me it feels natural to pull the two together.
mapping(:x, :y, data = df) * visual(...)
@piever for the overloading of *
do you mean because it feels weird? (From a type piracy point of view it's legal as long as AoG owns one of the *
argument types...)
Anyway I love the pipe version! I was first hesitant because currently the datasets are regular players in the algebra, for example we can do
(data(df1)*mapping(:a, :b) + data(df2)*mapping(:x, :y)) * visual(markersize=10)
but actually it also works with pipes:
((df1 |> mapping(:a, :b)) + (df2 |> mapping(:x, :y))) * visual(markersize=10)
(This takes a few more parentheses but for such a rare use case it's no big deal.)
I think the semantics of piping are a great fit here, and callable layers make sense even without pipes:
line_plot = mapping(:a, :b) * visual(Lines, linewidth=3)
plt = line_plot(df) # Can be called on different datasets
If the pipe solution is adopted, it might be nice to add a "curried" version of draw
to allow things like
df |> mapping(:a, :b) * lines |> draw(axis=(300, 200))
@storopoli the variable name doesn't matter (it's only for macros that the variable name can make a difference. Edit: and some odd cases like f(; a)
and (; a) = x
).
for the overloading of
*
do you mean because it feels weird? (From a type piracy point of view it's legal as long as AoG owns one of the*
argument types...)
It is legal, but I suspect it is at least discouraged, in that it is a bit "punny" and bound to create method errors.
I was first hesitant because currently the datasets are regular players in the algebra
Yes, that's the reason why it was not done via piping. But, as you point out, it's still nice to work with different datasets, eg
(mapping(:a, :b)(df1) + mapping(:x, :y)(df2)) * visual(markersize=10)
And in general, I think layer1(df1) + layer2(df2)
reads reasonably well.
I wonder whether it could give rise to confusion if users wrote mapping(:a) * mapping(:b)(df)
and thought df
would only apply to mapping(:b)
. Maybe it's enough to document this and error if users try passing several datasets in the same layer, eg mapping(:a)(df1) * mapping(:b)(df2)
. These are all edgecases anyway, as I imagine the most common would be df |> layers |> draw
.
I still have to read through https://github.com/JuliaPlots/AlgebraOfGraphics.jl/discussions/345 but to me, it's confusing that you can draw a plot without
data
(pregrouped) by supplying the data directly to themapping
.
Doing pregrouped data from a dataframe will definitely be supported (whereas it isn't right now). There is some discussion in #345 (we should probably continue this particular discussion there) as to whether plotting from vectors, eg mapping(rand(10), rand(10)) |> draw
, should we supported with mapping
or with another name, but I think we should allow it for users who do not generally work with tables.
AlgebraOfGraphics.jl exports
data
. But it is one of the most common variable names, leading to frequent name clashes. I do not see what the best solution is, but as I find this troublesome, I wanted to raise the issue.