Open rdboyes opened 2 months ago
Can you give an example of how to call aes with a calculated column? I'm trying to make sense of the function code but having trouble.
Is the goal to have something like @aes(x = sqrt(variable))?
Yes - e.g. this example from the R ggplot docs:
so the ideal would be to have operators and functions work, like:
ggplot(penguins) + geom_point(@aes(x = bill_length_mm / 10, y = sqrt(bill_depth_mm)))
Sorry I have been so slow on this one. If you'd like to work on it, I can outline what I think is the best way to do it in a reply to this message. Otherwise, I'm hoping to get to it soon-ish.
The output needs to be in the form of a "column transformation". The way I've set up the draw code to run the calculations, that looks like an entry in a key => value Dict with the following parts:
:x
[:x, :y]
function sort_by_fn(target::Symbol, source::Vector{Symbol}, data::DataFrame)
    # Row order that sorts the data by the second source column
    perm = sortperm(data[!, source[2]])
    return Dict{Symbol, PlottableData}(
        target => PlottableData(
            data[perm, source[1]],  # first source column, reordered by perm
            identity,
            nothing,
            nothing
        )
    )
end
sort_by = AesTransform(sort_by_fn)
I know the complexity is weirdly high here, and I'm open to any rewrite suggestions. This was the simplest structure I could come up with that was also flexible enough to encode all of the possible aes requirements (look for example at geom_contour).
Does your code currently call TidierData?
My suggestion would be to pass the transformation through TidierData.@mutate. That way, @aes would support autovectorization, interpolation, and all the other goodies.
It does - what would mutate return, though? Would it be DataFrames syntax? We still have to store that and run it later, since when the aes is called, we don't necessarily have access to the DataFrame that we're referencing
That was going to be my question. Isn't the underlying data part of the ggplot struct?
If it was, I was thinking we could create a temporary column containing the mutated result and then reference that column.
That was how I was originally doing it, but some of the Makie plots require inputs that can't be stored as a DataFrame column (e.g. geom_contour requires a Matrix as input).
Is there an existing way to have @mutate return the code but not run it?
Would wrapping it inside of an anonymous function work?
For example, df -> @mutate(df, TidierPlots_x = mpg / 2)? Here, df represents the data frame and not a column per se.
I'm still trying to wrap my head around why we couldn't modify the struct in place rather than defer the evaluation. I haven't tried it, so you definitely have a better understanding.
It's because in a call like this:
ggplot(penguins) +
    geom_point(@aes(bill_length_mm/10, bill_depth_mm))
the geom_point(@aes(bill_length_mm/10, bill_depth_mm)) part evaluates to a Geom object with no data in it. It only inherits the data later, when you go to draw it, so there's nothing for the mutate function to immediately act on.
But wrapping it in an anonymous function could work, yes
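To make the deferral idea concrete, here is a minimal sketch in plain Julia. It is not TidierPlots' actual implementation: DeferredGeom and resolve are hypothetical names invented for illustration, and a NamedTuple of vectors stands in for the DataFrame. Each aesthetic is stored as a function of the table, so nothing is computed until the data is known.

```julia
# Hypothetical stand-in for a geom: it stores deferred aesthetic
# calculations but holds no data. Not TidierPlots' actual Geom type.
struct DeferredGeom
    aes::Dict{Symbol, Function}
end

# Each aesthetic is a function of the table, so nothing runs here.
geom = DeferredGeom(Dict{Symbol, Function}(
    :x => (tbl -> tbl.bill_length_mm ./ 10),
    :y => (tbl -> sqrt.(tbl.bill_depth_mm)),
))

# Hypothetical draw-time step: apply every stored function to the data.
resolve(g::DeferredGeom, tbl) = Dict(k => f(tbl) for (k, f) in g.aes)

# A NamedTuple of vectors stands in for the DataFrame (assumption).
penguins = (bill_length_mm = [39.1, 46.5], bill_depth_mm = [16.0, 25.0])
cols = resolve(geom, penguins)
```

Because the stored functions return arbitrary values, the same pattern would also allow a result that is not a column, such as a Matrix.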
Got it. Another option would be to wrap it inside of an expression. We would interpolate the expression into @mutate when ready to draw the plot.
For example, mpg / 2 would become :(mpg / 2).
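A rough sketch of the expression approach, using only base Julia rather than @mutate: the expression is stored unevaluated, and at draw time bare column names are rewritten into lookups on the table before the code is evaluated. The helper substitute_columns is hypothetical, a NamedTuple stands in for the DataFrame, and this does none of the autovectorization that TidierData provides.

```julia
# Hypothetical helper: rewrite bare column names in an expression
# into getproperty lookups on a table argument named `tbl`.
substitute_columns(ex, cols) = ex  # literals pass through unchanged
substitute_columns(ex::Symbol, cols) =
    ex in cols ? Expr(:call, :getproperty, :tbl, QuoteNode(ex)) : ex
function substitute_columns(ex::Expr, cols)
    # Leave the function name of a call alone; rewrite only its arguments.
    start = ex.head == :call ? 2 : 1
    args = [i < start ? a : substitute_columns(a, cols)
            for (i, a) in enumerate(ex.args)]
    return Expr(ex.head, args...)
end

stored = :(mpg / 2)  # captured when the aes is declared; nothing runs yet

# At draw time the data, and therefore the column names, are known.
mtcars = (mpg = [21.0, 22.8, 18.7],)  # stand-in for a DataFrame (assumption)
body = substitute_columns(stored, keys(mtcars))
f = eval(:(tbl -> $body))
result = Base.invokelatest(f, mtcars)
```

Note that this sketch relies on eval, so it shares the usual caveats of evaluating generated code at runtime.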
I've made progress on this - I was removing some of the "deep type piracy" and made it easier to implement as a consequence. It's almost working, but there's a questionable eval call that I think is causing some trouble - take a look here if you get a chance: https://github.com/TidierOrg/TidierPlots.jl/blob/45a102f8f83c4b81c6a36e91970537a159602644/src/aes.jl#L44
Thanks! I will take a look at this.
0.6.3 added support for calculated aesthetics in the function aes - need to extend this feature to the macro @aes.