TidierOrg / TidierPlots.jl

Tidier data visualization in Julia, modeled after the ggplot2 R package.
MIT License

@aes macro update to support calculated aesthetics #71

Open rdboyes opened 2 months ago

rdboyes commented 2 months ago

Version 0.6.3 added support for calculated aesthetics in the aes function. This feature needs to be extended to the @aes macro.

adknudson commented 2 months ago

Can you give an example of how to call aes with a calculated column? I'm trying to make sense of the function code but having trouble.

Is the goal to have something like @aes(x = sqrt(variable))?

rdboyes commented 2 months ago

Yes - e.g. this example from the R ggplot docs: (image)

so the ideal would be to have operators and functions work, like:


ggplot(penguins) + geom_point(@aes(x = bill_length_mm / 10, y = sqrt(bill_depth_mm)))

kdpsingh commented 2 months ago

Sorry I have been so slow on this one. If you'd like to work on it, I can outline what I think is the best way to do it in a reply to this message. Otherwise, I'm hoping to get to it soon-ish.

rdboyes commented 2 months ago

The output needs to be in the form of a "column transformation" - the way I've set up the draw code to run the calculations, that looks like an entry in a key => pair Dict with the following parts:

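# Column transformation: returns the values of source[1], reordered by the sort order of source[2]
function sort_by_fn(target::Symbol, source::Vector{Symbol}, data::DataFrame)
    # Permutation that sorts the rows by the second source column
    perm = sortperm(data[!, source[2]])

    # Store the reordered first source column under the target aesthetic
    return Dict{Symbol, PlottableData}(
        target => PlottableData(
            data[perm, source[1]],
            identity,
            nothing,
            nothing
        )
    )
end

# Wrap the function in an AesTransform
sort_by = AesTransform(sort_by_fn)
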
rdboyes commented 2 months ago

I know the complexity is weirdly high here - and I'm open to any rewrite suggestions. This was the simplest structure I could come up with that was also flexible enough to encode all of the possible aes requirements (look for example at geom_contour)

kdpsingh commented 2 months ago

Does your code currently call TidierData?

My suggestion would be to pass the transformation through TidierData.@mutate. That way, @aes would support autovectorization, interpolation, and all the other goodies.

rdboyes commented 2 months ago

It does - what would mutate return, though? Would it be DataFrames syntax? We still have to store that and run it later, since when the aes is called, we don't necessarily have access to the DataFrame that we're referencing

kdpsingh commented 2 months ago

That was going to be my question. Isn't the underlying data part of the ggplot struct?

If it was, I was thinking we could create a temporary column containing the mutated result and then reference that column.

rdboyes commented 2 months ago

That was how I was originally doing it, but some of the Makie plots require inputs that can't be stored as a DataFrame column (e.g. geom_contour requires a Matrix as input)

rdboyes commented 2 months ago

Is there an existing way to have @mutate return the code but not run it?

kdpsingh commented 2 months ago

Would wrapping it inside of an anonymous function work?

For example, df -> @mutate(df, TidierPlots_x = mpg / 2)? Here, df represents the data frame and not a column per se.

I'm still trying to wrap my head around why we couldn't modify the struct in place rather than defer the evaluation. I haven't tried it, so you definitely have a better understanding.

rdboyes commented 2 months ago

It's because in a call like this:

ggplot(penguins) + 
  geom_point(@aes(bill_length_mm/10, bill_depth_mm))

geom_point(@aes(bill_length_mm/10, bill_depth_mm)) will evaluate to a Geom object with no data in it - it will only inherit the data later, when you go to draw it. So there's nothing for the mutate function to act on immediately.

But wrapping it in an anonymous function could work, yes
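
A minimal sketch of the anonymous-function idea, using only Base Julia. Everything named here (DeferredAes, resolve, the cars data) is a hypothetical stand-in for illustration and not part of TidierPlots or TidierData; a NamedTuple of vectors plays the role of the DataFrame, and plain broadcasting plays the role of @mutate. The point is only that the aes can store a recipe and run it once the data is inherited at draw time.

```julia
# Hypothetical stand-in for an aes object; this is not a TidierPlots type.
struct DeferredAes
    transforms::Dict{Symbol, Function}   # aesthetic name => function of the data
end

# Apply every stored transform once the data is finally known (i.e. at draw time).
function resolve(aes::DeferredAes, data)
    return Dict(name => f(data) for (name, f) in aes.transforms)
end

# A NamedTuple of vectors stands in for a DataFrame here.
cars = (mpg = [21.0, 22.8, 18.7], hp = [100.0, 144.0, 169.0])

# Nothing is computed at construction time; each closure only records the recipe.
aes_spec = DeferredAes(Dict{Symbol, Function}(
    :x => (df -> df.mpg ./ 2),
    :y => (df -> sqrt.(df.hp)),
))

# Later, when the geom has inherited its data, the columns are computed.
resolved = resolve(aes_spec, cars)
```

Because the closure returns an arbitrary value, this shape would also allow a transform to produce something that is not a column (for example a Matrix), which a temporary-column approach cannot hold.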

kdpsingh commented 2 months ago

Got it. Another option would be to wrap it inside of an expression. We would interpolate the expression into @mutate when ready to draw the plot.

For example, mpg / 2 would become :(mpg / 2).
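
A rough sketch of the expression idea in Base Julia, under some loud assumptions: to_column_expr and evaluate_aes are hypothetical helpers, not TidierPlots or TidierData functions; a NamedTuple stands in for the DataFrame; and rewriting every call into broadcast is only a crude stand-in for the autovectorization that @mutate would provide. The stored expression is spliced into a function body and evaluated only when the data is available. Note that this sketch relies on eval, which comes with its own scoping and world-age caveats.

```julia
# Rewrite an expression so that column names refer to fields of `data_sym`
# and every function call is broadcast over its arguments.
function to_column_expr(ex, columns, data_sym)
    if ex isa Symbol
        # A bare symbol naming a column becomes `data.column`; anything else is left alone.
        return ex in columns ? Expr(:., data_sym, QuoteNode(ex)) : ex
    elseif ex isa Expr && ex.head == :call
        f = ex.args[1]
        args = [to_column_expr(a, columns, data_sym) for a in ex.args[2:end]]
        return Expr(:call, :broadcast, f, args...)
    else
        # Literals (numbers, strings) and other expression kinds pass through unchanged.
        return ex
    end
end

# Build `data -> <rewritten expression>` and call it on the data.
# invokelatest is used because the function was only just created by eval.
function evaluate_aes(ex, data)
    body = to_column_expr(ex, keys(data), :data)
    f = eval(:(data -> $body))
    return Base.invokelatest(f, data)
end

# Stored when the aes is created: plain quoted expressions, nothing computed yet.
half_mpg = :(mpg / 2)
root_hp = :(sqrt(hp))

# Evaluated later, once the data is known.
cars = (mpg = [21.0, 22.8, 18.7], hp = [100.0, 144.0, 169.0])
x = evaluate_aes(half_mpg, cars)
y = evaluate_aes(root_hp, cars)
```

In an actual macro, the quoting step would happen automatically, since a macro receives its arguments as unevaluated expressions.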

rdboyes commented 1 week ago

I've made progress on this - I was removing some of the "deep type piracy", which made this easier to implement as a consequence. It's almost working, but there's a questionable eval call that I think is causing some trouble - take a look here if you get a chance: https://github.com/TidierOrg/TidierPlots.jl/blob/45a102f8f83c4b81c6a36e91970537a159602644/src/aes.jl#L44

kdpsingh commented 5 days ago

Thanks! I will take a look at this.