JuliaPlots / StatsPlots.jl

Statistical plotting recipes for Plots.jl

add feature to change x-axis order for boxplots #483

Closed Tobias-Thomas closed 1 year ago

Tobias-Thomas commented 2 years ago

Fixes #409 by adding a new optional keyword argument, by, to boxplots. It works the same way as the by keyword of sort, and its default value is identity. Before merging, can someone please tell me where I can add the documentation for this new parameter? Thanks in advance.

sethaxen commented 2 years ago

@Tobias-Thomas thank you for the contribution, and I'm very sorry for the late review! I know I recommended the keyword by, but in retrospect, I think that's too generic of a name. What about something like sort_labels_by, or perhaps you have a better suggestion?

Tobias-Thomas commented 2 years ago

Oh, no problem. The only other idea I have is sort_x_by, since the argument it sorts is called x. What do you think?

indymnv commented 1 year ago

I was searching for a way to sort boxplots. Is this still unsolved, or am I wrong? Is there a way I can help fix this?

Tobias-Thomas commented 1 year ago

The fix I implemented was essentially quite easy, but as you can see above, we were still unsure about the name of this parameter, and I think we need a JuliaPlots member to decide on that.

indymnv commented 1 year ago

What about order, to stay close to seaborn? sort_labels_by would also be OK. sort_x_by could be a little confusing if horizontal boxplots are implemented later.

Tobias-Thomas commented 1 year ago

Yea, I understand that point. Out of those options, I would prefer sort_labels_by. I will push a commit to my branch later today that switches to this name, and maybe we could ping Seth again to see if he approves.

codecov-commenter commented 1 year ago

Codecov Report

Patch and project coverage have no change.

Comparison is base (90b5f4f) 26.52% compared to head (3164468) 26.52%.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master     #483   +/-   ##
=======================================
  Coverage   26.52%   26.52%
=======================================
  Files          20       20
  Lines        1248     1248
=======================================
  Hits          331      331
  Misses        917      917
```

| [Impacted Files](https://app.codecov.io/gh/JuliaPlots/StatsPlots.jl/pull/483?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPlots) | Coverage Δ | |
|---|---|---|
| [src/boxplot.jl](https://app.codecov.io/gh/JuliaPlots/StatsPlots.jl/pull/483?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaPlots#diff-c3JjL2JveHBsb3Quamw=) | `0.00% <0.00%> (ø)` | |


Tobias-Thomas commented 1 year ago

@sethaxen, I updated the name and merged the current master into the branch. Is there anything left for me to do in this PR?

jonathan-durbin commented 1 year ago

Hi, any updates on this? I'm running into this issue while trying to figure out how to sort the x axis in a boxplot.

Tobias-Thomas commented 1 year ago

I think the fix is ready to merge, but I do not know whom to ping to review the PR.

indymnv commented 1 year ago

Maybe @sethaxen can help us with this PR.

Vaeryus commented 1 year ago

So how do I use sort_labels_by now? Can I just put in an order for the x axis like: sort_labels_by= [ "B" "C" "A"] or how would I use it?

Tobias-Thomas commented 1 year ago

> So how do I use sort_labels_by now? Can I just put in an order for the x axis like: sort_labels_by= [ "B" "C" "A"] or how would I use it?

No, the keyword uses the same interface as the by keyword of Base.sort: the labels are ordered by the result of applying a function to each of them, not by an explicit list. So you need to pass a function that maps each label to a key whose natural order is the order you want.
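
To illustrate the Base.sort interface mentioned above, here is a minimal sketch that uses only Base. The labels are made up for illustration; the point is that by transforms each value into a key, and the values are then ordered by those keys.

```julia
# Made-up labels: numbers stored as strings.
labels = ["10", "2", "1"]

# Default sort compares the strings lexicographically.
lexicographic = sort(labels)

# With `by`, each label is first mapped to a key (here an Int),
# and the labels are ordered by those keys instead.
numeric = sort(labels; by = s -> parse(Int, s))

println(lexicographic)
println(numeric)
```

A function passed as sort_labels_by would play the same role as the anonymous function given to by here.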

artkel commented 11 months ago

Can someone please explain how the category sorting for boxplots now works? It is not quite clear to me from the previous comment. Would be nice to see an example. Thanks!

artkel commented 11 months ago

> Can someone please explain how the category sorting for boxplots now works? It is not quite clear to me from the previous comment. Would be nice to see an example. Thanks!

Ok, I think I got it. The example below explains the logic. It worked for me.

```julia
using StatsPlots  # provides @df and boxplot

# Map each category label to its position in the desired order
function order_label(label::String)
    order_dict = Dict("B" => 1, "C" => 2, "A" => 3, "D" => 4)
    return get(order_dict, label, 0)  # Unexpected labels get 0, so they sort first
end

# Pass this function as the sort_labels_by keyword
@df dataset_name boxplot(
    string.(:CategoricalVar),
    :NumericalVar,
    outliers=true,
    sort_labels_by=order_label,
    fillalpha=0.75,
    linewidth=1,
    legend=false
)
```
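
For anyone who would rather write the order as a plain list, as asked earlier in the thread, a key function can be built from a vector of labels. This is only a sketch: order_from is a hypothetical helper, not part of StatsPlots, and it is shown here with Base.sort rather than with a plot.

```julia
# Hypothetical helper (not part of StatsPlots): turn an explicit
# label order into a key function usable with `by`.
function order_from(ordered_labels::AbstractVector)
    # Remember the position of each listed label.
    rank = Dict(label => i for (i, label) in enumerate(ordered_labels))
    # Labels that are not listed get a key larger than any listed one,
    # so they sort after all listed labels.
    return label -> get(rank, label, length(ordered_labels) + 1)
end

key = order_from(["B", "C", "A"])
sorted_labels = sort(["A", "B", "C", "D"]; by = key)
println(sorted_labels)
```

Assuming sort_labels_by accepts any callable in the same way as the function above, the result of order_from(["B", "C", "A"]) could presumably be passed to it directly.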

It would be nice to have a similar parameter for the violin plot.

mkborregaard commented 11 months ago

Before this PR, the way to do that was to call levels! on the CategoricalArray x vector to define the sorting order. Is this really an improvement on that?
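
For reference, a minimal sketch of the levels! approach described above, assuming the CategoricalArrays.jl package is installed. The data are made up, and the sketch only shows how the level order is set; that the boxplot then follows this order is taken from the comment above, not verified here.

```julia
using CategoricalArrays

# Made-up category column.
x = categorical(["A", "B", "C", "A", "B"])

# By default the levels are in sorted order: A, B, C.
default_order = copy(levels(x))

# Reorder the levels in place to B, C, A.
levels!(x, ["B", "C", "A"])
new_order = levels(x)

println(default_order)
println(new_order)
```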