ggobi / ggally

R package that extends ggplot2
http://ggobi.github.io/ggally/

Only the first y axis displays the counts in ggpairs #259

Open rizbicki opened 6 years ago

rizbicki commented 6 years ago

I'm running the following code:

library(GGally)
data <- data.frame(x=runif(100),y=runif(100),z=runif(100))
ggpairs(data, 
    diag=list(continuous="barDiag"))

I get the following plot

[Image: the resulting ggpairs plot]

The y axis of the top-left panel is odd: it shows the histogram counts rather than the values of the variable. The y axes of the other rows, however, show the variable values and not counts. Is this behavior expected?

Thanks


schloerke commented 6 years ago

@rizbicki

Yes, this is expected behavior. It is a side effect of each plot in the left column and bottom row displaying its own axes. Because each plot can be retrieved individually, it does not make sense to arbitrarily change a plot's axes to something the plot is not displaying (such as replacing the counts with a y range). The correlation panels, however, display only text rather than data, so a grid can be added to them arbitrarily.

To get around this unexpected behavior (but at the cost of losing the histograms), you can place the axes in the diagonal.

library(GGally)                            
ggpairs(iris, 1:3, axisLabels = "internal")
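
To illustrate the point about retrieving each plot individually, here is a minimal sketch that reuses the data from the original report. It assumes that indexing a ggpairs result with `pm[i, j]` returns an ordinary ggplot object; the names `pm`, `top_left`, and `hist_data` are illustrative only.

```r
library(GGally)
library(ggplot2)

set.seed(1)  # only so the sketch is reproducible
data <- data.frame(x = runif(100), y = runif(100), z = runif(100))

pm <- ggpairs(data, diag = list(continuous = "barDiag"))

# Pull out the top-left panel on its own.
top_left <- pm[1, 1]

# Inspect what the panel actually draws: a histogram, so its
# y values are bin counts, not values of the variable x.
hist_data <- ggplot_build(top_left)$data[[1]]
range(hist_data$count)
sum(hist_data$count)  # all 100 rows fall into some bin
```

Since the standalone panel is a histogram, the counts on its y axis are the only scale that matches what is drawn, which is why the matrix shows counts in that position.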

schloerke commented 6 years ago

(Closing as it is expected behavior). Feel free to comment if you have more questions!

KunhuanLiu commented 1 year ago

Hi, I agree that the y axes should be pertinent to the graphs plotted, and this did not bother me when I used the upper triangle to show statistics. However, when I use the upper triangle to show 2D density plots or scatterplots, the first y axis causes trouble and some confusion. I also noticed that Seaborn's pairplot uses the first y axis for the y range.

I am wondering if there is any flexibility for users to add axes to the individual grid plots, if customizing for the edge y or x-axis is not possible.

schloerke commented 1 year ago

Copying in the image from Seaborn's pairplot: [image]

The top left plot has a y axis that represents the majority of the top row, but is attached to a plot that does not use that axis range. That plot uses an axis range of "count", not bill length in mm.

This feels really strange at first glance.


Seaborn's approach does provide a better axis to k-1 plots, rather than just the immediate (single) plot.

@dicook What are your thoughts on switching the default axis label in the top left and bottom right to the "top left" and "bottom right" axes of the upper triangle? (Iff there are lower triangle plots. Iff axisLabels != "internal")

[Image: Screenshot 2023-02-24 at 4 48 21 PM]

KunhuanLiu commented 1 year ago

I completely agree that using length in mm as the axis where the immediate plot is a histogram can be misleading. On the other hand, when the upper triangle region does contain scatter plots or 2D density plots, the audience may be interested in reading values off the axes.

I wonder if placing the physical-unit axes in the following places is something I can do at this moment (without you having to change the default behavior):

1) Show a y axis on the 2nd plot, where an axis in physical units is preferred for reading the graphs (knowing that space will be limited, so an axis like "1000, 2000, 3000,...6000" wouldn't fit).
2) Show a y axis in physical units on the top right-hand side for the upper triangle plots, and turn off the count axis.

[image]

Related thought: Histograms in the diagonal don't have count axes either. I doubt people would really care, but if the counts do matter, it should be possible to turn them on for any diagonal histogram; I am not sure how to do that at the moment.
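
One possible workaround for reading axes that the matrix does not show, sketched here under assumptions rather than offered as a supported option: extract the panels of interest with `pm[i, j]` and print them on their own, where each carries its own axes. This does not add axes inside the matrix itself. The iris columns and the names `pm`, `scatter_12`, and `hist_22` are illustrative.

```r
library(GGally)
library(ggplot2)

pm <- ggpairs(
  iris, 1:3,
  upper = list(continuous = "points"),   # scatterplots in the upper triangle
  diag  = list(continuous = "barDiag")   # histograms on the diagonal
)

# Upper-triangle scatterplot from row 1, column 2; printed alone,
# it shows both of its axes in the variables' own units.
scatter_12 <- pm[1, 2]

# Diagonal histogram from row 2, column 2; printed alone,
# it shows the count axis that the matrix omits for this panel.
hist_22 <- pm[2, 2]

print(scatter_12)
print(hist_22)
```

The extracted objects are regular ggplot objects, so the usual `+ labs(...)` or `+ scale_y_continuous(...)` adjustments apply before printing.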

Thanks for the timely response and all the work,

dicook commented 1 year ago

I think we were originally motivated by ggplot facets, but I do agree that the left and bottom could be a better default.