Open elnaggarj opened 1 year ago
Hi thanks a lot for opening the PR. This is a much-asked-for feature.
Can you check if your implementation also works for adjusting p-values across facets? If I remember correctly, this was not trivial, and one of the reasons I did not include this originally.
I appreciate your response and for bringing up this issue. I did not think about faceting, and it's not as simple as it seems.
Currently the method will correct each faceted graph individually. Take the following example from mtcars where hp
is compared to gear
where I compare all relevant gear combinations (3 vs 4, 3 vs 5, and 4 vs 5; 3 comparisons)
As you can see, E and the top of D are the same figure and it does not adjust the P values for the whole figure. This is because the correction is based on the length of comparisons parameter (in this case 3).
In order to solve this we could either include this as a caveat or have a way to incorporate the total number of comparisons into this method. I'm looking into ways to determine the total number of facets within the geom_signif function. Alternatively, we could just use a second argument that will specify the number of total comparisons. Let me know your thoughts!
Thank you for your time!
Thanks for confirming the issue
As you can see, E and the top of D are the same figure and it does not adjust the P values for the whole figure. This is because the correction is based on the length of comparisons parameter (in this case 3).
The problem goes even deeper than that, I am afraid. For the Benjamini-Hochberg correction (FDR), the "adjusted p-values" depend on all other p-values. It is necessary to somehow aggregate all p-values into one vector and call p.adjust
once. (There are some tricks for working with chunks described by Madar and Batista (2016), but I doubt that their method resolves our implementation problem).
As a side note, this problem is specific to the Benjamini-Hochberg method. For the Bonferroni correction specifying the number of tests would be sufficient.
In order to solve this we could either include this as a caveat or have a way to incorporate the total number of comparisons into this method. I'm looking into ways to determine the total number of facets within the geom_signif function. Alternatively, we could just use a second argument that will specify the number of total comparisons. Let me know your thoughts!
I am not convinced that correcting per facet is good enough. This could give the user a false feeling of comfort and I would rather have this obvious gap of unadjusted p-values, than subtly wrong FDR values.
I just checked the source for ggplot2::Stat
and noticed that there is a finish_layer
function which we could override. Could you check if that function is called with the full dataset, so that we could move the p.adjust
call to there?
Added Multiple P value testing via FDR functionality to ggsignf through the TRUE/FALSE parameter "FDR"