Closed pdeninis closed 1 year ago
can you please provide a reproducible code ?
Before sending the code I have run it on R 4.0.2. There the function works properly.
The graph above had been run in R 3.6.1, but all the packages had been updated. Maybe that was causing the issue?
Anyway, thank you for your package, and sorry for the false alarm!
EDIT Wait! I didn't realize there is a different issue
Now the positions are correct but the asteriscs and the "ns" on the first two brackets are swapped! The first version above was that right...
Here the code. Let me know if it works. REMOVED SINCE NOT WORKING. SEE BELOW.
Hi Alboukadel,
I have now installed R 4.0.5 with all new package versions. Unfortunately the result is again the preceding one, i.e. correct significance marks but wrong brackets positioning:
Hi Aboukadel, I apologize since the code I posted was incomplete: I didn't realized.
I fixed it and now it 's working.
library(rstatix)
library(ggpubr)
library(tidyverse)
library(emmeans)
test <-
structure(list(TREAT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L), .Label = c("CAF+PRF", "CAF+SCTG", "CAF"), class = "factor"),
GT_PRE = c(0.75574694999807401, 0.51377910615185396, 1.0502637416109999,
1.0529533576294099, 0.77628725655794695, 1.3024972039373,
1.29771873358056, 0.73223467651353302, 1.22548037918804,
0.70149281871692604, 1.2311467357729, 0.46592903526631801,
1.3053788021822601, 1.12747309255473, 1.0169638634654701,
1.46867879701079, 1.3887822192828201, 1.65913611414644, 0.94901409760925304,
0.68205176445635995, 0.75644708395693905, 1.3616212413887501,
0.80383327604294097, 0.83230193955718301, 0.55268821321151596,
1.0519620174778901, 0.85095516359899204, 1.4247162705459,
1.1978302782129999, 1.2008821995972401, 1.31677721147256,
1.0739940342185701, 0.56676654709428298, 0.758831881198758,
1.3437548491340201, 0.65625798126018697, 1.3642652746664301,
0.55179612606860495, 0.65226664683987401, 2.0293138768199199,
0.51765428550824999, 1.44939808663943, 1.4351805942674301,
0.79060226551773405, 1.30006971367069, 0.43285805219381801,
1.3771237596136201, 1.3424065948697199, 1.36634025195904,
0.91555239568191604, 1.3080349133981199, 0.91134124601174404,
0.95967271975461899, 0.30459336422795502, 1.2390723762771501,
1.03821439983465, 1.5266871239430899, 0.55959657184888001,
0.69628740796219102), GT_CH = c(1.5037592104009601, 0.37708778079514599,
0.45235941364484999, 1.47467219782683, 1.38937977222239,
0.38294772070726102, 0.53507913236442095, 0.51621910868277399,
1.5460292374508999, 0.57829211304672301, 0.78354210363062204,
1.27983126105837, 0.65175904047684896, 0.64853873777691096,
1.67261015217373, 0.37028803346980099, 0.80590716432980702,
0.73263456408220995, 0.70055849005411996, 1.2038454466045301,
0.85383942914639899, 1.2033768482948399, 0.78518775133726304,
1.1532594566047101, 0.95706532630658103, 1.09035994424734,
0.96203108898110901, 0.87836955861037402, 1.1076942857758001,
0.91892022371736704, 0.91944572314908601, 1.23935985446435,
0.96783791234521399, 0.864725786870432, 0.91795738647703295,
0.960009924034951, 1.1132551653164899, 1.0893876577154999,
0.81407123000060999, 0.45680795412162301, 0.36293109415511299,
0.302514131741624, 0.209648306428177, 0.33702680696005899,
0.34665042370576998, 0.24027654459536399, 0.18939783794448001,
0.29675637555508699, 0.50588365046038697, 0.38409189747421602,
0.37343044300995298, 0.215801805337029, 0.25930223824998,
0.35320250264132103, 0.25379570499944498, 0.22825343224356601,
0.31567886771034698, 0.49647580510056799, 0.172074177565914
), Subgroup = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L,
1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 1L), .Label = c("LOW", "HIGH"), class = "factor")), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L,
30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L,
43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L,
56L, 57L, 58L, 59L, 60L), class = "data.frame")
res.aovo<-test %>% anova_test(GT_CH ~ TREAT * Subgroup)
pwco <- test %>% group_by(TREAT) %>% emmeans_test(GT_CH ~ Subgroup, p.adjust.method = "bonferroni")
pwco <- pwco %>% add_xy_position(x = "TREAT")
bxpo <- ggpubr::ggboxplot(test, x = "TREAT", y = "GT_CH", color = "Subgroup", palette = "jco" )+ labs(y = "Gingival Thickness Increase (mm)") + labs(title = "GT Increase Factorial ANOVA Treatment * Subgroup (Baseline cutoff = 1.2)", subtitle="(without case 17)")
bxpo + stat_pvalue_manual(pwco) + labs( subtitle = get_test_label(res.aovo, detailed = TRUE), caption = get_pwc_label(pwco)) + labs(y = "Gingival Thickness Increase (mm)") + labs(title = " GT Increase ~ Treatment * Subgroup (Baseline cutoff = 1.2) w/o case `17")
Unfortunately the result is still the following:
Hi Alboukadel,
I took a look into the content of the
pwco<-test %>% group_by(TREAT) %>% emmeans_test(GT_CH ~ Subgroup, p.adjust.method = "bonferroni")
assignment.
Unexpectedly, I saw:
str(pwco)
tibble [3 x 15] (S3: rstatix_test/emmeans_test/tbl_df/tbl/data.frame)
$ TREAT : chr [1:3] "CAF" "CAF+PRF" "CAF+SCTG"
$ term : chr [1:3] "Subgroup" "Subgroup" "Subgroup"
$ .y. : chr [1:3] "GT_CH" "GT_CH" "GT_CH"
$ group1 : chr [1:3] "LOW" "LOW" "LOW"
$ group2 : chr [1:3] "HIGH" "HIGH" "HIGH"
$ df : num [1:3] 53 53 53
$ statistic : num [1:3] -0.175 3.289 -0.065
$ p : num [1:3] 0.86202 0.00179 0.94843
$ p.adj : num [1:3] 0.86202 0.00179 0.94843
$ p.adj.signif: chr [1:3] "ns" "**" "ns"
$ y.position : num [1:3] 1.814 1.38 0.647
$ groups :List of 3
..$ V1: chr [1:2] "LOW" "HIGH"
..$ V1: chr [1:2] "LOW" "HIGH"
..$ V1: chr [1:2] "LOW" "HIGH"
$ x : num [1:3] 1 2 3
$ xmin : num [1:3] 2.8 0.8 1.8
$ xmax : num [1:3] 3.2 1.2 2.2
- attr(*, "args")=List of 10
..$ data : tibble [59 x 4] (S3: grouped_df/tbl_df/tbl/data.frame)
.. ..$ TREAT : Factor w/ 3 levels "CAF+PRF","CAF+SCTG",..: 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ GT_PRE : num [1:59] 0.756 0.514 1.05 1.053 0.776 ...
.. ..$ GT_CH : num [1:59] 1.504 0.377 0.452 1.475 1.389 ...
.. ..$ Subgroup: Factor w/ 2 levels "LOW","HIGH": 1 1 1 1 1 2 2 1 1 1 ...
.. ..- attr(*, "groups")= tibble [3 x 2] (S3: tbl_df/tbl/data.frame)
.. .. ..$ TREAT: Factor w/ 3 levels "CAF+PRF","CAF+SCTG",..: 1 2 3
.. .. ..$ .rows: list<int> [1:3]
.. .. .. ..$ : int [1:19] 1 2 3 4 5 6 7 8 9 10 ...
.. .. .. ..$ : int [1:20] 20 21 22 23 24 25 26 27 28 29 ...
.. .. .. ..$ : int [1:20] 40 41 42 43 44 45 46 47 48 49 ...
.. .. .. ..@ ptype: int(0)
.. .. ..- attr(*, ".drop")= logi TRUE
..$ formula :Class 'formula' language GT_CH ~ Subgroup
.. .. ..- attr(*, ".Environment")=<environment: 0x000000002ab63c38>
..$ covariate : NULL
..$ ref.group : NULL
..$ comparisons : NULL
..$ p.adjust.method: chr "bonferroni"
..$ conf.level : num 0.95
..$ model : NULL
..$ detailed : logi FALSE
..$ method : chr "emmeans_test"
- attr(*, "emmeans")= tibble [6 x 8] (S3: tbl_df/tbl/data.frame)
..$ TREAT : Factor w/ 3 levels "CAF+PRF","CAF+SCTG",..: 1 1 2 2 3 3
..$ Subgroup : Factor w/ 2 levels "LOW","HIGH": 1 2 1 2 1 2
..$ emmean : num [1:6] 1.012 0.609 0.998 1.006 0.305 ...
..$ se : num [1:6] 0.0743 0.0973 0.0665 0.1152 0.0814 ...
..$ df : num [1:6] 53 53 53 53 53 53
..$ conf.low : num [1:6] 0.863 0.414 0.864 0.776 0.142 ...
..$ conf.high: num [1:6] 1.161 0.804 1.131 1.237 0.468 ...
..$ method : chr [1:6] "Emmeans test" "Emmeans test" "Emmeans test" "Emmeans test" ...
..- attr(*, "estName")= chr "emmean"
..- attr(*, "clNames")= chr [1:2] "lower.CL" "upper.CL"
..- attr(*, "pri.vars")= chr [1:2] "TREAT" "Subgroup"
..- attr(*, "adjust")= chr "none"
..- attr(*, "side")= num 0
..- attr(*, "delta")= num 0
..- attr(*, "type")= chr "link"
..- attr(*, "mesg")= chr "Confidence level used: 0.95"
Namely, the TREAT variable first appears as a characters variable and then as a factor. I promptly checked the old output I had conserved and compared the two:
#Old output
>pwco <- pasquale17 %>% group_by(TREAT) %>% emmeans_test(GT_CH ~ Subgroup, p.adjust.method = "bonferroni")
>pwco
# A tibble: 3 x 9
TREAT .y. group1 group2 df statistic p p.adj p.adj.signif
* <fct> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 CAF+PRF GT_CH LOW HIGH 53 3.29 0.00179 0.00179 **
2 CAF+SCTG GT_CH LOW HIGH 53 -0.0650 0.948 0.948 ns
3 CAF GT_CH LOW HIGH 53 -0.175 0.862 0.862 ns
# Current output
> pwco <- test %>% group_by(TREAT) %>% emmeans_test(GT_CH ~ Subgroup, p.adjust.method = "bonferroni")
> pwco
# A tibble: 3 x 10
TREAT term .y. group1 group2 df statistic p p.adj p.adj.signif
* <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 CAF Subgr~ GT_CH LOW HIGH 53 -0.175 0.862 0.862 ns
2 CAF+P~ Subgr~ GT_CH LOW HIGH 53 3.29 0.00179 0.00179 **
3 CAF+S~ Subgr~ GT_CH LOW HIGH 53 -0.0650 0.948 0.948 ns
> str(test)
'data.frame': 59 obs. of 4 variables:
$ TREAT : Factor w/ 3 levels "CAF+PRF","CAF+SCTG",..: 1 1 1 1 1 1 1 1 1 1 ...
$ GT_PRE : num 0.756 0.514 1.05 1.053 0.776 ...
$ GT_CH : num 1.504 0.377 0.452 1.475 1.389 ...
$ Subgroup: Factor w/ 2 levels "LOW","HIGH": 1 1 1 1 1 2 2 1 1 1 ...
As you can see, the tibble has now 10 columns instead of 9 (there is an extra "term" column). It results even in the datanovia site, that is apparently not yet updated. In particular, in each occurence of the group_by() function, the first column, the grouping variable, is still reported as a factor.
The important difference is that the grouping column (TREAT), which was earlier reported as a <fct>
, is now reported as a <chr>
column, although the original TREAT variable in the data is always a factor.
Now the CAF+PRF TREAT is in the second row, so it gets the positioning of the CAF+SCTG TREAT (2nd row, according to the ordering as a factor), CAF+SCTG is in the third (and gets the positioning of CAF, which is in the 3rd row, according to the ordering as a factor) etc. This behavior perfectly matches that of the brackets.
I guess the issue might have passed unnoticed so far since in this case the factor TREAT has undergone a levels reordering, so that they do no longer match those of the character variable. I think that if using a factor whose levels have not been "reordered" the issue does not occur.
Therefore, I fear the issue was due to a change in the dplyr group_by() function. However, I believe it is worth finding a workaround to fix it in ggpubr, allowing for the use of a factor whatsoever.
The way to reorder the new character vector :
> pwco$TREAT
[1] "CAF" "CAF+PRF" "CAF+SCTG"
to match the factor vector, could be to add:
pwco$TREAT[order(order(levels(TREAT)))]
[1] "CAF+PRF" "CAF+SCTG" "CAF"
Unfortunately, while:
pwco[order(order(levels(TREAT))),]
# A tibble: 3 x 10
TREAT term .y. group1 group2 df statistic p p.adj p.adj.signif
<chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
1 CAF+P~ Subgr~ GT_CH LOW HIGH 53 3.29 0.00179 0.00179 **
2 CAF+S~ Subgr~ GT_CH LOW HIGH 53 -0.0650 0.948 0.948 ns
3 CAF Subgr~ GT_CH LOW HIGH 53 -0.175 0.862 0.862 ns
works as intended, adding the code to the stat_pvalue_manual() function is not sufficient to fix it, but results in an error:
bxpo + stat_pvalue_manual(pwco[order(order(levels(TREAT))),]) + labs( subtitle = get_test_label(res.aovo, detailed = TRUE), caption = get_pwc_label(pwco)) + labs(y = "Gingival Thickness Increase (mm)") + labs(title = " GT Increase ~ Treatment * Subgroup (Baseline cutoff = 1.2) w/o case 17")
Error in .valide_y_position(y.position, data) :
can't find the y.position variable 'y.position' in the data
Therefore I think either a way to reorder the new output from group_by() according to that of the factor variable (in the plot) should be found or a nominal addressing to the groups should be used.
Another solution may be to warn the dplyr developers of the problem.
The issue is caused by broom:tidy() at https://github.com/kassambara/rstatix/blob/c36a70f22a8aeac58ffc9190decd70db18d0ebd1/R/emmeans_test.R#L135, which is changing factor to character since version broom version 0.5.6.
Well done, the culprit was found. Have you thought of a workaround? Might a fast solution (surely not optimal, but useful in the meantime you release the fix) be to include in your package the old version of the function changed after having renamed it?
fixed now, thanks for reporting the issue and for your reproducible scripts. Main changes done at: https://github.com/kassambara/rstatix/pull/170
You need the latest dev versions of ggpubr and rstatix
Hi Alboukadel, I have installed R 4.2.2 with ggpubr and rstatix. It doesn't work.
I checked the version of the installed packages and got:
> packageVersion("ggpubr")
[1] ‘0.5.0’
> packageVersion("rstatix")
[1] ‘0.7.1’
I read on your link that rstatix 0.7.1.999 is needed, but I can't find it. What exactly should I do?
Please consider installing the latest dev version of the two packages as follow:
devtools::install_github("kassambara/rstatix")
devtools::install_github("kassambara/ggpubr")
After installation, everythyng should work. Please let me know if this is not case.
Thanks! Now it is OK.
I made this plot some time ago.
As you can see, the brackets were perfectly positioned.
Today, after updating some packages - among which maybe ggpubr too - I have used the same code to redo the plot, and this is the result:
Have you some idea about what happened? It seems to me that the position computed for the first group has been used for the bracket of the third one, the position for the second has been used for the first group and that of the third group ended up below the second one...
Thanks.