Open rdstern opened 6 years ago
If your calculation returns a single value you can use dplyr, e.g.
survey %>% dplyr::group_by(Village.) %>% dplyr::summarise(t.test(Field, Size)$p.value)
# # A tibble: 4 x 2
# Village. `t.test(Field, Size)$p.value`
# <fctr> <dbl>
# 1 KESEN 0.5246753
# 2 NANDA 0.0008996
# 3 NIKO 0.6953557
# 4 SABEY 0.0022165
This works in the calculator.
In the script window this would be:
survey <- InstatDataObject$get_data_frame(data_name="survey")
survey %>% dplyr::group_by(Village.) %>% dplyr::summarise(t.test(Field, Size)$p.value)
rm(list=c("survey"))
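For comparison, the same one-value-per-group calculation can be sketched in base R with split and sapply. This is only an illustration: the built-in iris data and its columns stand in for the survey data frame, and the names groups, p_values and result are made up for the example.

```r
# Built-in iris data stands in for the survey data frame (an assumption).
groups <- split(iris, iris$Species)

# One p-value per group, comparing two numeric columns within each group.
p_values <- sapply(groups, function(x) {
  t.test(x$Sepal.Length, x$Petal.Length)$p.value
})

# Collect the results into a data frame with one row per group.
result <- data.frame(Species = names(p_values), p_value = unname(p_values))
print(result)
```

Like the dplyr version, this only suits functions that return a single value per group.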
To get the full output for a statistical test you can use by:
by(survey, survey$Village., function(x) t.test(x$Field, x$Size))
# survey$Village.: KESEN
#
# Welch Two Sample t-test
#
# data: x$Field and x$Size
# t = 0.66, df = 12, p-value = 0.5
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -2.162 4.019
# sample estimates:
# mean of x mean of y
# 5.143 4.214
#
# ------------------------------------------------------------
# survey$Village.: NANDA
#
# Welch Two Sample t-test
#
# data: x$Field and x$Size
# t = 3.9, df = 19, p-value = 9e-04
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# 4.86 16.00
# sample estimates:
# mean of x mean of y
# 15.071 4.643
#
# ------------------------------------------------------------
# survey$Village.: NIKO
#
# Welch Two Sample t-test
#
# data: x$Field and x$Size
# t = -0.41, df = 7.6, p-value = 0.7
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -4.033 2.833
# sample estimates:
# mean of x mean of y
# 4.8 5.4
#
# ------------------------------------------------------------
# survey$Village.: SABEY
#
# Welch Two Sample t-test
#
# data: x$Field and x$Size
# t = 4.1, df = 10, p-value = 0.002
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# 3.361 11.439
# sample estimates:
# mean of x mean of y
# 11.4 4.0
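The object returned by by can also be kept and queried afterwards, rather than only printed. A minimal sketch, again assuming the built-in iris data in place of survey (the names tests, first and p_values are illustrative only):

```r
# Built-in iris data stands in for the survey data frame (an assumption).
tests <- by(iris, iris$Species, function(x) {
  t.test(x$Sepal.Length, x$Petal.Length)
})

# Each element is an ordinary "htest" object, so its parts can be extracted.
first <- tests[[1]]
print(first$p.value)
print(first$conf.int)

# Pull the p-value out of every group's test in one step.
p_values <- sapply(tests, function(tt) tt$p.value)
print(p_values)
```

This gives the full printed output when needed and the single values when a compact table is enough.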
That's great already! If we add that to my newly suggested dialogue in #4385 (no longer in the calculator), then we have the feature in at least one dialogue, which can be a stepping stone for those who want to use it elsewhere. I think the script window can then provide, at least initially, some of the looping we need.
Is there any way we can currently give the by command in R-Instat? I am thinking initially of a command we type into the calculator or the script window. That would give roughly the same power for functions (like the statistical tests) that the summary dialogues give for data manipulation.
The information says it is to "Apply a Function to a Data Frame Split by Factors"