The function is undocumented in the sense that it doesn't have a help file, since we never got around to putting these functions in a package. It is documented in the lab where it's introduced, or at least it was in the original version of that lab.
Here are the reasons why I like the function:
Current problems with the function:
OK, so first of all: reading this over, my initial post sounded far snippier than I intended, and I apologize for that.
Nevertheless, in the interest of collegial dialogue, I will press on. :-)
I agree with most of your points, and I can see how the clarification of the error messages would be helpful. I also haven't really used the function much, so forgive me if my allegations are off-base.
It's interesting that we both recognize the universality of the inferential process, but take opposite approaches towards emphasizing it. Your approach is to make one function that performs inference in a variety of common settings. My approach is to break the process down into common steps. I have no idea which is more effective! Maybe we could conduct an experiment?
Let me say a little bit more about my perspective on inference. I see it as a two-step process:

1. Determine the sampling distribution of your test statistic under the assumptions you are willing to make.
2. Compute the p-value (or confidence interval) from that distribution.
So in practice, to do step 1, you have to know what assumptions you are making and how that translates into the sampling distribution. If the sampling distribution is parametric, you have to know the parameters of that distribution. But then step 2 is easy, because it's just `p*(x, params)` (or `2 * p*(x, params)`, or at worst `2 * p*(x, params, lower.tail = FALSE)`), where `*` is the name of the distribution (e.g. `norm`, `t`, `f`, etc., or `data` in the case of a non-parametric distribution), `x` is the test statistic, and `params` are the parameters of `*`. None of these pieces are very hard, and they are all necessary to specify the test correctly. With confidence intervals it's basically the same thing, except you use `q*()`.
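For concreteness, here's what that pattern looks like for a t statistic (the numbers are made up):

```r
x <- 2.1    # observed test statistic (made-up)
df <- 24    # parameter of the sampling distribution (made-up)
2 * pt(x, df, lower.tail = FALSE)  # two-sided p-value via p*()
qt(c(0.025, 0.975), df)            # q*() gives critical values for a 95% CI
```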
What's nice about this is that it works for any sampling distribution, even ones that you haven't programmed into `inference()`. Can `inference()` handle randomization distributions like @andrewpbray has written up here? Fisher's exact test? Inference for regression? Correlation? Are you going to update the function every time a new thing pops up?
For the sake of discussion, I'm going to narrow the scope to the 8 (means/proportions) × (one/two samples) × (HT/CI) combinations of normal-distribution-based inferential situations.
In my ideal world, the use of the `inference()` function in labs would achieve two goals:

1. A theoretical goal: stress the unified nature of the inferential framework across these situations.
2. An applied goal: give students a tool they can keep using for inference beyond this class.
My views on how `inference()` fares:
Theoretical goal: While the `inference()` function does have unified syntax across the 8 combinations, IMO it does not stress the unified nature/process/framework. Its use is akin to the use of the boxed commands in other software packages: set the dials (in our case, 3 of them), enter the data, and get the output. Furthermore, the unified nature/process/framework (the sampling distribution and the test statistic) is conflated with the results.
Applied goal: An anecdote: I had a student who wanted to use `inference()` for her own work. She didn't understand that `inference()` wasn't part of base R, didn't find it flexible, and found the documentation sparse. While the latter two issues can be alleviated with work on our part, this anecdote illustrates why most students won't use `inference()` beyond this class; they'll either use the long-established built-in tools Ben listed or some other software. If this is the case, then why use `inference()` to achieve the applied goal?
I feel these two goals are somewhat at odds with each other and having a two-birds-with-one-stone approach is difficult. Furthermore, I think that trying to have a single function (with ample documentation) cover all bases will only increase its bloat and opaqueness.
My proposal: we modify `inference()` to favor the theoretical goal, make its outputs look as unified as possible across the 8 combinations, and make its use the bulk of the labs. Then, as an appendix, we articulate how you would conduct inference in practice using `t.test()`, `prop.test()`, `chisq.test()`, etc. Much like how I tell students that they will use normal tables and draw normal curves while taking the intro class, but will never do so in practice later.
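To sketch what that appendix might look like (purely illustrative, with made-up data and counts; note that each call reports both the test and the interval):

```r
t.test(rnorm(20, mean = 1), mu = 0)        # one-sample mean (HT and CI)
t.test(rnorm(15), rnorm(15, mean = 0.5))   # two-sample means
prop.test(42, 100, p = 0.5)                # one-sample proportion
prop.test(c(42, 57), c(100, 110))          # two-sample proportions
chisq.test(c(20, 30, 50))                  # goodness of fit
```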
I've not even looked at `inference()`, so I won't comment on how it works, how it could be modified, or whether it should exist at all, but I will put in a plug for not using any functions that are not in a package (even if it is your own package), where they are properly documented and play well with the R system -- unless they are truly one-time-use. Students should expect `?inference` and `example(inference)` to do something, and should not be copying and pasting function definitions from lab documents.
I think we all agree that any function should have a help file that is accessible in an expected way, and the best way to do this is by including the function in a package. That is exactly the current project for all custom functions in the OpenIntro labs, so unless the decision is to scrap `inference()` altogether, the documentation problem will be solved within the current project.
The comment on whether students will keep using the function beyond the course is an important one. For me, applicability beyond the intro stat course is one of the most important reasons for teaching R, so I agree that teaching tools within R that might not easily extend beyond the course might be orthogonal to that goal. But going down this road, one might equally say students won't use `mosaic` in their research/work either. I don't think this concern outweighs the benefits of teaching R with a consistent syntax early on.
@beanumber I think we agree on the learning objectives for inference -- I like your two-step summary. Also, `inference()` doesn't do all the tests you listed, but it does the ones that are in the textbook, since it was written to accompany the methods introduced in the text (a narrow view, perhaps, but it did the job for the labs).
Here are a few important things that I think the function does well:
1. It always plots the data alongside a sketch of the sampling distribution; functions like `t.test` don't do this. I think that visual is important for students to say things like "Ah, the centers of these distributions were close compared to how variable they are (as seen in the side-by-side box plots), so it's not surprising that I ended up with a large p-value (as seen in the sampling distribution sketch)." I think "do a bit of EDA before inference" and "sketch your sampling distribution" are important learning goals of intro stats, so a function that by default always shows these visualizations is useful.
2. `inference()` uses the same theoretical framework that is used in the textbook: `prop.test` gives $$\chi^2$$ scores, not $$Z$$ scores, but in the textbook students learn to do proportion tests with $$Z$$ scores, and how $$\chi^2$$ and $$Z$$ relate to each other is generally not discussed in intro stats. (A sketch follows below.)
3. With `inference()` we can do an ANOVA as a hypothesis test without `lm()`. Later in the book we discuss the relationship between a regression model and ANOVA, but using `inference()` allows for doing an ANOVA task in the lab at the same time it's introduced in class. (This parity is important for me, and also important for their projects mentioned in (4) below.)
4. `inference()` always expects variables (where each element in the vector is an observation in the sample) as input. `t.test` also does, but `prop.test` doesn't. The inconsistency is annoying.
5. Before we had the `inference()` function, labs resulted in a lot of cases of "R gives me an error I don't understand and I don't know where to go from here", many of which were driven by data type/class issues that we don't discuss in intro stat. Obviously parsing through these is a skill (an important one), but the management of the project was getting way too overwhelming. A custom function that checks for data types and reports custom error/warning messages has helped in this regard.

I'm open to solutions that address these goals. We might come up with our own solution, or there might already be something out there that I'm not aware of.
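Here is the $$\chi^2$$-vs-$$Z$$ sketch referenced in point 2 above, with made-up counts, showing that the X-squared statistic `prop.test()` reports is just the square of the $$Z$$ score students compute by hand:

```r
x <- 40; n <- 100; p0 <- 0.5                        # made-up counts
prop.test(x, n, p = p0, correct = FALSE)$statistic  # X-squared = 4
z <- (x / n - p0) / sqrt(p0 * (1 - p0) / n)         # hand-computed Z = -2
z^2                                                 # equals the X-squared value
```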
There is one thing `inference()` does that I think/thought is good, but I am not as firm on as the ones listed above: it highlights that the same problem (if certain conditions are met) can be solved via a parametric or a simulation-based method. In the function, the definitions of many of the arguments are the same, and you just need to flip the method switch. In some sense I think this is good. On the other hand, I'm not sure if it helps students understand how the simulation-based method really works.
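You can see that parallel without `inference()` at all; for example, a 95% CI for a mean computed both ways (toy data):

```r
set.seed(42)
x <- rnorm(25, mean = 1)                  # toy sample
t.test(x)$conf.int                        # parametric: t-based 95% CI
boot <- replicate(10000, mean(sample(x, replace = TRUE)))
quantile(boot, c(0.025, 0.975))           # simulation: bootstrap percentile CI
```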
Notes:

- One option is to write wrapper functions around the standard tools that add the behavior you want. (See `mosaic::xpnorm()` or `mosaic::xchisq.test()` for examples of this approach.) This allows new users to both become familiar with the standard tools and to get the extra behavior you would like them to see each time. Eventually, they may wean themselves off the "extras". Another option would be to write plot methods for objects of class "htest". (A rough sketch of the wrapper idea follows these notes.)
- Note that `prop.test()` does more than test for a single proportion, and that not all of the tests it can do are 1-df tests.
- On the input inconsistency: the `mosaic` package fixes this.

About the phrase "never got around to putting these in a package": this sort of code should be born in a package and incubated and developed there. Then there is no need to later "put it into a package".
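Here's a rough sketch of the wrapper idea (the name `xt.test` is hypothetical, not a real `mosaic` function):

```r
# Run the standard t.test(), but also sketch the sampling distribution
# with the observed statistic marked, so students see the picture every time.
xt.test <- function(x, ...) {
  res <- t.test(x, ...)
  df <- unname(res$parameter)
  stat <- unname(res$statistic)
  curve(dt(z, df), from = min(-4, stat - 1), to = max(4, stat + 1),
        xname = "z", xlab = "t statistic", ylab = "density")
  abline(v = stat, lty = 2)
  res  # still returns the usual "htest" object
}

xt.test(rnorm(20), mu = 0)  # usage: same interface as t.test()
```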
Sorry I'm late to the party here. It's been a good read, with some valuable thoughts.
@rpruim I think we'd all now agree about best practices for these things. This was less apparent to us in 2011, when we were first putting these things together as grad students. The motivation for moving the old code to a package on GitHub was exactly as you say: to incubate it and develop it, and this thread is a big help for that.
Another motivation for this function that I'm remembering: the `t.test()` function does the more robust Welch's t-test by default instead of the vanilla t-test. It was a bit confusing when the students would get different answers from R and from doing it "by hand".
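For example (toy data):

```r
set.seed(1)
x <- rnorm(10); y <- rnorm(12, mean = 0.5)
t.test(x, y)$p.value                    # Welch's t-test (R's default)
t.test(x, y, var.equal = TRUE)$p.value  # pooled ("vanilla"), matching the hand calculation
```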
In thinking about a single inference function versus @beanumber 's step-by-step approach, a couple thoughts come to mind. We probably all start these things with the students drawing the sampling distribution, calculating the statistic by hand, then finding a p-value. The step-by-step process follows in that same mold, with something like:
```r
mu <- 3       # null value
xbar <- 2.5   # sample mean
s <- 0.7      # sample standard deviation
n <- 14       # sample size
SE <- s / sqrt(n)               # standard error
stat <- (xbar - mu) / SE        # t statistic
2 * pt(-abs(stat), df = n - 1)  # two-sided p-value
```
While I like the cohesion with the pen-and-paper method, I have some reservations.
`lm()`, for example, hides all of it, which I think is ok. @rudeboybert, regarding your thoughts on how hiding the machinery weakens the theoretical understanding: would it be sufficient to have them do a line-by-line version in R before using the black box? I mean, a computer is a powerful black box, right? That's kinda why we like it. With great power comes great responsibility and all that. I think part of that responsibility is to go through the exercise once by hand as a proof of concept, and then use whatever model diagnostic techniques you can afterwards to ensure you didn't do anything too reckless.

So right now my inclination would be to keep the `inference()` function and view it as the `lm()` of the simpler inferential procedures. I like Randy's idea of writing methods for the graphics/diagnostics, so that we have a `summary.inf()` and a `plot.inf()` similar to `lm()`. Doing this, in addition to getting all the documentation in place, will be quite a bit of work, so @mine-cetinkaya-rundel, you would need to be OK with recruiting assistance if you need it.
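A bare-bones sketch of what those S3 methods could look like (the class name `"inf"` and its fields are placeholders, not the actual `inference()` internals):

```r
# Constructor: inference() would return an object of class "inf"
new_inf <- function(statistic, p_value, df) {
  structure(list(statistic = statistic, p_value = p_value, df = df),
            class = "inf")
}

summary.inf <- function(object, ...) {
  cat("t =", object$statistic, ", df =", object$df,
      ", p-value =", object$p_value, "\n")
  invisible(object)
}

plot.inf <- function(x, ...) {
  # sketch the sampling distribution with the observed statistic marked
  curve(dt(z, df = x$df), from = -4, to = 4, xname = "z",
        xlab = "test statistic", ylab = "density", ...)
  abline(v = x$statistic, lty = 2)
}

res <- new_inf(statistic = -2.67, p_value = 0.019, df = 13)
summary(res)  # dispatches to summary.inf()
plot(res)     # dispatches to plot.inf()
```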
Also, @beanumber, as an acolyte of Michael Lavine, I feel the need to formally object to your characterization of statistical inference as being solely the null hypothesis significance test. Take it back, sir! Take it back!
See https://github.com/beanumber/oiLabs-mosaic/issues/2:
> No offense to whoever wrote this, but I think I hate `inference()`. It's the worst: an undocumented, magic, black-box function.
>
> I'll offer two potential solutions: document it properly, or scrap it in favor of the standard tools. I favor the latter. The question is: does this foster or impede understanding of statistical concepts? I argue the latter. Is it realistic to suggest that if you want to do inference, you can just plug into a mysterious function? Or should we be reinforcing a conceptual understanding of inference by breaking the procedures into small steps? And isn't it important that a student specify how she is doing inference?
>
> I don't teach with this anyway. If you want to do a t-test, use `t.test()`. If you want to find a p-value in a normal sampling distribution, use `pnorm()`. If you want to find a p-value in a t-distribution, use `pt()`. If you want to find a p-value in a data-generated sampling distribution, use `mosaic::pdata()`.
>
> On a slightly more esoteric level, I'm not sure that any of the functions in this package are really useful. I don't mind teaching students about functions in `dplyr` or `mosaic` because I know those packages are widely used and likely to be well-supported in the future. But the functions in this package never get used outside of these labs, so are they really necessary?
>
> /diatribe