OpenIntroStat / openintro-statistics

📚 An open-source textbook written at the college level. OpenIntro also offers a second college-level intro stat textbook and also a high school variant.
https://www.openintro.org/book/os

Pedagogy discussion: alternative sequencing of statistical inference #39

Open rdisalv2 opened 3 years ago

rdisalv2 commented 3 years ago

Hi all,

I'm a college instructor and teach courses in statistics for social science. I have a specific question about pedagogy and ordering of topics in an intro statistics textbook and course. I'm posting this here for two reasons: (1) I believe I'm likely to receive good feedback from the experts working on OpenIntro Statistics (OIS), and (2) I have been hoping to switch my textbook for one of my courses to OIS, and will make a fork of OIS and make the following changes if I'm still convinced they'll improve pedagogy.

For background, the strategy to explain inference in OIS appears to be:

1) In chapter 5, present repeat-sample inference for proportions: the central limit theorem, then confidence intervals and hypothesis tests for proportions using the normal distribution.

2) Then, in chapter 7, introduce the idea that the sample mean is approximately normal in large samples if sigma is known. Section 7.1.3 then explains that when sigma is not known, the t-distribution is used instead, and the chapter goes on to cover confidence intervals, etc. with t-distributions.

This is a very common approach to statistics instruction. I was curious what you thought about an alternative approach:

1) Present repeat-sample inference for means of numeric variables first, and introduce the central limit theorem, confidence intervals and hypothesis tests for these. (Start with means rather than proportions.)

2) Explain that if the sample size is large enough, the normal distribution approximation still works well even if the sample standard deviation is plugged in for sigma. Thus avoid introducing the t-distribution at this point, and keep emphasizing the normal distribution approximation works for large samples. Have exercises where students use the normal distribution but plug in s for sigma, rather than using the t-distribution. Call these "large sample confidence intervals and hypothesis tests."

3) Then point out that proportions are means (the mean of an indicator variable is a proportion), so that all the methods introduced for means can be used for proportions. This basically skips all the proportions analysis in a traditional stats course (binomial distribution etc.), and I think it communicates the essence of what is done in practice for proportions. (The advantage of using the explicit variance formula for proportions is, I think, a smaller standard error, but this advantage disappears as n -> infinity. There are of course problems with the CLT when either np or n(1-p) is small, but that happens when the sample size is small and the variable is very skewed (an indicator variable is skewed when it is very often 0 or very often 1). Small sample sizes and/or very skewed variables also lead to CLT problems with numeric variables, so this can also be seen as a special case of what happens with means.)

4) Then in an "In Practice" chapter, talk about (a) the t-distribution improvement (which is dropped in for the normal distribution), (b) the specific proportion standard deviation formula (and improvements that result from using it rather than the usual formula for s), and (c) anything else skipped because of the restructuring.
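
To make points (2) and (3) concrete, here is a minimal Python sketch (standard library only) of a "large sample" interval that plugs s in for sigma, applied to an indicator variable and compared with the usual proportion formula. The function names and the 300-out-of-1000 sample are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def large_sample_ci(values, level=0.95):
    """Normal-approximation CI for a mean, plugging in s for sigma."""
    n = len(values)
    xbar = mean(values)
    se = stdev(values) / math.sqrt(n)  # stdev uses the n - 1 denominator
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return xbar - z * se, xbar + z * se

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation CI using SE = sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return p_hat - z * se, p_hat + z * se

# Hypothetical sample: an indicator variable with 300 ones out of 1000.
n, successes = 1000, 300
indicator = [1] * successes + [0] * (n - successes)

ci_mean = large_sample_ci(indicator)      # treat the proportion as a mean
ci_prop = proportion_ci(successes, n)     # dedicated proportion formula

width_mean = ci_mean[1] - ci_mean[0]
width_prop = ci_prop[1] - ci_prop[0]
print(ci_mean, ci_prop, width_mean / width_prop)
```

For a 0/1 variable, s^2 equals p_hat(1 - p_hat) * n/(n - 1), so the two intervals share the same center and their widths differ only by the factor sqrt(n/(n - 1)). That is the "smaller standard error" mentioned in point (3), and it shrinks toward nothing as n grows.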

A related thought I've had is to talk only about two-tailed hypothesis tests during points (1) through (3) above, and then to introduce one-tailed hypothesis tests in point (4). (In social science, folks practically always use two-tailed tests anyway; I think one-tailed tests are kind of a distraction during the period where the "meat of statistical inference" is being introduced.)
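
For reference, here is a rough Python sketch of how the two kinds of p-value relate in the large-sample setting, again plugging s in for sigma. The function name `large_sample_z_test`, the data, and the null value of 50 are all made up for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def large_sample_z_test(values, mu0):
    """Return (z, two-tailed p, upper-tail p) using the normal approximation."""
    n = len(values)
    se = stdev(values) / math.sqrt(n)  # s plugged in for sigma
    z = (mean(values) - mu0) / se
    std_normal = NormalDist()
    upper_tail = 1 - std_normal.cdf(z)
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    return z, two_tailed, upper_tail

# Hypothetical sample of 80 measurements, tested against mu0 = 50.
sample = [52, 48, 55, 51, 49, 53, 50, 54] * 10
z, p_two, p_upper = large_sample_z_test(sample, mu0=50)
print(z, p_two, p_upper)
```

When the observed z falls in the direction of the one-sided alternative, the two-tailed p-value is exactly twice the one-tailed one, so postponing one-tailed tests to point (4) costs little mechanically.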

In general, I think intro stats courses might introduce too much too quickly, so I have been thinking about this approach, which makes means and the normal distribution a common thread while intuition about statistics is being built, with some practical adjustments provided at the end. I would be grateful for any thoughts on this approach! (I'm especially interested in what could go wrong.)

DavidDiez commented 3 years ago

Hi Richard,

Here are my current thoughts:

Ultimately, there are going to be flaws in any approach, so if there's a way you are most excited about introducing something, that enthusiasm is also important. If your students get excited about learning because you're excited about the way the material is introduced, I imagine that they'll put more effort in and ultimately learn more. That is, treat the comments above as considerations, not claims that one way is strictly better for every instructor and classroom.

I hope that helps!

Best, David