transparentstats / guidelines

Transparent Statistics in HCI guidelines, FAQs, and exemplar analyses
https://transparentstats.github.io/guidelines/

Decide on organizing exemplars by experimental design #71

chatchavan opened this issue 6 years ago (status: Open)

chatchavan commented 6 years ago

From Steve (21.12.17)

I was thinking about the structure of the exemplars, and I wonder if it'd be more clear to name the exemplars using the type of data/experiment rather than just the reported statistic. Basically, structure the sections using what the reader already has (the data and experiment design) rather than what they don't know yet (how to report the analysis).

That would give us sections for:

  1. Between Subject - Single Independent Variable
  2. Between Subject - Multiple Independent Variables
  3. Within Subject - Single Independent Variable
  4. Within Subject - Multiple Independent Variables
  5. Mixed Within and Between - Multiple Independent Variables

Then have simple ES, standard ES, and maybe Bayesian as subsections. (Or maybe a flipped hierarchy?)

It would help us see what we haven't yet addressed, and it would emphasize that choosing an exemplar to copy shouldn't be arbitrary. As I've said before, I'm most worried about people following an exemplar for a type of data that isn't appropriate.
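To make the "simple ES" vs. "standardized ES" distinction concrete, here is a minimal Python sketch with made-up task times (purely illustrative, not drawn from any exemplar): the simple effect size is the difference of means in the original units, while a standardized effect size such as Cohen's d divides that difference by the pooled standard deviation.

```python
import numpy as np

# Made-up task-completion times (seconds) for two hypothetical interfaces.
rng = np.random.default_rng(1)
time_a = rng.normal(12.0, 3.0, 30)
time_b = rng.normal(10.5, 3.0, 30)

# Simple effect size: difference of means, in the units of the measurement.
simple_es = time_a.mean() - time_b.mean()

# Standardized effect size: Cohen's d, the same difference divided by the
# pooled standard deviation (equal group sizes, so the pooled variance is
# just the average of the two sample variances).
pooled_sd = np.sqrt((time_a.var(ddof=1) + time_b.var(ddof=1)) / 2)
cohens_d = simple_es / pooled_sd

print(f"simple ES (mean difference): {simple_es:.2f} s")
print(f"standardized ES (Cohen's d): {cohens_d:.2f}")
```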

From Pierre (21.12.17)

I think it's a good breakdown, and the flipped version would work as well (since the doc is about ES), but it also looks a bit too much like a textbook-style structure. Three possible drawbacks I see:

1) it may cover too much, since the topic is ES and not "how to analyze data from your experiments".

2) this type of structure can easily be interpreted as a recipe or a decision tree where people just pick their analysis depending on their type of experiment, which might discourage thinking.

3) there are other structures we may want to emphasize as well, e.g. how ES interpretation depends on the experiment's context.

That's why I also like the "one simple example, one complex example" approach. It's incomplete, so it will force people to think. It also yields a shorter document.

I think it's fine for exemplars not to be exhaustive, the name kind of implies it.

From Matt (21.12.17)

  1. Between Subject - Single Independent Variable
  2. Between Subject - Multiple Independent Variables
  3. Within Subject - Single Independent Variable
  4. Within Subject - Multiple Independent Variables
  5. Mixed Within and Between - Multiple Independent Variables

Now that I have thought more about this, I wonder if some of this set of exemplars is more appropriate for a "study design" guideline or maybe a "repeated measures" guideline.

For example, the difference between showing simple effect size calculation for "Between Subject - Single Independent Variable" and "Within Subject - Single Independent Variable" is really just whether you're using a paired or independent samples t test. Otherwise it is the same report: the mean difference with a confidence interval. So within the context of the Effect Size guideline it might be redundant, but if we had a Repeated Measures guideline with a "Within Subject - Single Independent Variable" exemplar that describes why you need a paired t test in that case, we could refer people to that exemplar from the "Simple Effect Size" exemplar in the Effect Size guideline.
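To illustrate this point, a minimal Python sketch with fabricated task times (purely illustrative, not taken from the guideline): both designs report the same quantity, a mean difference with a 95% CI; only the standard error and degrees of freedom change, because the within-subjects analysis works on per-participant differences (a paired t test) rather than on two independent groups.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 30

# Between subjects, single IV: two independent groups of hypothetical task times.
group_a = rng.normal(12.0, 3.0, n)
group_b = rng.normal(10.5, 3.0, n)
diff_between = group_a.mean() - group_b.mean()
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)  # equal n
se_between = pooled_sd * np.sqrt(2 / n)
ci_between = stats.t.interval(0.95, 2 * n - 2, loc=diff_between, scale=se_between)

# Within subjects, single IV: the same participants under both conditions,
# so the analysis works on per-participant differences (paired t test).
cond_a = rng.normal(12.0, 3.0, n)
cond_b = cond_a - rng.normal(1.5, 1.0, n)   # correlated second measurement
diffs = cond_a - cond_b
diff_within = diffs.mean()
se_within = diffs.std(ddof=1) / np.sqrt(n)
ci_within = stats.t.interval(0.95, n - 1, loc=diff_within, scale=se_within)

print(f"between-subjects: {diff_between:.2f} s, "
      f"95% CI [{ci_between[0]:.2f}, {ci_between[1]:.2f}]")
print(f"within-subjects : {diff_within:.2f} s, "
      f"95% CI [{ci_within[0]:.2f}, {ci_within[1]:.2f}]")
```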

Trying to generalize this intuition, I think that one criterion that might help us cut down the combinatorial explosion of possible exemplars we could put into a specific guideline is to focus on a set of exemplars that differ from each other in ways that are directly relevant to the fundamental concepts in that section, and to either omit other exemplars, move them to an appendix, or move them to another guideline.

In other words: exemplars within a guideline should differ primarily along dimensions strongly related to the core topic of that guideline.

chatchavan commented 6 years ago

I agree with Matt that the exemplars don't need to be exhaustive and should "differ primarily along dimensions strongly related to the core topic of that guideline".

However, I still think that Steve's proposed by-experimental-design structure has two benefits:

  1. As Steve mentioned, it could help prevent applying the wrong technique, measurement, or function to the analysis.

  2. It lets contributors see the gaps: analysis combinations that are not yet covered by any exemplar. This could prompt contributors to consider whether those combinations need special treatment.

I'd propose to use Matt's rule of thumb and add a section at the end of the exemplars indexing them by experimental design. For example, the structure of the effect size chapter would look like the following:

2 Effect size
  2.1 FAQ
  2.2 Exemplar: Simple effect size
  2.3 Exemplar: Within-subjects experiment
  2.4 Exemplar: Standardized effect size
  2.5 Exemplar: Nonparametric effect size
  2.6 Index of exemplars by experimental design

This would still present one strong thread through the FAQ and the sequence of exemplars. It also makes the decision tree much less prominent, which addresses Pierre's comment that having a decision tree "might discourage thinking".

mjskay commented 6 years ago

I like this idea. It could potentially expand into something more generic, like an "index of related exemplars" that points to related exemplars in other topics. That way, if an exemplar that happens to use effect sizes lives in another topic, we could point to it from there.