I'm working through the book sequentially and these are my comments on chapter 9. I've also made some direct edits to the text for more minor fixes and writing adjustments, and opened a pull request.
"These conditions instantiate specific factors of interest." <- vague, unclear what a 'factor' is
"fully crossed two factors" <- It might be not be obvious what 'fully crossed' means and a visualisation might help e.g., a simple 2x2 table. This is also an opportunity that people often refer to experimental designs in a way that's analogous to a table (2x2 design, 3x4 design etc). Also further down the page we say "These factors are fully crossed: each level of each factor is combined with each level of each other" but maybe we should move that up to here when the concept is first encountered.
Fig 9.2 there are a few abbreviations in the figure; I added definitions to the caption in my pull request, but I don't know what CRs means.
"we’ll try to stay consistent by describing an experiment as a relationship between some manipulation in which participants are randomly assigned to an experimental condition to evaluate its effects on some measure." <- note that we used 'treatment' rather than 'manipulation' a fair bit in the stats chapters.
After saying we'll be consistent with terminology, we immediately start talking about factors rather than manipulations!
Participants (N=10) — yikes...
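To make the 'yikes' concrete, here's a quick power simulation I put together (assuming, purely for illustration, n = 10 per group and a medium effect of Cohen's d = 0.5):

```python
import numpy as np
from scipy.stats import ttest_ind

# Simulated power for a two-sample t-test with n = 10 per group and a
# medium effect (d = 0.5); all numbers here are illustrative.
rng = np.random.default_rng(1)
n, d, alpha, sims = 10, 0.5, 0.05, 10_000

hits = 0
for _ in range(sims):
    control = rng.normal(0.0, 1.0, n)
    treatment = rng.normal(d, 1.0, n)   # true effect of size d
    if ttest_ind(control, treatment).pvalue < alpha:
        hits += 1

print(f"Estimated power: {hits / sims:.2f}")  # roughly 0.18, far below 0.80
```

So even a medium-sized effect would be missed most of the time at this sample size.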
Somewhat confusing arrangement of figures and captions here.
"An alternative approach is simply to acknowledge the possibility of carry-over type effects and plan to analyze these within your statistical model " <- and counter-balancing?
Figure 9.7 unclear why some things are underlined
"The simplest way you can do a repeated measures design is by administering your treatment and then administering your measure multiple times. This scenario is pictured in a between-participants design in Figure 9.8." <- I got a bit confused here because I've sometimes seen "repeated measures" design used synonymously with "within-subjects" design, but here were talking about taking multiple measures in a between subjects design. Perhaps say something about repeated measures within a condition vs across conditions?
"the effect is very replicable with that particular set but not generalizable across other sets" <- that's kind of weird isn't it?
dose-response design <- presumably has implications for power? More participants / measures needed to compensate?
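On that point, a rough simulation (my numbers, entirely invented) could estimate how often a linear dose-response trend reaches significance with a given sample:

```python
import numpy as np
from scipy.stats import linregress

# Illustrative power check for a dose-response design: simulate a linear
# effect of dose and count how often the trend test comes out significant.
rng = np.random.default_rng(2)
doses = np.repeat([0, 1, 2, 3], 15)  # 15 participants per dose level (made up)
slope, sims, hits = 0.25, 5_000, 0

for _ in range(sims):
    y = slope * doses + rng.normal(0.0, 1.0, doses.size)
    if linregress(doses, y).pvalue < 0.05:
        hits += 1

print(f"Estimated power: {hits / sims:.2f}")  # noticeably below 0.80 with these toy numbers
```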
Fig 9.11 caption is incomplete: "Confounding order and condition leads to"
"When we designed a within-participants experiment, we introduced an order confound: if Dylan was always played first, then we didn’t know whether a change in our measure was caused by Dylan directly" <- A surprise appearance from Bob Dylan here (I think this might be from an old example that we've since changed).
Figure 9.14 axis labels are not legible
"N-back task" <- add brief sidenote to explain what this is?
"This finding spurred immense interest in the scientific community, but the consensus did not support these early findings" <- consensus of what? Scientists/evidence?
"convincing evidence for far transfer" <- unclear what far transfer is relative to the kind of transfer previously discussed
"active control group" <- we've not said what this is — need to clarify it is akin to a placebo group as we discussed above?
"The first advertised that “numerous studies have shown working memory training can increase fluid intelligence” (placebo group)" <- although this is supposed to represent a placebo, in the context of this experiment's design, its not correct (or at least confusing) to called it a placebo group? There's an intentional 'active' component to the manipulation.
"using a cover story to mask the purpose of an experiment" <- maybe an opportunity to refer back to the ethics chapter here — how do we reconcile informed consent with this design issue? Answer (I think) is that ethics often takes precedent, unless we have good reason to think our manipulation is the kind of thing participants commonly encounter in everyday life, and then it may be considered acceptable to temporarily mask it. And also if there's any initial deception it must be revealed in the debrief.
"experimenter expectancy effects" <- perhaps we have enough on this, but another nice example is the Doyen et al. replication (old people priming) which found that the measurement of the DV was impacted by experimenter's beliefs (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0029081) [not sure if there's a different name for expectancy effects that have their impact via participants and those that impact measurement directly?]
"In psychology, the most common modern protection against expectancy is the delivery of interventions by a computer platform that can give instructions in a coherent and uniform way across conditions" <- this protects against reactivity from the direct presence of the experimenter, but not indirect effects (like just figuring out what the hypothesis/intervention is and trying to please the experimenter or behave 'as expected', even if you never meet them).
"In the the implicit theory of mind case study we began with, the stimulus contained an animated Smurf character" <- we didn't mention earlier it was a Smurf, so need to de-Smurf this part, or re-Smurf the case study
"In the the implicit theory of mind case study" <- not sure if we can cross ref to boxes, but if so, include one here?
"external validity" <- perhaps start this section with a more straightforward example. The example we use "Is the effect caused by having the money, or receiving the money with no strings attached?" doesn't seem like the most obvious demonstration of this concept — in fact it seems more like a question of internal validity to me (are we measuring what we think we're measure) rather than external validity (do the findings generalise to other contexts).