Closed TomHardwicke closed 1 year ago
@TomHardwicke these comments were immensely helpful. I did a full rewrite of the chapter, including changing all text in several sections and moving the Shepard example down to the last section. I'd love to hear your thoughts on the rewrite.
super, much clearer!
I'm working through the book sequentially and these are my comments on chapter 2. I've also made some direct edits to the text for more minor things and writing adjustments and issued a pull request.
Overall I feel like this chapter needs some work. I think I'm a relatively naive reader, having not read a great deal or had much training on the subject of theory. This is why I've not offered many constructive suggestions below and just highlighted problems (sorry!); I've tried to be as concrete as possible about the issues I encountered.
Our initial definition of theory seems a bit too narrow or is missing the mark slightly — our definition implies that theories are equivalent to hypotheses, whereas we later seem to suggest that hypotheses are components of / consequences of a broader explanatory theory. We also suggest theory is different to an applied hypothesis, but our definition of theory seems to fit well to our example of an applied hypothesis. [Specifically, we say "And sometimes you want to know the answer to a specific applied question, like “will giving a midterm vs. weekly quizzes lead students in a class to perform better on the final?” But more often, our goal is to create theories that help us explain and predict new observations. What is a theory? We’ll argue here that we should think of psychological theories as causal hypotheses – that is, hypotheses about the causal structure of the mind and about the causal relationships between the mind and the world."]. I don't have a strong view about how we should define these things, but wanted to highlight that the current presentation seems too narrow and partly contradictory to me.
I don't think the Shepard's universal law of generalisation case study box is working too well — firstly, the description of the study is a bit confusing to me. It seems that similarity is being manipulated (e.g., size of light circle), but we also talk about it being measured. I think we need to expand on what is meant by psychological similarity (and is that the same as psychological 'distance'? Confusing to use two terms for the same thing). We also say that the animal's response is being measured, and that's taken as an indicator of generalisation. But later, we say "the measure was an explicit similarity judgment". So are there two measures or one? I think it's quite confusing. Perhaps we just need to describe it more clearly or perhaps we need a different example — it could be that the MDS approach is in itself a complicated thing to explain, which is getting in the way of illustrating some more basic ideas about theories. Secondly, it's not clear to me what intuition we're trying to pump or what principle we are trying to illustrate. We say it's "an example of inductive theory building" — but we don't discuss inductive theory building anywhere else in the chapter. I also wonder if we can cut this entirely and just stick to the hypothetical money - happiness example that we already built up in the previous chapter. Or at least flip things, so we explain the basic concepts using money - happiness, and then later explain how these concepts manifest in the real world example of Shepard's theory.
Figure 2.1 caption, perhaps add a little more explanation (proposed change highlighted in bold) e.g., "Figure 1 from Shepard (1987). Generalization gradients for twelve different kinds of stimuli. **As the psychological distance (dissimilarity) between stimuli increases, the probability of generalisation decreases exponentially. In other words, when an animal learns a behaviour that is appropriate in a given context, it is more likely to repeat that behaviour in other similar contexts, with the likelihood decreasing with decreasing similarity between contexts.**"
Fig 2.2. "A schematic of what a theory might look like." and Fig 2.3 a "nomological network" — these look like DAGs; perhaps clarify why they are similar/different because we learned about DAGs in the previous chapter. Note also that in this chapter we are calling things like money and happiness constructs whereas in chapter 1 we called them variables.
"One way to sketch this kind of network would be to use the kind of causal graph we used above. So then a nomological network looks like a causal model" - we're now introducing yet more terms for the same/similar things - causal graph, nomological network, causal model. I suggest we make the terminology consistent and, where different terms are needed to describe similar things, be explicit about what the differences are.
"We’ve proposed that a psychological theory is a set of causal relationships among different constructs" - no we haven't! At least not in that language. Perhaps earlier we need to say: "What is a theory? We’ll argue here that we should think of psychological theories as **a set of proposed causal relationships among different constructs** – that is, hypotheses about the causal structure of the mind and about the causal relationships between the mind and the world." (emphasis added to highlight changed text)?
"Since we can’t directly observe the workings of the human mind, for many psychological problems, defining the constructs is already difficult." Perhaps use money and happiness as contrasting examples here — it's (relatively) easy to define what money is because we can directly observe it; but it's much harder to say what happiness is.
"Since constructs are not observed directly" — only psychological constructs? Or, see point above, perhaps we need to clarify that all constructs are not directly observable, but some are easier to measure because they have obvious measurable referents in the external world; whereas psych constructs do not?
The Hempel quote - "A scientific theory might… be likened to a complex spatial network... [etc]" is poetic, but I think it's more likely to add confusion than clarity.
"the arrow connecting distance and generalization" refers to figure 2.3 where we use the term "similarity", not distance
"very specific parametric form" Parametric = jargon. "Mathematical relationship" is maybe easier to understand?
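To make "mathematical relationship" concrete for readers, a small illustration might help — something like the sketch below (purely illustrative; the decay constant `k` is a placeholder I've invented, not a value from Shepard's paper):

```python
import math

def generalization(distance, k=1.0):
    # Shepard-style exponential gradient: the probability of generalizing
    # a learned response decays exponentially with psychological distance.
    # k is an illustrative free decay constant, not an empirical estimate.
    return math.exp(-k * distance)

for d in [0.0, 0.5, 1.0, 2.0]:
    print(f"distance {d:.1f} -> generalization {generalization(d):.3f}")
```

Even a toy like this shows what "a very specific parametric form" buys us: the theory predicts the whole curve, not just "more distance means less generalisation".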
Sidenote 2: "Calling the theory a “network” sounds like it’s a structural equation model (SEM) where there are circles and lines and the lines represent something akin to the correlations between the numbers in the circles. That’s one way to define a psychological theory, but it’s certainly not the only way!" Perhaps explain exactly how and why an SEM differs? For someone who doesn't know what SEM is, this just seems unhelpful at best, and perhaps adds confusion by introducing yet another term.
"follow the same form" - 'form' might be jargon and 'pattern' is easier to understand? not sure.
"It’s explanatory because it answers questions like “why is generalization so much lower for shapes that are a bit further apart in psychological space?” and “why does psychological generalization have an exponential form?” - it's not obvious to me (I'm naive here - pretty much all I've read about this theory is from our own chapter) why the theory provides an explanation rather than just a description. It describes the relationship between similarity and generalisability in a mathematical form, but that's not an explanation is it?
"Explanation is an important feature of good theories, but it’s also easy to trick yourself by using a vague theory to explain a finding post-hoc (after the fact)." But Shepard was trying to explain data after the fact wasn't he? It's unclear — from our chapter — what is wrong with that. Why was he potentially tricking himself?
"everything connects to everything" - in the figure (2.4), everything does not connect to everything (I assume we mean connected with a causal arrow).
Figure 2.4 - why are there arrows inside the mesosystem? (rather than between constructs)
"what sorts of theories are more likely to explain specific phenomena" this is a bit vague, I think an example would help. e.g., perhaps something like the ecological systems framework helps remind us that a child's behaviour is likely to be influenced by a huge range of factors, such that any individual theory cannot just focus on an individual factor and hope to provide a full explanation of a child's behaviour?
"more ecological model in social work" - ecological = jargon
"individuals’ needs are considered not only as the expression of specific psychopathology" - this seems different to our previous explanation of the framework which seemed to be about childhood development (nothing about needs or psychopathology)
"There’s a continuum between precisely specified theories and broad frameworks." A diagram might be helpful here - it could illustrate the hierarchy of frameworks, theories, and hypotheses? Perhaps we could also fit 'models' into that, though perhaps that is a different (computational) expression of a theory
"Strong theory development" - unclear what we mean by 'strong' here - something about the way theories are built? Tested? Their accuracy? Falsifiability?
I think the structure of the chapter could be tightened up a bit and made more explicit with some preempting and signposting. I've just read "Another key feature of a theory is that it makes contact with data" and it feels like there's no central thread to hold on to here, we're just shifting somewhat arbitrarily between 'stuff to do with theories'. I wonder whether a sequential structure like (1) The function of theory; (2) Developing theory; (3) Testing theory; would work better?
"Money->Happiness (M->H)" abbreviating seems unnecessary and contrary to the advice we give in our writing chapter about avoiding unnecessary abbreviations :)
"trying to make hypotheses that have increasingly broad scope" - not clear what this means - increasing relative to what?
"observations that are more consistent with a hypothesis" - remove 'more'? Otherwise, more relative to what?
"A second point that makes Popper’s falsificationism a bad match for working scientists is the observation that no individual hypothesis (a part of a theory) can be falsified independently." — I think that's a bit of a caricature of Popper's position, he did discuss auxiliary hypotheses and thought they were relevant. There's perhaps a contrast of 'early' and 'late' Popper here, I can't remember the historical details. We can perhaps cut this sentence and just say something like Lakatos highlighted an important practical difficulty of falsification...
"In our running example, if giving someone money didn’t change their happiness, maybe we wouldn’t immediately throw out our M->H theory. Instead, the fault might be in any one of our auxiliary assumptions, like our measurement of happiness, or our choice of how much money to give or when to give it." I wonder if we can illustrate this using the DAGs we set up in Chapter 1? I think we should at least say a bit more about what an auxiliary assumption is and give some concrete examples - it's a bit vague at the moment. From our description it also sounds like the problem of auxiliary assumptions means we can never falsify a theory...
two sentences on Kuhn which seem a bit contradictory/confusing: "scientific revolutions looked nothing like the falsification of a theoretical statement via an incontrovertible observation" and "But normal science is punctuated by periods of crisis when the working assumptions of the paradigm break down." What do we mean by the working assumptions breaking down? Is that not contradictory evidence appearing? We then say "Rather, there will often be a holistic transition to a new paradigm, typically because of a really striking explanatory or predictive success." which suggests there doesn't need to be anything wrong with the current paradigm per se, only that a new paradigm appears to be performing well (better?). Perhaps it's also worth pointing out that — as far as I understand it — Popper leaned more towards being prescriptive (this is what scientists should be doing) whereas Kuhn leaned more towards being descriptive (this is what scientists are doing). So even if Kuhn's description of what scientists actually do is correct, it's not necessarily what we should be aiming for. And even if Popper's ideas seem great in principle, it may not necessarily be possible to straightforwardly implement them in practice.
"– even experimental psychology –" - not sure why we say this
"we can do better for scientific hypotheses than we can for swans" - can we be more explicit here? It's not clear why we don't consider the swan thing to be a scientific hypothesis. And we say checking all swans is inefficient but it's not clear what we think a better approach would be — I think we're trying to say that well-designed experiments enable us to be more efficient? (if so I think we can be more explicit / need some rewording). Instead of presenting good experiments as a solution to that efficiency problem, we present a new problem: "Experiments are not equally good at comparing theories".
"Precision is a prerequisite for theory testing. If an experimental measurement is not precise, then it could be consistent with any number of results. Many psychology experiments are designed merely to provide directional evidence and to “reject the null hypothesis” of no difference. Directional evidence is often consistent with many different theories." - I think there are two important and separable issues bundled together in this paragraph and I'm not sure why. It seems we're trying to play on the analogy of imprecise = broad implications, but I think that's just confusing as these are conceptually distinct issues (one empirical, one theoretical).
"meaning fully a half of all possible changes in happiness would be evidence for our theory." seems unnecessary / unpersuasive to me - the problem of a vague hypothesis is already clear?
terminology consistency "risky tests" vs "risky predictions"
"Theories should make risky predictions" in this section I think it's clear why a risky prediction is more impressive, but it's less clear how we're supposed to achieve this in practice - isn't our ability to make risky predictions heavily constrained by our state of knowledge? e.g., in the money - happiness case, maybe the directional prediction is the best we can do? How do we get to the point where we can make a riskier prediction? (presumably it doesn't make sense to just do this arbitrarily - risky for the sake of being risky - there needs to be some empirical grounding?). Perhaps a little sidenote might help - "What comes first, the theory or the data?" - to illustrate that this is an iterative process?
"As one writer noted, mathematics is “unreasonably effective” as a vocabulary for the sciences (Wigner, 1990)" — this quote isn't giving me much insight (could just be me!)
"In fact, this idea has a long history in statistics (Lindley, 1956)" - and goes even further back to Chamberlin's (1890) idea of 'multiple working hypotheses'
"2.6 Models and theories" - can we provide a definition of model?
"There is no one framework that will be right for theorizing across all areas of psychology" - avoid using framework as we're using elsewhere in the chapter in a different way? Use "modelling approach" instead?
"An alternative approach creates statistical models of data that incorporate substantive assumptions about the structure of the data. We use such models all the time for data analysis. The trouble is, we often don’t interpret them as having substantive assumptions about the structure of the data, even when they do (Fried, 2020)! For example, the choice of a linear regression model for data analysis implies a number of assumptions about how measurements are distributed and how they relate to your manipulation. Structural equation models are another example of this approach, where linear regression – in combination with some assumptions about which measurement is related to what – is used to infer the strength of relationships." - I found this quite difficult to follow / extract insight from - unclear what we mean by e.g., "substantive assumptions about structure" or "assumptions about how measurements are distributed"
"If we were drawing our theory" - add cross reference to the DAG in chapter 1?
"Linear models are ubiquitous in the social sciences because they are convenient to fit" - can we explain "convenient to fit" in simpler language?
"as theoretical models they are deeply impoverished" - can we say that in less vague language? e.g., they are too simple to capture the complexity of real-world relationships between constructs, which are often non-linear?
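A toy example might drive this home for readers. The sketch below (my own illustration, not from the chapter — the exponential data are hypothetical) fits an ordinary least-squares line to data that actually follow an exponential decay, and shows one concrete way the linear model fails:

```python
import math

# Hypothetical data that follow an exponential decay (like a
# generalization gradient), which a straight line cannot capture.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
ys = [math.exp(-x) for x in xs]

# Ordinary least-squares fit of y = a + b*x, computed by hand.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# The fitted line predicts a negative 'probability' at distance 3 --
# one concrete sense in which a linear theory is too simple here.
print(a + b * 3.0)  # prints a negative value
```

Something this small in a sidenote could replace the abstract claim "deeply impoverished" with a visible failure mode.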
"combinations of factors" - align consistency of terminology, see previous use of construct / variable. Otherwise needs a definition I think?
"used to estimate parameters" - don't think we've defined parameters
"Computational or formal artifacts" - what are these?
"Computational or formal artifacts are not themselves psychological theories, but all of them can be used to create psychological theories via the mapping of constructs onto entities in the model and the use of the principles of the formalism to instantiate psychological hypotheses or assumptions" - long sentence and a lot of jargon!
"we characterized psychological theories as a set of causal relationships between latent constructs" - this is the final paragraph and that's our first use of latent (except in fig 2.2, but there's no explanation in the main text) - needs introducing earlier (or cut)?
"presented, tested, confirmed, and falsified" - change to "developed then tested, and confirmed or falsified" ?
"Most modern psychological theories are more like a combination of core principles, auxiliary assumptions, and supporting empirical assumptions" - can we align this better with the language we've used in the chapter? It's not clear what a core principle is (in a technical sense) or an empirical vs auxiliary assumption.
"A productive “research program” is one where the core principles are being adjusted to describe a larger base of observations (Lakatos, 1976). This idea sounds right to us. The best theories are always being enlarged and refined in response to new data." - though perhaps we want to distinguish here from what Lakatos called a 'degenerating' research program, where you are constantly making ad-hoc tweaks to the theory in order to accommodate data. A 'progressive' program, on the other hand, will maintain its explanatory power over existing phenomena, and new additions to the theory must make empirically testable predictions.
in "discussion questions": "Are the links between constructs well-specified?" - I'm not sure we have anything in the chapter about what this would look like.