brad-cannell / r4epi

Repository for the R for Epidemiology book
http://www.r4epi.com/

Review and improve the measures of association chapter #107

Open mbcann01 opened 10 months ago

mbcann01 commented 10 months ago

Overview

In the Fall of 2023, I moved over a bunch of stuff from PowerPoint slides (nearly) verbatim. I was in a rush, so I told myself to just move it over and improve it later.

Go back, reread, and improve. PowerPoint doesn't always translate perfectly to book format.

Left off

2023-09-25

Tasks

mbcann01 commented 9 months ago

Terms to consider adding

mbcann01 commented 9 months ago

Rothman's investment analogy for absolute vs. relative differences

•Difference measures such as RD and IRD measure the absolute effect of an exposure. It is also possible to measure the relative effect. As an analogy, consider how to assess the performance of an investment over a period of time. Suppose that an initial investment of $100 became $120 after 1 year. The difference in the value of the investment at the end of the year and the value at the beginning, $20, measures the absolute performance of the investment. The relative performance is obtained by dividing the absolute increase by the initial amount, which gives $20/$100, or 20%. Contrast this investment experience with that of another investment, in which an initial sum of $1000 grew to $1150 after 1 year. For the latter investment, the absolute increment is $150, far greater than the $20 from the first investment, but the relative performance of the second investment is $150/$1000, or 15%, which is worse than the first investment.

•Rothman, Kenneth J. Epidemiology: An Introduction (p. 59). Oxford University Press. Kindle Edition.
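
If this analogy goes into the chapter, a short R chunk could make the arithmetic concrete. The following is only a rough sketch: the dollar amounts come from the analogy above, but the function names, the "A"/"B" labels, and the data frame layout are made up for illustration.

```r
# Hypothetical helper functions (names are assumptions, not from the book).
absolute_change <- function(start, end) end - start
relative_change <- function(start, end) (end - start) / start

# The two investments described in the analogy.
investments <- data.frame(
  investment = c("A", "B"),
  start      = c(100, 1000),
  end        = c(120, 1150)
)

# Absolute performance: the difference in dollars.
investments$absolute <- absolute_change(investments$start, investments$end)

# Relative performance: the difference divided by the starting amount.
investments$relative <- relative_change(investments$start, investments$end)

investments
```

Investment B has the larger absolute gain (150 vs. 20), while investment A has the larger relative gain (0.20 vs. 0.15), which is the contrast the analogy is meant to draw.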

mbcann01 commented 9 months ago

Risk difference

Several somewhat technical points about the risk difference measure should be called out here. First, the range of the risk difference is from −1 to +1, inclusive (which we express as [−1, 1]). This is because the highest possible risk is 1, while the lowest is 0; if, as in Table 2.2, the risk in the exposed is higher than in the unexposed, then the risk difference cannot be higher than 1 − 0 = 1. Similarly, when the risk in the exposed is lower than in the unexposed, then the risk difference cannot be lower than 0 − 1 = −1. A negative risk difference would occur if the exposure was protective; for instance, if we were considering the association of daily aspirin use with risk of heart attack. Whether the exposure is associated with increased or decreased risk, the risk difference is considered relative to the null value. Again, this is the value which reflects no differences between the two groups being compared. No differences here would mean that the risk in exposed participants and the risk in unexposed participants are the same value P. Therefore, the null for the risk difference is P − P = 0.
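
A small R sketch might help readers see the bounds and the null value. This is an illustration under assumptions: `risk_difference()` is a hypothetical helper, not a function from any package, and the risks passed to it are invented numbers rather than the values in Table 2.2.

```r
# Hypothetical helper: risk in the exposed minus risk in the unexposed.
risk_difference <- function(risk_exposed, risk_unexposed) {
  # Risks are proportions, so both must lie in [0, 1].
  stopifnot(
    risk_exposed >= 0, risk_exposed <= 1,
    risk_unexposed >= 0, risk_unexposed <= 1
  )
  risk_exposed - risk_unexposed
}

rd_upper      <- risk_difference(1, 0)        # largest possible value: 1 - 0
rd_lower      <- risk_difference(0, 1)        # smallest possible value: 0 - 1
rd_null       <- risk_difference(0.3, 0.3)    # same risk in both groups: P - P
rd_protective <- risk_difference(0.05, 0.10)  # lower risk in the exposed gives a negative value

c(upper = rd_upper, lower = rd_lower, null = rd_null, protective = rd_protective)
```

Because both inputs are restricted to [0, 1], no pair of risks can produce a difference outside [−1, 1].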

mbcann01 commented 9 months ago

Terminology recap

prob_def <- "If some process is repeated a large number of times, $n$, and if some resulting event with the characteristic $Y$ occurs $m$ times, the relative frequency of occurrence of $Y$, $\\frac{m}{n}$, will be approximately equal to the probability of $Y$."
conditional_prob_def <- "The probability that some event occurs given that we know that some other event has already occurred."
| Our Term | Definition | Equation |
|---|---|---|
| Probability | `r prob_def` @Daniel2013-qq | $P(Y) = \frac{m}{n}$ |
| Conditional probability | `r conditional_prob_def` | $P(Y \mid X) = \frac{P(Y \cap X)}{P(X)}$ |
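
A worked example might be worth adding next to the conditional probability definition. The sketch below uses invented counts (200 people, 80 with characteristic X, 20 with both X and Y); none of these numbers or variable names come from the chapter.

```r
# Hypothetical counts, made up purely for illustration.
n_total   <- 200  # everyone observed
n_x       <- 80   # people with characteristic X
n_y_and_x <- 20   # people with both Y and X

# Probabilities as relative frequencies (m / n).
p_x       <- n_x / n_total
p_y_and_x <- n_y_and_x / n_total

# Conditional probability: P(Y and X) divided by P(X).
p_y_given_x <- p_y_and_x / p_x

# Equivalent shortcut: restrict attention to the people with X.
p_y_given_x_direct <- n_y_and_x / n_x

c(p_y_given_x = p_y_given_x, p_y_given_x_direct = p_y_given_x_direct)
```

Both routes give 0.25, which shows that conditioning on X amounts to treating the 80 people with X as the new denominator.
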
mbcann01 commented 9 months ago

Predictions

I took this material out in Fall 2023. I may want to add it back in at some point.

Predictions, especially good ones, can obviously be useful on their own. We may know that people of a certain race/ethnicity are most likely to get a particular form of cancer. Knowing that may allow us to concentrate screening efforts more effectively. We may know that older adults who begin to have trouble managing their finances are more likely to develop dementia. We may be able to use that information as an early indicator of important health problems to come.

However, in epidemiology, we are very often not content with predictions alone. It is extremely common for our questions and studies to either directly ask causal questions or imply causal relationships between variables. The reason we are often more interested in causal associations than mere predictions can be found directly in our definition of epidemiology. We want to control health problems. Said another way, we want to know why “bad” things happen so that we can stop them from happening and/or why “good” things happen so that we can make them happen more often.

This idea is simultaneously so straightforward and so complex, as we will see throughout the semester.

Notice that in the cases above these predictions may be perfectly valid, but do they get us any closer to our ultimate goal of “controlling health problems”? We can’t change anyone’s race or ethnicity, can we? Even if we could, I’m hard-pressed to think of an example of a health outcome that is caused directly by a person’s race or ethnicity. Race and ethnicity are just a proxy for the true unmeasured cause. Likewise, do you really believe that if we hired an accountant to help an older person manage their finances, that person would not go on to develop dementia? Of course not.

mbcann01 commented 9 months ago

Relative vs absolute difference example

Example