Dummy Variable Coefficients

lecy commented 2 years ago

We are primarily using discrete estimators this semester. What is a discrete estimator, you might ask? That just means your independent variable would be a group (treatment vs control), not a level of treatment (mg of caffeine).

The confusing thing is that the two regression models look the same. For example, in the first case study participants are randomly assigned a level of caffeine from 0 to 400mg. In the second case we have a treatment group that receives 200mg of caffeine and a control group that receives 0mg.

We are interested in whether caffeine increases heart rate. Our two models would look like this:

(1) heart.rate = b0 + b1*caffeine + e
(2) heart.rate = b0 + b1*caffeine_dummy + e

In both cases b0 represents the expected heart rate of participants who receive 0 mg of caffeine. In the second case that is the average heart rate of the control group.

In model (1) b1 will represent the slope: how much heart rate (HR) increases for each additional mg of caffeine.

In model (2) b1 will represent the group difference in average heart rate between the control and treatment groups.

Specifically, in model 2 we arrive at group means as follows:

b0 = ave heart rate of control group 
b0 + b1 = ave heart rate of treatment group

Then if we want to calculate the effect it would be:

effect = T2 - C2 = (b0+b1) - b0
effect = b1 + (b0-b0)
effect = b1

So the coefficient b1 captures the group difference when our independent variable is discrete (a dummy variable).

Note that in the dummy variable case b1 equals the difference in group means, and the significance test on b1 is mathematically equivalent to a two-sample t-test of that difference (the version that assumes equal variances in the two groups). When using discrete study groups and models with dummy variables, the regressions will give us differences in group means.
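As a quick check of the points above, here is a minimal Python sketch. The heart rates are made up for illustration, and the helper `ols_simple` is a hypothetical name, not something from the course materials. It fits model (2) by ordinary least squares, compares b0 and b1 with the raw group means, and then compares the t statistic for b1 with a pooled-variance two-sample t statistic.

```python
from math import sqrt
from statistics import mean, variance


def ols_simple(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical heart rates in bpm (invented for this sketch, not real data).
control = [88, 92, 85, 95, 90]          # caffeine_dummy = 0
treatment = [108, 112, 105, 115, 110]   # caffeine_dummy = 1

caffeine_dummy = [0] * len(control) + [1] * len(treatment)
heart_rate = control + treatment

b0, b1 = ols_simple(caffeine_dummy, heart_rate)
print("b0 =", b0, "| control mean =", mean(control))
print("b0 + b1 =", b0 + b1, "| treatment mean =", mean(treatment))
print("b1 =", b1, "| difference in means =", mean(treatment) - mean(control))

# t statistic for b1 from the regression: b1 / standard error of b1.
n = len(heart_rate)
residuals = [y - (b0 + b1 * x) for x, y in zip(caffeine_dummy, heart_rate)]
sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
x_bar = mean(caffeine_dummy)
sxx = sum((x - x_bar) ** 2 for x in caffeine_dummy)
t_regression = b1 / sqrt(sigma2 / sxx)

# Two-sample t statistic using the pooled (equal-variance) estimate.
n_c, n_t = len(control), len(treatment)
pooled_var = ((n_c - 1) * variance(control)
              + (n_t - 1) * variance(treatment)) / (n_c + n_t - 2)
t_pooled = (mean(treatment) - mean(control)) / sqrt(pooled_var * (1 / n_c + 1 / n_t))

print("t from regression =", t_regression, "| pooled t =", t_pooled)
```

The two t statistics agree because the regression residual variance is the same quantity as the pooled within-group variance when the only predictor is a dummy variable.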

For a helpful review of some of these concepts see:

https://github.com/Watts-College/cpp-524-fall-2021/raw/main//pubs/hypotheses-tests-with-dummy-variables.pdf

https://ds4ps.org/cpp-523-spr-2020/lectures/dummy-variables.html

https://github.com/Watts-College/cpp-523-fall-2021/issues/12

lecy commented 2 years ago

If b1 is a measure of effect in the second model, what would be our effect size in the first model where b1 represents the slope that captures the relationship between the level of caffeine and heart rate?

Discrete case:

effect = T2 - C2 = (b0+b1) - b0
effect = b1 + (b0-b0)
effect = b1

Levels case:

(1) heart.rate = b0 + b1*caffeine + e

droach7 commented 2 years ago

I believe the effect size of model 1 would vary with treatment dosage and be represented by b1(caffeine dosage).

Example: (1) heart.rate = b0 + b1*caffeine + e

Say b0 = 90, b1 = 0.1, and the treatment group receives 200mg of caffeine

In that treatment group, the average heart rate would now be: 90 + 0.1(200) = 110. The effect size would be b1(dosage) = 0.1(200) = 20. Thus, the 200mg group could expect a treatment impact of an additional 20 bpm added to the control group's baseline heart rate of 90 bpm.

Similarly, the 400mg treatment group would have an average heart rate of 130bpm, and an effect size of 0.1(400) = 40.

I believe this model's reliability would be dependent on whether the control group's heart rates are equivalent to the treatment group's heart rates independent of treatment (aka in the absence of caffeine)? If the treatment group's average heart rate without caffeine was 10 bpm above the control group's (100 instead of 90 bpm), then the impact of caffeine would be overestimated.
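To make the baseline concern concrete, here is a small Python sketch of a two-group comparison. All numbers are hypothetical and reuse the values from the example above (b1 = 0.1, 200mg dose); the variable names are assumptions for illustration. It shows how a 10 bpm gap in caffeine-free heart rates gets added to the estimated effect.

```python
from statistics import mean

# Hypothetical values from the example above: 0.1 bpm per mg, 200 mg dose.
b1 = 0.1
dose_mg = 200
true_effect = b1 * dose_mg  # 20 bpm added by the caffeine itself

# Assumed caffeine-free heart rates (bpm). The treatment group starts
# 10 bpm higher than the control group before receiving any caffeine.
control_baseline = [90, 90, 90, 90]
treatment_baseline = [100, 100, 100, 100]

# What we actually observe after dosing the treatment group.
control_observed = control_baseline
treatment_observed = [hr + true_effect for hr in treatment_baseline]

# A naive group comparison mixes the true effect with the baseline gap.
estimated_effect = mean(treatment_observed) - mean(control_observed)
bias = estimated_effect - true_effect

print("true effect:", true_effect)
print("estimated effect:", estimated_effect)
print("bias from baseline gap:", bias)
```

Random assignment is what makes the baseline gap zero on average, so the simple comparison is only trustworthy when the groups are equivalent before treatment.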

ebossert commented 2 years ago

@droach7 so are you saying the effect size would still just be b1? I reviewed my notes from CPP 523 and that's what I had written: "the expected value of the program is represented by the point estimate of the slope (b1)." I thought that since it is the slope that the total effect would be a b1 change for every one-unit change in x.

droach7 commented 2 years ago

"I thought that since it is the slope that the total effect would be a b1 change for every one-unit change in x."

I am saying that effect size is the "b1 change for every one-unit change in x". In my example, if x = caffeine dosage at 200 or 400 mg, the effect size will vary depending on the dosage (from 20 to 40 bpm). So the "total effect" or effect size would be a b1 change for every one-unit change in caffeine dosage (b1 multiplied by caffeine dosage amount; e.g. 0.1 x 200 = 20).

Does that make sense? Sorry if I wrote it in a confusing way.

ebossert commented 2 years ago

Yes that makes sense. So in total the effect size would be b1*caffeine, whereas for the dummy variable example it would just be b1?

droach7 commented 2 years ago

Yes, I believe so. That is how I interpreted it at least.

lecy commented 2 years ago

The key here is that regressions create slopes that correspond to a one-unit change in X.

However, one unit of X is rarely a meaningful or realistic intervention or dose. One mg of caffeine is a tiny amount.

The effect attempts to capture the change in Y associated with a typical or reasonable change in X instead of a one-unit change.

In observational studies the effect calculation often uses a standard deviation or interquartile range to determine a typical or reasonable change in X:

effect = b1 * sd(x)
effect = b1 * ( q75_x - q25_x )
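A rough Python sketch of these two calculations is below. The caffeine intake values are invented to stand in for an observational sample, and b1 = 0.1 is borrowed from the example earlier in the thread. Quartile conventions differ between tools; this uses the default ("exclusive") method of Python's `statistics.quantiles`, so other software may report slightly different quartiles for the same data.

```python
from statistics import quantiles, stdev

b1 = 0.1  # slope in bpm per mg, borrowed from the example in this thread

# Hypothetical daily caffeine intake in mg (made up for illustration).
caffeine = [0, 40, 80, 95, 120, 150, 190, 200, 260, 320, 400]

# Typical change in X measured as one standard deviation.
sd_x = stdev(caffeine)
effect_sd = b1 * sd_x

# Typical change in X measured as the interquartile range.
q25_x, _median_x, q75_x = quantiles(caffeine, n=4)
effect_iqr = b1 * (q75_x - q25_x)

print("sd(x) =", round(sd_x, 1), "-> effect =", round(effect_sd, 1), "bpm")
print("IQR(x) =", q75_x - q25_x, "-> effect =", round(effect_iqr, 1), "bpm")
```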

In an experiment like this the standard deviation and interquartile range are meaningless because we assigned the levels of caffeine ourselves in the experiment.

It would be better to choose a reasonable X, for example the caffeine in the typical serving size of coffee.

To correspond with the discrete treatment group, which uses a dosage of 200mg (approximately the amount of caffeine in a large cup of coffee) we would use 200mg.

effect = b1 * (200mg)

Or using the numbers suggested by @droach7.

# control group
HR = b0 + b1 * (0mg)
HR = 90 + (0.1)*(0) = 90

# treatment group
HR = b0 + b1 * (200mg)
HR = 90 + (0.1)*(200) = 110 

# effect
b1 * (200mg) = 20
T2 - C2 = 110 - 90 = 20

We would expect the discrete model and the levels model using x=200 to arrive at similar heart rates. In practice it would depend on how well the levels model fits the data and if there are outliers in either model.
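Here is a small simulation sketch in Python of that comparison. Everything in it is an assumption for illustration: the true values (b0 = 90, b1 = 0.1), the noise level, the sample sizes, and the helper name `ols_simple`. It generates one levels experiment and one discrete experiment from the same underlying relationship, then compares b1*200 from the levels model with b1 from the dummy model.

```python
import random
from statistics import mean


def ols_simple(x, y):
    """Fit y = b0 + b1*x by ordinary least squares and return (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Assumed "true" relationship and noise, chosen only for this illustration.
TRUE_B0, TRUE_B1, NOISE_SD = 90.0, 0.1, 5.0


def simulate_hr(dose_mg):
    """Simulated heart rate (bpm) for one participant at a given dose."""
    return TRUE_B0 + TRUE_B1 * dose_mg + rng.gauss(0, NOISE_SD)


# Levels experiment: each participant gets a random dose from 0 to 400 mg.
doses = [rng.uniform(0, 400) for _ in range(200)]
hr_levels = [simulate_hr(d) for d in doses]
_, b1_levels = ols_simple(doses, hr_levels)
effect_levels = b1_levels * 200  # slope scaled to a 200 mg dose

# Discrete experiment: control gets 0 mg, treatment gets 200 mg.
dummy = [0] * 100 + [1] * 100
hr_discrete = [simulate_hr(200 * d) for d in dummy]
_, b1_dummy = ols_simple(dummy, hr_discrete)
effect_dummy = b1_dummy  # the dummy coefficient is already the effect

print("levels model, b1 * 200:", round(effect_levels, 2))
print("dummy model, b1:", round(effect_dummy, 2))
```

Both estimates should land near 20 bpm here because the simulated relationship is exactly linear. With real data the two could diverge if the dose-response curve is not linear.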

It is much more meaningful to report effects than raw slopes! The slope is not very meaningful unless you know the underlying variance (or standard deviation) of X.

lecy commented 2 years ago

"Yes that makes sense. So in total the effect size would be b1*caffeine, whereas for the dummy variable example it would just be b1?"

(1) heart.rate = b0 + b1*caffeine + e
(2) heart.rate = b0 + b1*caffeine_dummy + e

In model (1) b1 is the slope, in model (2) b1 is the effect of the discrete treatment of 200mg. So if you were comparing models then you would use b1*200 in the first case and just b1 in the second case since everyone in the treatment group received a dosage of 200mg.

When trying to compare studies pay attention to the dosage. The treatment dosage in each study might vary widely!

Also when you see a b1 coefficient in the model pay attention to whether it is a slope or an intercept adjustment (the dummy variable version).

Watts-College / cpp-524-fall-2021

Dummy Variable Coefficients #22