LOST-STATS / lost-stats.github.io

Source code for the Library of Statistical Techniques
https://lost-stats.github.io/
GNU General Public License v2.0

Fix marginal effects plot (with categorical interactions) page #178

Open grantmcdermott opened 2 years ago

grantmcdermott commented 2 years ago

The fixest code is all broken here: https://lost-stats.github.io/Presentation/Figures/marginal_effects_plots_for_interactions_with_categorical_variables.html

(Likely due to the changes to the argument order of i() introduced around fixest version 0.9.0.)

The solution is something like:

library(fixest)

od = read.csv('https://github.com/LOST-STATS/lost-stats.github.io/raw/source/Presentation/Figures/Data/Marginal_Effects_Plots_For_Interactions_With_Categorical_Variables/organ_donation.csv')

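# Build a Date (first day of each quarter) from the Quarter string,
# and flag the treated state
od = 
  within(od, {
    Date = as.Date(paste(substr(Quarter, 3, 7), 
                         as.integer(substr(Quarter, 2, 2))*3-2, 
                         1, 
                         sep = "-"))
    Treated = State == 'California'
  })

# i() now takes the factor variable first, then the variable it is interacted with
fmod = feols(Rate ~ i(Date, Treated, ref = "2011-04-01") | State + Date, 
             data = od)

coefplot(fmod) # all coefficients
iplot(fmod)    # only the i() interaction terms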
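To make the date-cleaning step above less opaque, here is a minimal base-R sketch of what the conversion does. The three example strings are hypothetical and assume the Quarter column is formatted like "Q42010" (quarter digit, then four-digit year); check the actual CSV before relying on this.

```r
# Hypothetical Quarter strings, assuming a "Q<quarter><year>" layout
quarter <- c("Q42010", "Q12011", "Q22011")

# Characters 3-6 hold the year; character 2 holds the quarter number
year        <- substr(quarter, 3, 6)
first_month <- as.integer(substr(quarter, 2, 2)) * 3 - 2  # Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10

# Paste into "year-month-1" and parse as the first day of the quarter
dates <- as.Date(paste(year, first_month, 1, sep = "-"))
print(dates)
```

Under that assumption, Q2 of 2011 maps to 2011-04-01, which is the reference period passed to `ref` in the feols() call.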

But stepping back, I actually think we should change the dataset for this page. It requires several tedious data cleaning steps across the different languages and ultimately ends up producing an event study plot because of the time dimension (which is confusing in and of itself). Aren't we just looking for something like

mod = lm(mpg ~ factor(vs) * factor(am), mtcars)
summary(mod)
marginaleffects::plot_cap(mod, condition = "am")

? (And obvs the equivalent in other languages)

NickCH-K commented 2 years ago

Yep, a switch to something that could make use of plot_cap() would make sense to me. Most of the tedium in other languages is due to a lack of something like plot_cap() though.

grantmcdermott commented 2 years ago

For Stata it would be marginsplot though, right? (Or am I missing something?)

I’ll try to update the page later today.

NickCH-K commented 2 years ago

I believe I tried to use marginsplot when I wrote the page, but it didn't play nice with this graph format.

Also, I realized the indexing was broken for this page: it was set to Figure, not Figures. Fixed that.

grantmcdermott commented 2 years ago

Okay, having thought about it a bit, I reckon we just want something like the following:

library(marginaleffects)

# Categorical * categorical
mod = lm(mpg ~ factor(vs) * factor(am), mtcars)
plot_cme(mod, effect = "vs", condition = "am")

# Aside: this is the plot version of...
marginaleffects(mod, 
                variables = "vs",
                newdata = datagrid(am = 0:1))
#>   rowid     type term contrast     dydx std.error statistic      p.value
#> 1     1 response   vs    1 - 0 5.692857  1.651125  3.447866 5.650336e-04
#> 2     2 response   vs    1 - 0 8.621429  1.931478  4.463643 8.057761e-06
#>   conf.low conf.high vs am
#> 1 2.456712  8.929002  0  0
#> 2 4.835802 12.407056  0  1

# Continuous * categorical
mod2 = lm(mpg ~ wt * factor(am), mtcars)
plot_cme(mod2, effect = "wt", condition = "am")

plot_cme(mod2, effect = "am", condition = "wt")

Created on 2022-06-15 by the reprex package (v2.0.1)
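For anyone porting this to a language without a plot_cme() equivalent, the plotted quantities can be rebuilt by hand from the regression output. The following is a rough base-R sketch, not what marginaleffects does internally: it sums the relevant coefficients and takes the standard error from the variance of that sum. The object names (`effect`, `se`, `slope_wt`) are just illustrative.

```r
# Categorical * categorical: effect of vs (1 vs 0) at each level of am
mod <- lm(mpg ~ factor(vs) * factor(am), data = mtcars)
b <- coef(mod)
V <- vcov(mod)

main <- "factor(vs)1"               # effect of vs when am = 0
int  <- "factor(vs)1:factor(am)1"   # extra effect of vs when am = 1

effect <- c(am0 = b[[main]],
            am1 = b[[main]] + b[[int]])

# Var(a + b) = Var(a) + Var(b) + 2 * Cov(a, b)
se <- c(am0 = sqrt(V[main, main]),
        am1 = sqrt(V[main, main] + V[int, int] + 2 * V[main, int]))

print(cbind(effect, se,
            conf.low  = effect - qnorm(0.975) * se,
            conf.high = effect + qnorm(0.975) * se))

# Continuous * categorical: slope of wt at each level of am
mod2 <- lm(mpg ~ wt * factor(am), data = mtcars)
b2 <- coef(mod2)
slope_wt <- c(am0 = b2[["wt"]],
              am1 = b2[["wt"]] + b2[["wt:factor(am)1"]])
print(slope_wt)
```

The first table should line up with the dydx, std.error, conf.low and conf.high columns in the marginaleffects() output above, so the same arithmetic is a reasonable fallback for the other language examples.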

Agree? If so, I'll update the page with examples for the different languages. (Note to self: potentially useful Julia refs 1, 2, 3, since StatsPlots doesn't appear to support this.)

NickCH-K commented 2 years ago

Yes, I think that makes sense!