jmboehm / RegressionTables.jl

Journal-style regression tables
MIT License

RegressionTables.jl

This package provides publication-quality regression tables for use with FixedEffectModels.jl, GLM.jl, GLFixedEffectModels.jl and MixedModels.jl, as well as any package that implements the RegressionModel abstraction.

In its objective it is similar to (and heavily inspired by) the Stata command esttab and the R package stargazer.

Installation

To install the package, enter the following at the Julia REPL (the leading ] switches to package mode):

] add RegressionTables

A brief demonstration

using RegressionTables, DataFrames, FixedEffectModels, RDatasets, GLM

df = dataset("datasets", "iris")

rr1 = reg(df, @formula(SepalLength ~ SepalWidth + fe(Species)))
rr2 = reg(df, @formula(SepalLength ~ SepalWidth + PetalLength + fe(Species)))
rr3 = reg(df, @formula(SepalLength ~ SepalWidth * PetalLength + PetalWidth + fe(Species)))
rr4 = reg(df, @formula(SepalWidth ~ SepalLength + PetalLength + PetalWidth + fe(Species)))
rr5 = glm(@formula(SepalWidth < 2.9 ~ PetalLength + PetalWidth + Species), df, Binomial())

regtable(
    rr1,rr2,rr3,rr4,rr5;
    render = AsciiTable(),
    labels = Dict(
        "versicolor" => "Versicolor",
        "virginica" => "Virginica",
        "PetalLength" => "Petal Length",
    ),
    regression_statistics = [
        Nobs => "Obs.",
        R2,
        R2Within,
        PseudoR2 => "Pseudo-R2",
    ],
    extralines = [
        ["Main Coefficient", "SepalWidth", "SepalWidth", "Petal Length", "Petal Length", "Intercept"],
        DataRow(["Coef Diff", 0.372 => 2:3, 1.235 => 3:4, ""], align="lccr")
    ],
    order = [r"Int", r" & ", r": "]
)

yields

----------------------------------------------------------------------------------------------------
                                          SepalLength                 SepalWidth    SepalWidth < 2.9
                            --------------------------------------   ------------   ----------------
                                   (1)          (2)            (3)            (4)                (5)
----------------------------------------------------------------------------------------------------
(Intercept)                                                                                   -1.917
                                                                                             (1.242)
SepalWidth & Petal Length                                   -0.070
                                                           (0.041)
Species: Versicolor                                                                        10.441***
                                                                                             (1.957)
Species: Virginica                                                                         13.230***
                                                                                             (2.636)
SepalWidth                    0.804***     0.432***       0.719***
                               (0.106)      (0.081)        (0.155)
Petal Length                               0.776***       1.047***        -0.188*             -0.773
                                            (0.064)        (0.143)        (0.083)            (0.554)
PetalWidth                                                  -0.259       0.626***           -3.782**
                                                           (0.154)        (0.123)            (1.256)
SepalLength                                                              0.378***
                                                                          (0.066)
----------------------------------------------------------------------------------------------------
Species Fixed Effects              Yes          Yes            Yes            Yes
----------------------------------------------------------------------------------------------------
Estimator                          OLS          OLS            OLS            OLS           Binomial
----------------------------------------------------------------------------------------------------
Obs.                               150          150            150            150                150
R2                               0.726        0.863          0.870          0.635
Within-R2                        0.281        0.642          0.659          0.391
Pseudo-R2                        0.527        0.811          0.831          0.862              0.347
Main Coefficient            SepalWidth   SepalWidth   Petal Length   Petal Length          Intercept
Coef Diff                            0.372                      1.235
----------------------------------------------------------------------------------------------------

LaTeX output can be generated by using

regtable(rr1,rr2,rr3,rr4; render = LatexTable())

which yields

\begin{tabular}{lrrrr}
\toprule
                                & \multicolumn{3}{c}{SepalLength} & \multicolumn{1}{c}{SepalWidth} \\ 
\cmidrule(lr){2-4} \cmidrule(lr){5-5} 
                                &      (1) &      (2) &       (3) &                            (4) \\ 
\midrule
SepalWidth                      & 0.804*** & 0.432*** &  0.719*** &                                \\ 
                                &  (0.106) &  (0.081) &   (0.155) &                                \\ 
PetalLength                     &          & 0.776*** &  1.047*** &                        -0.188* \\ 
                                &          &  (0.064) &   (0.143) &                        (0.083) \\ 
PetalWidth                      &          &          &    -0.259 &                       0.626*** \\ 
                                &          &          &   (0.154) &                        (0.123) \\ 
SepalWidth $\times$ PetalLength &          &          &    -0.070 &                                \\ 
                                &          &          &   (0.041) &                                \\ 
SepalLength                     &          &          &           &                       0.378*** \\ 
                                &          &          &           &                        (0.066) \\ 
\midrule
Species Fixed Effects           &      Yes &      Yes &       Yes &                            Yes \\ 
\midrule
$N$                             &      150 &      150 &       150 &                            150 \\ 
$R^2$                           &    0.726 &    0.863 &     0.870 &                          0.635 \\ 
Within-$R^2$                    &    0.281 &    0.642 &     0.659 &                          0.391 \\ 
\bottomrule
\end{tabular}

Similarly, HTML tables can be created with HtmlTable().
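
As a minimal sketch of the HTML case, the snippet below fits a single model on a small made-up data set (the toy DataFrame and its column names are assumptions for illustration, not part of the package) and renders it with HtmlTable(). The table is converted to a string with print, so it can be embedded in a web page or report.

```julia
using RegressionTables, DataFrames, GLM

# Hypothetical toy data, used only to have something to fit
toy = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
m = lm(@formula(y ~ x), toy)

# Same call as for ASCII or LaTeX output; only the render type changes
html_tab = regtable(m; render = HtmlTable())

# Convert the returned table object to a string containing the HTML markup
html_str = sprint(print, html_tab)
```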

Send the output to a text file by passing the destination file as a keyword argument:

regtable(rr1,rr2,rr3,rr4; render = LatexTable(), file="myoutputfile.tex")

Then use \input in LaTeX to include that file in your document. Be sure to load the booktabs package:

\documentclass{article}
\usepackage{booktabs}

\begin{document}

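\begin{table}
\input{myoutputfile}
\caption{Regression results} % placeholder caption
\label{tab:mytable}          % \label must follow \caption to refer to the table
\end{table}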

\end{document}

regtable() can also print TableRegressionModels from GLM.jl (and output from other packages that produce TableRegressionModels):

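using GLM, CategoricalArrays  # categorical() is provided by CategoricalArrays.jl

dobson = DataFrame(Counts = [18.,17,15,20,10,20,25,13,12],
    Outcome = categorical(repeat(["A", "B", "C"], outer = 3)),
    Treatment = categorical(repeat(["a","b", "c"], inner = 3)))
# rr1 and lm1 are the same OLS fit; rr1 replaces the fixed-effects model defined earlier
rr1 = fit(LinearModel, @formula(SepalLength ~ SepalWidth), df)
lm1 = fit(LinearModel, @formula(SepalLength ~ SepalWidth), df)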
gm1 = fit(GeneralizedLinearModel, @formula(Counts ~ 1 + Outcome + Treatment), dobson,
                  Poisson())

regtable(rr1,lm1,gm1)

yields

---------------------------------------------
                   SepalLength        Counts 
               -------------------   --------
                    (1)        (2)        (3)
---------------------------------------------
(Intercept)    6.526***   6.526***   3.045***
                (0.479)    (0.479)    (0.171)
SepalWidth       -0.223     -0.223           
                (0.155)    (0.155)           
Outcome: B                             -0.454
                                      (0.202)
Outcome: C                             -0.293
                                      (0.193)
Treatment: b                            0.000
                                      (0.200)
Treatment: c                           -0.000
                                      (0.200)
---------------------------------------------
Estimator           OLS        OLS    Poisson
---------------------------------------------
N                   150        150          9
R2                0.014      0.014           
Pseudo R2         0.006      0.006      0.104
---------------------------------------------

Printing of StatsBase.RegressionModels (e.g., from MixedModels.jl and GLFixedEffectModels.jl) generally works but is less well tested; please file an issue if you encounter problems printing them.

Function Reference

Arguments

Details

A typical use is to pass several regression results (for example, from FixedEffectModels.jl) to the function and to choose the output format with the render argument:

regtable(regressionResult1, regressionResult2; render = AsciiTable())

Pass a string to the file argument to create or overwrite a file. For example, using LaTeX output,

regtable(regressionResult1, regressionResult2; render = LatexTable(), file="myoutfile.tex")

Main Changes for v0.6

Version 0.6 was a major rewrite of the backend with the goal of increasing the flexibility and decreasing the dependencies on other packages (regression packages are now extensions). While most code written with v0.5 should continue to run, there might be a few differences and some deprecation warnings. Below is a brief overview of the changes:

New Features

Changes to Defaults

There are some changes to the defaults from version 0.5 and two additional settings

Changes to Labeling

Labels for most display elements around the table are no longer handled by the labels dictionary but by functions. The goal is to allow a "set and forget" mentality, where changing the label once permanently changes it for all tables. For example, instead of:

labels=Dict(
  "__LABEL_ESTIMATOR__" => "Estimator",
  "__LABEL_FE_YES__" => "Yes",
  "__LABEL_FE_NO__" => "",
  "__LABEL_ESTIMATOR_OLS__" => "OLS",
  "__LABEL_ESTIMATOR_IV__" => "IV",
  "__LABEL_ESTIMATOR_NL__" => "NL"
)

define the following methods:

RegressionTables.label(render::AbstractRenderType, ::Type{RegressionType}) = "Estimator"
RegressionTables.fe_value(render::AbstractRenderType, v) = v ? "Yes" : ""
RegressionTables.label_ols(render::AbstractRenderType) = "OLS"
RegressionTables.label_iv(render::AbstractRenderType) = "IV"
# Non-linear models now display their distribution instead of "NL"
RegressionTables.label_distribution(render::AbstractRenderType, d::Probit) = "Probit"

See the documentation for more examples. For regression statistics, it is possible to pass a pair (e.g., [Nobs => "Obs.", R2 => "R Squared"]) to relabel those.
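
The pair syntax for regression statistics can be sketched as follows. The toy data set and its column names are assumptions made for this illustration; Nobs and R2 are the statistic types used elsewhere in this README.

```julia
using RegressionTables, DataFrames, GLM

# Hypothetical toy data, used only to have something to fit
toy = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
m = lm(@formula(y ~ x), toy)

# Each pair maps a statistic type to the row label shown in the table
tab = regtable(m; regression_statistics = [Nobs => "Obs.", R2 => "R Squared"])

# Render the table to a string (default ASCII output)
s = sprint(print, tab)
```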

Labels for coefficient names work as before, but interaction and categorical terms might be displayed differently. Each part of an interaction or categorical term can now be labeled independently (so labels=Dict("coef1" => "Coef 1", "coef2" => "Coef 2") relabels coef1 & coef2 to Coef 1 & Coef 2). This can change existing tables if the labels dictionary contains a label for the interaction but not for both of its parts: the display then depends on the order in which the dictionary entries are applied (so labels=Dict("coef1" => "Coef 1", "coef1 & coef2" => "Coef 1 & Coef 2") might turn the interaction into either Coef 1 & Coef 2 or Coef 1 & coef2).
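
A minimal sketch of labeling the parts of an interaction independently is shown below. The data set is made up for the illustration, with columns deliberately named coef1 and coef2 to match the example in the text.

```julia
using RegressionTables, DataFrames, GLM

# Hypothetical toy data with two regressors and their interaction
toy = DataFrame(coef1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
                coef2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
                y     = [1.2, 2.1, 3.9, 3.8, 6.3, 6.1, 8.8, 7.9])

# coef1 * coef2 expands to coef1 + coef2 + coef1 & coef2
m = lm(@formula(y ~ coef1 * coef2), toy)

# Only the two parts are labeled; the interaction row picks up both labels
tab = regtable(m; labels = Dict("coef1" => "Coef 1", "coef2" => "Coef 2"))
s = sprint(print, tab)
```

Labeling only the parts avoids the order dependence described above, because no entry for the full interaction string competes with the entries for its components.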

custom_statistics replaced by extralines

The custom_statistics argument took a NamedTuple of vectors. This is now simplified in the extralines argument, which takes a Vector of rows; the first element of each row is what is displayed in the leftmost column. A row element can also be a Pair of val => cols (e.g., 0.153 => 2:3), where the second value makes the entry span multiple columns. See the examples in the documentation under "Extralines".

For statistics that can use the values in the regression model (e.g., the mean of Y), it is possible to create those under an AbstractRegressionStatistic. See the documentation for an example.
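
One way such a statistic might look is sketched below. This is an illustration under assumptions rather than the package's documented example: the type name YMean is made up, the sketch assumes that StatsAPI is installed and that response() is implemented for the fitted model, and it assumes that regtable builds a statistic by calling its type on the model.

```julia
using RegressionTables, DataFrames, GLM
using StatsAPI            # assumed installed; provides RegressionModel and response
using Statistics: mean

# Hypothetical statistic type; the name YMean is chosen for this sketch
struct YMean <: RegressionTables.AbstractRegressionStatistic
    val::Union{Float64, Nothing}
end

# Compute the statistic from a fitted model; fall back to `nothing`
# for model types where response() is not available
YMean(m::StatsAPI.RegressionModel) = try
    YMean(mean(StatsAPI.response(m)))
catch
    YMean(nothing)
end

# Row label shown in the table
RegressionTables.label(render::RegressionTables.AbstractRenderType, ::Type{YMean}) = "Mean of Y"

# Hypothetical toy data, used only to have something to fit
toy = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
m = lm(@formula(y ~ x), toy)

# The new type is passed alongside the built-in statistics
tab = regtable(m; regression_statistics = [Nobs, R2, YMean])
s = sprint(print, tab)
```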

print_result and out_buffer arguments are gone

print_result is no longer necessary because regtable returns an editable object that displays well in notebooks such as Pluto or Jupyter. Likewise, instead of out_buffer, use tab = regtable(...); print(io, tab).
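
As a short sketch of the replacement for out_buffer, the returned table can be printed into any IO object, such as an IOBuffer. The toy data set here is an assumption made for the illustration.

```julia
using RegressionTables, DataFrames, GLM

# Hypothetical toy data, used only to have something to fit
toy = DataFrame(x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
m = lm(@formula(y ~ x), toy)

# regtable returns a table object instead of only printing it
tab = regtable(m; render = AsciiTable())

# Write the rendered table into a buffer, then read it back as a String
io = IOBuffer()
print(io, tab)
buffered = String(take!(io))
```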

Other Deprecation Warnings that should not change results