danielwilhelm / STATA-ME-test

This project provides STATA commands for testing for the presence of measurement error
MIT License

STATA-ME-test

Authors: Young Jun Lee and Daniel Wilhelm

This project provides the STATA command dgmtest, which implements the significance test of Delgado and Manteiga (2001). The command can be used to test for the presence of measurement error, as described in Wilhelm (2018) and Lee and Wilhelm (2018).

Files contained in this package:

Installation

  1. Download the package.
  2. Change into the directory containing this package.
  3. Use the command dgmtest as described below.
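
As a rough sketch of steps 2 and 3 inside Stata: the path below is a placeholder (an assumption, not part of the package), and `which` is Stata's built-in command for locating a command's file. It is used here only to check that Stata can find dgmtest from the current directory, assuming the package ships the command as an ado-file.

```stata
// hypothetical location of the downloaded package; replace with your own path
cd "C:/path/to/STATA-ME-test"

// check that Stata can locate the command from the current directory
// (assumes the package ships the command as an ado-file)
which dgmtest
```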

Syntax

The command dgmtest tests the null hypothesis

H0:   E[Y | X, W, Z] = E[Y | X, W]

against the alternative that the null does not hold, where

The vector of explanatory variables, W, may contain elements that enter the conditional expectation in a linear, additively separable fashion. For example, decompose W=(W1,W2) where W1 enters nonseparably and W2 enters in a linear, additively separable fashion,

E[Y | X, W, Z] = f(X,W1,Z) + pi*W2

where f is some function and pi is a row vector of the same dimension as W2. When W2 is present, the test of Delgado and Manteiga (2001) is applied after replacing Y with (Y - pihat*W2), where pihat is the estimator of pi proposed by Robinson (1988).
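
To spell out the step behind this replacement (a short derivation from the decomposition above, not taken from the cited papers): subtracting pi*W2 from both sides leaves only the nonseparable part, so the null hypothesis can be restated for the transformed outcome, with W1 taking the role of W. In practice pi is unknown and is replaced by pihat.

```latex
\begin{align*}
E[Y - \pi W_2 \mid X, W, Z] &= f(X, W_1, Z), \\
H_0:\quad E[Y - \pi W_2 \mid X, W_1, Z] &= E[Y - \pi W_2 \mid X, W_1].
\end{align*}
```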

Syntax:

dgmtest depvar expvar [if] [in] [, qz(integer) qw2(integer) teststat(string) kernel(string) bootdist(string) bw(real) bootnum(integer) ngrid(integer) qgrid(real)]

where

The options are as follows:

If options are left unspecified, the command runs on the default settings.

Testing for the presence of measurement error

Wilhelm (2018) shows that, under some conditions, the null hypothesis H0 is equivalent to the hypothesis of no measurement error in X. In this context, the variable Z must be excluded from the outcome equation. For example, it could be a second measurement or an instrumental variable. See Wilhelm (2018), Lee and Wilhelm (2018), and the examples below for more details.
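
To illustrate one direction of this equivalence, here is a sketch in the notation of the examples below. The regression function m is left unspecified, and the independence of epsilon from (X*, eta_Z) is an assumption made here for illustration only; the precise conditions are given in Wilhelm (2018).

```latex
\begin{align*}
Y &= m(X^*) + \varepsilon, \qquad X = X^* + \eta_X, \qquad Z = X^* + \eta_Z, \\
\text{if } \eta_X = 0:\quad E[Y \mid X, Z] &= m(X^*) + E[\varepsilon \mid X^*, Z] = m(X) + E[\varepsilon] = E[Y \mid X].
\end{align*}
```

With no measurement error in X, the second measurement Z therefore carries no additional information about Y, which is the null hypothesis H0 without controls W.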

Examples

Generate explanatory variables

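// start from an empty dataset and fix the seed so the draws are reproducible
// (the seed value is arbitrary)
clear
set seed 12345
set obs 200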

// true regressor
generate Xstar = runiform()

// measurement error in X
generate etaX = runiform()

// mismeasured regressor
generate X1 = Xstar + 0.5*etaX

// additively linear control variable
generate X2 = runiform()

// measurement error in Z
generate etaZ = runiform()

// second measurement of true regressor
generate Z = Xstar + 0.5*etaZ

// regression error
generate epsilon = runiform()

Generate outcome variable

We generate two outcomes: Y1 from a regression without controls, and Y2 from a regression with an additively separable, linear control (X2):

// outcome equation without controls
generate Y1 = Xstar^2 + 0.2*Xstar + 0.5*epsilon

// outcome equation with controls
generate Y2 = Xstar^2 + 0.2*Xstar + 0.5*X2 + 0.5*epsilon

Perform the test of no measurement error

Perform the test using default options:

// perform the test of the hypothesis of no measurement error in X1
dgmtest Y1 X1 Z
dgmtest Y2 X1 X2 Z, qw2(1)
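
For comparison, a sketch of the same call with the true regressor Xstar in place of X1. In this simulated design Xstar carries no measurement error, so the null hypothesis holds by construction and one would expect the test not to reject in most samples. This call is an illustration added here, not one of the package's own examples.

```stata
// same test with the error-free regressor: H0 holds by construction here
dgmtest Y1 Xstar Z
```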

Perform the test, choosing the triangular kernel function:

// perform the test of the hypothesis of no measurement error in X1
dgmtest Y1 X1 Z, kernel(triangular)
dgmtest Y2 X1 X2 Z, qw2(1) kernel(triangular)
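
The syntax also lists the standard [if] and [in] qualifiers. A minimal sketch that restricts the sample, using the simulated variables above and assuming the command handles these qualifiers as declared; the cutoff 0.5 and the range 1/100 are arbitrary choices for illustration.

```stata
// run the test on the subsample with X2 above 0.5
dgmtest Y1 X1 Z if X2 > 0.5

// run the test on the first 100 observations only
dgmtest Y1 X1 Z in 1/100
```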

References

Delgado, M. and Manteiga, W. (2001), "Significance Testing in Nonparametric Regression Based on the Bootstrap", Annals of Statistics, 29(5), pp. 1469-1507.

Robinson, P. M. (1988), "Root-N-Consistent Semiparametric Regression", Econometrica, 56(4), pp. 931-954.

Wilhelm, D. (2018), "Testing for the Presence of Measurement Error", CeMMAP Working Paper CWP45/18.

Lee, Y. and Wilhelm, D. (2018), "Testing for the Presence of Measurement Error in Stata", working paper available soon.