AdaemmerP / lpirfs


Inconsistency in Long Difference Local Projection Implementation #45

Open itamarcaspi opened 1 week ago

itamarcaspi commented 1 week ago

Problem Description

There appears to be an inconsistency between the theoretical specification for long difference local projections (LP) and the current implementation in lp_lin_iv() when cumul_mult = TRUE.

The theoretical setup (as described in Jorda and Taylor, 2024) specifies that for long difference LPs, both the LHS and RHS variables should be transformed into differences:

# Theoretical long difference specification
y(t+h) - y(t-1) = α + βs(t) + γ(y(t-1) - y(t-2)) + controls + ε

However, the current implementation in lp_lin_iv() only transforms the LHS while keeping level lags on the RHS:

# Current implementation 
y(t+h) - y(t-1) = α + βs(t) + γy(t-1) + controls + ε

This can be seen in the code where cumul_mult = TRUE only affects the LHS transformation:

if(isTRUE(specs$cumul_mult)) {
  yy = dplyr::lead(y_lin, (h - 1)) - dplyr::lag(y_lin, 1)  # LHS transformation
  # ... RHS variables remain in levels ...
}

Suggested Solution

  1. Add a new parameter rhs_diff = FALSE (default) to maintain backward compatibility

  2. When cumul_mult = TRUE and rhs_diff = TRUE, transform both LHS and RHS variables to differences:

    • LHS: Keep current y(t+h) - y(t-1) transformation
    • RHS: Transform lagged variables to differences y(t-1) - y(t-2)

  3. Update documentation to:

    • Explain theoretical difference between level vs difference specifications
    • Note which approach better matches different types of DGPs
    • Clarify interpretation differences between the approaches

Example implementation sketch:

if(isTRUE(specs$cumul_mult)) {
  # LHS transformation
  yy = dplyr::lead(y_lin, (h - 1)) - dplyr::lag(y_lin, 1)

  if(isTRUE(specs$rhs_diff)) {
    # RHS transformation - difference every column of x_lin
    # (the first row of each column becomes NA after differencing)
    x_lin_diff = apply(x_lin, 2, function(x) x - dplyr::lag(x, 1))
    xx = x_lin_diff  # Use differenced RHS variables
  } else {
    xx = x_lin  # Use level RHS variables (current behavior)
  }
}
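To make the two specifications concrete, here is a minimal, self-contained sketch in base R that builds both designs for a simulated series and estimates them with lm(). It does not use lpirfs internals: shift_lag(), shift_lead(), long_diff_data() and beta_h() are hypothetical helper names, the data-generating process is an assumption chosen for illustration, and h here is the horizon from the equations above (h = 0 is the impact period), which may not line up with the loop index used inside the package.

```r
# Hypothetical base-R helpers (not lpirfs functions): shift a vector by k periods.
shift_lag  <- function(x, k) if (k == 0) x else c(rep(NA_real_, k), head(x, -k))
shift_lead <- function(x, k) if (k == 0) x else c(tail(x, -k), rep(NA_real_, k))

# Build the regression data for horizon h.
# rhs_diff = TRUE  -> control is y(t-1) - y(t-2)  (long difference specification)
# rhs_diff = FALSE -> control is y(t-1)           (level lag on the RHS)
long_diff_data <- function(y, s, h, rhs_diff = TRUE) {
  lhs  <- shift_lead(y, h) - shift_lag(y, 1)  # y(t+h) - y(t-1)
  ctrl <- if (rhs_diff) shift_lag(y, 1) - shift_lag(y, 2) else shift_lag(y, 1)
  data.frame(lhs = lhs, s = s, ctrl = ctrl)
}

# Assumed toy DGP: a unit-root series driven by an observed shock s.
set.seed(1)
n <- 400
s <- rnorm(n)
y <- cumsum(0.5 * s + rnorm(n, sd = 0.5))

# Coefficient on the shock at horizon h; lm() drops rows with NA by default.
beta_h <- function(h, rhs_diff) {
  d <- long_diff_data(y, s, h, rhs_diff)
  unname(coef(lm(lhs ~ s + ctrl, data = d))["s"])
}

irf_diff  <- sapply(0:4, beta_h, rhs_diff = TRUE)   # differenced RHS
irf_level <- sapply(0:4, beta_h, rhs_diff = FALSE)  # level RHS
print(round(cbind(horizon = 0:4, irf_diff, irf_level), 3))
```

Only the control column differs between the two runs, so any gap between irf_diff and irf_level comes from the RHS choice alone. A pull request could use a comparison like this as the basis for a test.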

Benefits

  1. Provides theoretically consistent implementation of long difference LPs
  2. Maintains backward compatibility
  3. Gives users flexibility to choose specification based on their DGP
  4. Improves transparency about methodological choices

Let me know if you would like me to explain any part of this in more detail or provide additional code examples.

AdaemmerP commented 1 week ago

Thanks for pointing this out. Jorda and Taylor, 2024 was not written when I implemented the cumul_mult option. Unfortunately, I do not have any time to work on the package, but pull requests (including tests) are welcome.

As a workaround, and maybe the simpler solution, the user could calculate y(t-1) - y(t-2) herself and put it into c_exog_data. This in conjunction with cumul_mult = T should give the theoretical long difference specification, right? Maybe one could just add this to the documentation as the simplest workaround? This would also avoid having to include and use another option in the functions.
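A rough sketch of the workaround in base R, building the y(t-1) - y(t-2) column by hand. The helper lag_vec(), the column name dy_lag1 and the toy numbers are assumptions for illustration only. The call into the package is left out on purpose, because the exact argument that takes the column should be checked against the lpirfs documentation.

```r
# Hypothetical helper, not part of lpirfs: shift a vector back by k periods.
lag_vec <- function(x, k) c(rep(NA_real_, k), head(x, -k))

# Toy level series standing in for the outcome variable y.
y <- c(100, 102, 105, 104, 108, 111)

# y(t-1) - y(t-2): the lagged first difference; the first two rows are NA.
exog_diff <- data.frame(dy_lag1 = lag_vec(y, 1) - lag_vec(y, 2))
print(exog_diff)
```

The resulting data frame is what would be handed to the function as contemporaneous exogenous data alongside cumul_mult = TRUE. How the package treats the two leading NA rows is also worth verifying before relying on the results.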