jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0

What is the value of splinesobj$splinesinter in genGammaSplines function? #167

Closed jongohkim91 closed 4 years ago

jongohkim91 commented 4 years ago

Hello, I am having trouble implementing splines in the MTR functions, especially the part where the genGammaSplines function is used to create targetGammas.

In the genGammaSplines function, the value of splinesobj$splinesinter is assigned to inters: https://github.com/jkcshea/ivmte/blob/15fe22c31fe7409676961432e3cce441488cb8f3/R/mtr.R#L742

However, as far as I know, the removeSplines function does not return splinesobj$splinesinter: https://github.com/jkcshea/ivmte/blob/15fe22c31fe7409676961432e3cce441488cb8f3/R/mtr.R#L563-L567

Hence, I am getting an error message saying

Error in nonSplinesDmat[subset, l] : subscript out of bounds

whenever this part of the code in genGammaSplines is run: https://github.com/jkcshea/ivmte/blob/15fe22c31fe7409676961432e3cce441488cb8f3/R/mtr.R#L802-L805

Am I missing something?

jongohkim91 commented 4 years ago

I solved the issue. The problem was that I was calling the ivmteEstimate function directly rather than through the ivmte function, so I completely missed the large chunk of code linked here: https://github.com/jkcshea/ivmte/blob/b8fcdc5de35efa0a20933706ceee02d79f3f1e92/R/mst.R#L1560-L1616

jkcshea commented 4 years ago

Glad you figured it out. I'll update parts of the R manual to advise users to use the ivmte() function instead. Currently, the examples in the R manual demonstrate how to use the package in a modular fashion, as you tried to do. This was intended to provide flexibility for power users, but may be misleading in that users may think this is the only way to use the package.

But what you raised is a real issue. The splinesinter element is constructed beginning here: https://github.com/jkcshea/ivmte/blob/b8fcdc5de35efa0a20933706ceee02d79f3f1e92/R/mst.R#L1560 i.e. it is indeed outside of the removeSplines function.

This would be problematic if you indeed had to use the package in a modular way and included splines in your MTR specification. We'll correct this.

jongohkim91 commented 4 years ago

Thank you very much for the quick reply!

Yes, I am currently altering some of the internal functions of your package, as my boss wants to conduct an analysis which could not be done directly with the package as provided. For instance, restricting some variables to have the same coefficient derived from the first stage IV-regression on the second stage. Hence, I sometimes run into these errors, which are usually due to my alterations or my lack of a full understanding of how the ivmte() function works.

Thanks again for the clarification.

jkcshea commented 4 years ago

Since I'm resolving the other issues at the moment, I thought I'd at least provide a comment on this.

> For instance, restricting some variables to have the same coefficient derived from the first stage IV-regression on the second stage.

I assume you mean you want to constrain certain coefficients in your MTR to be the same. If so, this can actually be done quite easily, since the function returns the LP problem (see the returned object $lpresult$model). What you're describing is simply a set of equality constraints in the LP problem. So just add the constraints to the LP model and re-solve it. If you are using a solver besides Gurobi, then you'll have to make some adjustments to $lpresult$model.
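As a rough sketch of what "add the constraints and re-solve" could look like, the snippet below appends one equality constraint (two coefficients forced to be equal) to a Gurobi-style model list. It assumes the model follows the usual Gurobi R layout with fields `A`, `rhs`, and `sense`, and that `A` is a dense matrix; neither has been checked against the actual contents of `$lpresult$model`. The helper `addEqualityConstraint`, the toy model, and the column positions `i` and `j` are all hypothetical.

```r
# Hypothetical helper (not part of ivmte): append the constraint
# x[i] - x[j] = 0 to a Gurobi-style model list.
# Assumes model$A is a dense matrix; a sparse matrix class may need
# its own row-binding method.
addEqualityConstraint <- function(model, i, j) {
    newRow <- rep(0, ncol(model$A))
    newRow[i] <- 1
    newRow[j] <- -1
    model$A <- rbind(model$A, newRow, deparse.level = 0)
    model$rhs <- c(model$rhs, 0)
    model$sense <- c(model$sense, "=")
    model
}

# Toy stand-in for the model returned in $lpresult$model (assumed layout).
toyModel <- list(
    A = matrix(c(1, 1, 1), nrow = 1),
    rhs = 1,
    sense = "<=",
    obj = c(1, 2, 3),
    modelsense = "max",
    lb = rep(0, 3),
    ub = rep(1, 3)
)

# Force the 2nd and 3rd variables to take the same value.
constrained <- addEqualityConstraint(toyModel, i = 2, j = 3)
print(constrained$A)

# With Gurobi installed and licensed, the model could then be re-solved:
# solution <- gurobi::gurobi(constrained, list(OutputFlag = 0))
```

In the real model, `i` and `j` would be the column positions of the two MTR coefficients to be tied together, which have to be looked up in the returned model rather than guessed.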

A few more details, in case you decide to do this. Suppose

The LP problem will have 2 * S + k0 + k1 variables. The reason for the additional 2 * S variables is that we also need to minimize the criterion function (i.e. how far off we are from matching all the IV-like moments). Since everything is optimized under the ℓ1 norm, we decompose the deviation from each IV-like moment into its positive and negative components. This gives 2 * S additional variables to optimize over.
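The decomposition can be written out as the standard linear reformulation of an ℓ1 objective. The notation here is an assumption made for illustration and is not taken from the package: θ stacks the k0 + k1 MTR coefficients, β_s is the s-th IV-like moment, γ_s is the corresponding vector of gamma terms, and w_s^+ and w_s^- are the positive and negative parts of the deviation. Any other constraints in the LP are omitted.

```latex
\begin{aligned}
\min_{\theta,\, w^{+},\, w^{-}} \quad & \sum_{s=1}^{S} \left( w_s^{+} + w_s^{-} \right) \\
\text{s.t.} \quad & \gamma_s^{\top} \theta - \beta_s = w_s^{+} - w_s^{-}, \quad s = 1, \dots, S, \\
& w_s^{+} \ge 0, \quad w_s^{-} \ge 0, \quad s = 1, \dots, S.
\end{aligned}
```

At an optimum at most one of w_s^+ and w_s^- is nonzero for each s, so their sum equals |γ_s'θ − β_s|. Counting variables gives k0 + k1 entries in θ plus S entries each in w^+ and w^-, i.e. 2 * S + k0 + k1.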

Just thought this information would be helpful, in case you have other adjustments that can also be implemented in this simple way.

jkcshea commented 4 years ago

Also, I realize I never answered your actual question (the title of the issue). splinesobj$splinesinter is simply a list of all the variables interacting with each spline. Since R doesn't recognize uSpline() as an operator when parsing formulas, I had to manually construct the interactions for splines when creating design matrices. splinesobj$splinesinter was simply how I organized everything.
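To illustrate what constructing spline interactions by hand can look like, here is a minimal sketch in base R. It is not the package's internal code: it uses `splines::bs` as a stand-in for the spline basis, and the data frame `toy`, the knot position, and the interacting variable `x` are made up for the example.

```r
library(splines)

# Toy data: u plays the role of the unobservable, x is a covariate.
set.seed(1)
toy <- data.frame(u = runif(6), x = c(1, 2, 3, 4, 5, 6))

# Quadratic B-spline basis in u with one interior knot and an intercept
# (4 basis columns).
basis <- bs(toy$u, degree = 2, knots = 0.5,
            Boundary.knots = c(0, 1), intercept = TRUE)

# Design matrix for the terms interacted with the spline: 1 and x.
interDesign <- model.matrix(~ 1 + x, data = toy)

# Build the interactions manually: multiply every basis column by each
# column of the interaction design matrix, then bind the pieces together.
splineDesign <- do.call(
    cbind,
    lapply(seq_len(ncol(interDesign)),
           function(j) basis * interDesign[, j])
)

dim(splineDesign)
```

The first four columns of `splineDesign` are the basis itself (interaction with the constant) and the last four are the basis multiplied by `x`. A list recording which variables go with which spline is then enough to rebuild these columns.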

jkcshea commented 4 years ago

Sorry this took a while to get to. I wasn't able to move all the code that generates the spline interactions into removeSplines. So instead, I wrote a function, interactSplines.

Here is how it is defined: https://github.com/jkcshea/ivmte/blob/63d10ecd9657eb1b623848a850acc5988fa06b00/R/mtr.R#L948-L975

The only 'special' argument in the function is splinesobj. This is created using the removeSplines function (in hindsight, it is poorly named...): https://github.com/jkcshea/ivmte/blob/63d10ecd9657eb1b623848a850acc5988fa06b00/R/mst.R#L1162-L1163

So as the documentation suggests, interactSplines takes the list of splines in splinesobj, checks which variables each spline is interacted with (by constructing a design matrix, thus the need for the m0, m1, uname, and data arguments), and removes the interactions that are collinear. What the function returns is an updated splinesobj containing the list of interactions for each spline.
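For readers unfamiliar with the collinearity step, one common way to detect and drop collinear columns of a design matrix in R is through the pivoted QR decomposition. The sketch below shows that general technique only; whether interactSplines uses this approach has not been checked, and `dropCollinear` and the matrix `X` are hypothetical.

```r
# Hypothetical helper (not part of ivmte): keep only a set of linearly
# independent columns, using the rank and pivot order from qr().
dropCollinear <- function(X) {
    decomp <- qr(X)
    keep <- sort(decomp$pivot[seq_len(decomp$rank)])
    X[, keep, drop = FALSE]
}

# Column c is exactly twice column a, so one of the two is redundant.
X <- cbind(a = c(1, 2, 3, 4),
           b = c(1, 0, 1, 0),
           c = c(2, 4, 6, 8))

reduced <- dropCollinear(X)
colnames(reduced)
```

The names of the dropped columns tell you which interactions would be removed from the list kept for each spline.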

jongohkim91 commented 4 years ago

Thank you very much for your quick reply and kind remarks!