Closed JNing0 closed 4 years ago
Just updated the issue. Please review and comment.
Thanks Jie, I will go through this and get back to you.
Let me propose a model. Why would this not work? Please tell me why this is a bad model.
Our sample is indexed by contract-quarter (c, t): contracts run from c = 1 to c = N and quarters from t = 1 to t = T.
Projected_Delay_ct = \delta Expedited_Pay_ct + \sum_{q=2}^{T} \tau_q Quarter_qt + \sum_{k=2}^{N} \gamma_k Contract_kc + e_ct

In this equation:
- Projected_Delay_ct is the delay for contract c projected at time t
- Expedited_Pay_ct is the indicator that contract c is subject to expedited payment at time t
- Quarter_qt is the indicator that time t equals quarter q
- Contract_kc is the indicator that contract c equals contract k
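For concreteness, here is a minimal sketch of how the proposed model could be estimated with dummy variables. Everything in it is an assumption made for illustration: the data are simulated, the names `simulate_panel` and `fit_twfe` are hypothetical, the treatment rule (half of the contracts get expedited pay from quarter 6 onward) is made up, and an intercept is added so that the first quarter and the first contract serve as the baseline. It is not our actual estimation code.

```python
import numpy as np

def simulate_panel(n_contracts=40, n_quarters=12, delta=-2.0, seed=0):
    """Simulate a contract-quarter panel. All numbers are made up."""
    rng = np.random.default_rng(seed)
    c = np.repeat(np.arange(n_contracts), n_quarters)  # contract index (0-based)
    t = np.tile(np.arange(n_quarters), n_contracts)    # quarter index (0-based)
    # Hypothetical rule: the first half of the contracts gets expedited pay from quarter 6 on.
    expedited = ((c < n_contracts // 2) & (t >= 6)).astype(float)
    gamma = rng.normal(0, 3, n_contracts)  # contract fixed effects
    tau = rng.normal(0, 1, n_quarters)     # quarter fixed effects
    y = delta * expedited + gamma[c] + tau[t] + rng.normal(0, 0.5, c.size)
    return c, t, expedited, y

def fit_twfe(c, t, expedited, y):
    """OLS of delay on Expedited_Pay, quarter dummies and contract dummies."""
    # Drop the first quarter and the first contract as baselines (q >= 2, k >= 2 in the model).
    quarter_d = (t[:, None] == np.arange(1, t.max() + 1)).astype(float)
    contract_d = (c[:, None] == np.arange(1, c.max() + 1)).astype(float)
    X = np.column_stack([np.ones_like(y), expedited, quarter_d, contract_d])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient on Expedited_Pay, i.e. the estimate of \delta

c, t, expedited, y = simulate_panel()
delta_hat = fit_twfe(c, t, expedited, y)
```

In this simulated setup there is always an untreated group, so `delta_hat` should land near the true value used in the simulation. The sketch does not address what happens when every contract is treated in some quarter.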
To add to my comment: is the issue the lack of a control group across the entire horizon, as Jie discusses here: https://github.com/QuickPay-Operational-Performance/Data-and-code/issues/25#issuecomment-627115897?
Then, would running two DiDs over two time subsamples help? Again, as Jie discusses above regarding DiD?
Hi Jie and Vlad, yes I think the key issue is that we don't have any group that is always untreated in the full sample.
I tried to explain this graphically as follows. Suppose we consider the time period from Oct 1, 2009 to Feb 21, 2013. Then, in the figures below, the treatment effect for small businesses (TE1) is fine because large businesses are untreated at that point. But TE2 will underestimate the effect of the treatment.
- Subsamples 1 & 3: Treatment effect = effect of quickpay on small business when large businesses do not receive quickpay
Hi @JNing0 , a quick question about this point. In subsample 3, small businesses are treated at all times and large businesses are always untreated. So I am not sure if we can run a DiD for this time horizon.
Hi Vlad, yes, the problem with the model is that we don't have a control group. In the model, the time fixed effect \tau_q in the quarter where all contracts receive quickpay will include the treatment effect. So the estimate for \delta will be off. I hope that we can avoid it by not using the fixed time effects for each period and estimating the treatment via the interaction term between time and treatment.
The regression models I proposed are just a starting point; we will probably need better models than that to take advantage of the long observation horizon.
DiD models on subsamples will give us a control group, but I am a bit concerned about the parallel trend assumption. Once we show there is an effect in the first period, wouldn't that invalidate the assumption?
Vibhuti, sorry, I made a mistake about subsample 3. Big businesses are kicked out of quickpay on 2/21/2013 and are included again on 8/1/2014. So for the window from after 7/11/2012 until before 8/1/2014, the control group is small businesses, which always receive quickpay. The treatment is no quickpay. The big businesses go from not treated (receiving quickpay) before 2/21/2013 to treated (no quickpay) after 2/21/2013.
I rewrote the four subsamples below:
The interpretation of the treatment effect in the four subsamples:
Some further amendments to the regression models:
Thank you for explaining the problem so clearly!
Hi Jie and Vlad, I think Jie's suggestions are a good place to start with the full sample analysis. I will get back to you on this when I have some updates on the results. Thanks.
And thanks, Jie, for clarifying this point!
3. Subsample 3: The large businesses go from untreated to treated. Treatment effect = effect of no quickpay on large businesses when small businesses receive quickpay.
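To make the subsample 3 comparison concrete, here is a minimal sketch of the 2x2 difference-in-differences, with "no quickpay" as the treatment and small businesses as the control group. The delay numbers are made up purely for illustration, and `did_estimate` is a hypothetical helper, not something from our code base.

```python
from statistics import mean

def did_estimate(rows):
    """2x2 difference-in-differences from (group, period, delay) rows.

    group:  "large" (treated: loses quickpay) or "small" (control: keeps quickpay)
    period: "pre" or "post" relative to the policy change
    """
    def cell(group, period):
        return mean(d for g, p, d in rows if g == group and p == period)

    change_large = cell("large", "post") - cell("large", "pre")
    change_small = cell("small", "post") - cell("small", "pre")
    return change_large - change_small

# Made-up delays (in days), for illustration only.
rows = [
    ("large", "pre", 10.0), ("large", "pre", 12.0),
    ("large", "post", 16.0), ("large", "post", 18.0),
    ("small", "pre", 8.0), ("small", "pre", 10.0),
    ("small", "post", 9.0), ("small", "post", 11.0),
]
effect_of_no_quickpay = did_estimate(rows)
```

A positive value here would read as "losing quickpay increases delay for large businesses", under the usual parallel trend assumption.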
Just floating another idea here: we could look at past observations as a control group. There can be two ways of doing this. I explain them below using large businesses as an example.
We can use large business contracts in 2009-2012 as a control group for large business contracts in 2014-2017. We can match contracts based on similar characteristics such as the same task, same firm, same subagency, and same industry code. The assumption here is that in the absence of quickpay, the pattern of delays will be similar for the two groups.
Suppose a large business contract started on Dec 31, 2013 and ended on Sept 30, 2014. Then, we can predict the delays in the third quarter of 2014 (when payment was accelerated) using delays in previous quarters of the same contract. And use this predicted value as a control group for the realized value (after treatment). There is a stream of literature in economics on “unconfoundedness” that follows this approach under some assumptions. I need to look into it but here are some references:
Thank you, Vibhuti.
Need to think more about the second point.
Thanks Vlad, that's a fair point and I will think more about it.
Jie,
Do we have the model formally written out for 2014 implementation of QuickPay? We have a couple of issues open, including this one with comments, but I cannot find a unified model. Do you know where it is? If we do not have it, can you write a draft of it?
Let's assume that we will present 2009 and 2014 implementations separately, but 2009 implementation will appear first in the paper.
Thanks,
Vlad
Thanks, Jie! We have multiple issues open on this. We need to consolidate.
Yes, the argument in the file convinced me. This is visual; what is the corresponding regression? The model makes sense for the first implementation. But is it the same model for the second implementation? Vibhuti used a different model to derive results. Instead of a Post variable, there is a Before variable. Reading that table, we are seeing the effect of slow payment.
What version of the model should we use? Can we consolidate the visual representation, regression, and results (existing, or rederived if the model changes) in one place so that there is no ambiguity or gaps?
Yes, I will consolidate the models and visual arguments for the 2014 QuickPay.
Just to give a quick answer, the only change in the model for the 2014 QuickPay is to change Post_t to Pre_t. Everything else should be the same.
In the table mentioned above, the coefficient of the interaction term, Pre x Large business (i.e., the "before_aug_2014:business_typeO" term), is positive, meaning that the lack of QuickPay leads to delay. So implementing QuickPay leads to finishing early. In other words, the effect of QuickPay is the negative of the coefficient of Pre x Large business, as stated in this file.
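The sign relation follows from Pre_t = 1 - Post_t: substituting into a model with Large, Post and Post x Large gives the same fit with the interaction coefficient negated. Here is a minimal sketch that checks this numerically. The data-generating numbers are made up, and `interaction_coef` and the variable names are hypothetical; the specification is a bare-bones version without the time trend or other controls.

```python
import numpy as np

def interaction_coef(y, large, period):
    """OLS of y on intercept, large, period, period x large; returns the interaction coefficient."""
    X = np.column_stack([np.ones_like(y), large, period, period * large])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]

rng = np.random.default_rng(1)
n = 400
large = rng.integers(0, 2, n).astype(float)  # 1 = large business
post = rng.integers(0, 2, n).astype(float)   # 1 = after QuickPay starts for large businesses
pre = 1.0 - post                             # 1 = before QuickPay

# Made-up data: QuickPay (post x large) reduces delay by 3 days.
y = 5.0 + 1.0 * large + 0.5 * post - 3.0 * post * large + rng.normal(0, 1, n)

coef_post = interaction_coef(y, large, post)  # effect of QuickPay
coef_pre = interaction_coef(y, large, pre)    # effect of lacking QuickPay
```

The two coefficients should be equal in magnitude and opposite in sign up to floating-point error, which is why the effect of QuickPay can be read as the negative of the Pre x Large business coefficient.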
I noticed you closed out other issues. Thank you!
In describing the model and results, let's focus on the business insight, which is: when payments to large businesses are delayed, large businesses delay project completion. Or, when payments to large businesses are expedited, large businesses expedite project completion.
Regression model
Dependent variable: Y_it = Delay of contract i observed in time t
Co-variates:
Co-variates t, Q_q(t), and X_k(t) give us a nonlinear trend with seasonality. From the data, seasonality is quite clear, so let's always keep t and Q_q(t) in the time trend part. There are a few model configurations about the time trend that we can consider:
Add year fixed effects to the time trend, etc ...
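As a rough illustration of the time trend part, here is a minimal sketch that builds design-matrix columns for the linear trend t and the seasonal indicators Q_q(t). It assumes Q_q(t) means "time t falls in quarter q of the year", with the first quarter as the baseline; `trend_design` is a hypothetical helper. The X_k(t) terms are left out because their definition is not shown here.

```python
import numpy as np

def trend_design(t, quarter_of_year):
    """Columns: intercept, linear trend t, and dummies for quarters 2, 3, 4 of the year."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(quarter_of_year)
    seasonal = [(q == k).astype(float) for k in (2, 3, 4)]  # quarter 1 is the baseline
    return np.column_stack([np.ones_like(t), t, *seasonal])

# Twelve consecutive quarters, assuming the sample starts in quarter 1 of a year.
t = np.arange(1, 13)
quarter_of_year = (t - 1) % 4 + 1
X = trend_design(t, quarter_of_year)
```

Other time-trend configurations would add columns to `X` in the same way.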
Diff-in-Diff
Consider subsamples so that we always have a control group in the data. There are four such subsamples:
The parallel trend assumption is somewhat awkward, as once we discover a treatment effect in sample 1, we kinda disprove the parallel trend assumption in later periods.
Anyhow, if we ignore that for now and do DiD in four subsamples, then here is the interpretation of the treatment effect in the four subsamples: