Hi everyone, I added the rows for zero delays in the sample as we discussed. I made sure that a project only appears until the time it finishes or until the end of our time horizon, whichever is sooner. That is, there are no "phantom projects" in the data. I reran the results for delay rate with this new sample and there's no qualitative change. See, for example, the results for percentage delay rate here.
Jie: Here's the Dropbox link for the new clean dataset in case you'd like to use it for the logistic regression.
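In case it helps anyone check the logic, here is a rough sketch of the zero-delay fill described above. It is not the actual cleaning script: `build_panel`, the integer period index, and the toy inputs are all assumptions made up for illustration. A project gets one row per period from its start until it finishes or the horizon ends, whichever is sooner, and periods with no recorded delay get a delay of 0.

```python
def build_panel(projects, delays, horizon_end):
    """Hypothetical helper, for illustration only.

    projects: dict of project_id -> (start_period, finish_period or None)
    delays:   dict of (project_id, period) -> recorded delay
    Returns a list of (project_id, period, delay) rows.
    """
    rows = []
    for pid, (start, finish) in sorted(projects.items()):
        # Stop at the finish period or the horizon end, whichever is sooner,
        # so no "phantom" rows appear after a project is done.
        last = horizon_end if finish is None else min(finish, horizon_end)
        for period in range(start, last + 1):
            # Periods with no recorded delay become explicit zero-delay rows.
            rows.append((pid, period, delays.get((pid, period), 0)))
    return rows


# Toy data (made up): A finishes in period 3, B is still running at the horizon.
projects = {"A": (1, 3), "B": (2, None)}
delays = {("A", 2): 5, ("B", 4): 1}
panel = build_panel(projects, delays, horizon_end=4)
```

With this toy input, project A contributes rows only for periods 1 to 3 and project B for periods 2 to 4.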