DataKind-DC / CARES

US CARES Act Paycheck Protection Program data, cleaned for analysis
GNU General Public License v3.0

evaluate data cleanliness of 0808 data, and attempt diff with prior adbs file #34

Open JohnMcCambridge opened 4 years ago

JohnMcCambridge commented 4 years ago

From what I can see, 0808 is a fully refreshed dataset. According to the data sheet that comes with it, it excludes all cancelled or refunded loans. We should therefore work from this new dataset for all core analyses.

The old dataset does provide some interesting opportunities for diffing. However, some of the changes in the new dataset are simply fixes to the old dataset, and neither file contains a unique identifying key for each loan, so a true loan-to-loan diff is not possible.
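
An approximate diff may still be useful. The sketch below is one possible approach, not something already in this repo: it builds a composite key from a few fields and compares the two files as multisets, so duplicate-looking loans are counted rather than collapsed. The field names in `KEY_FIELDS` and the sample rows are hypothetical and would need to be swapped for the real column headers. Any loan whose key fields were corrected between releases will show up as one removal plus one addition, so the output is a list of candidates to review, not a definitive change log.

```python
from collections import Counter

# Hypothetical column names; the real files may use different headers.
KEY_FIELDS = ("LoanAmount", "Zip", "DateApproved", "Lender")


def composite_key(row):
    """Build a normalised tuple of fields to stand in for the missing loan ID."""
    return tuple(str(row.get(field, "")).strip().upper() for field in KEY_FIELDS)


def approximate_diff(old_rows, new_rows):
    """Return (only_in_old, only_in_new) as Counters of composite keys.

    Counters preserve multiplicity: two loans with identical key fields
    are counted twice instead of being merged into one.
    """
    old_counts = Counter(composite_key(r) for r in old_rows)
    new_counts = Counter(composite_key(r) for r in new_rows)
    # Counter subtraction keeps only positive counts.
    return old_counts - new_counts, new_counts - old_counts


# Made-up rows for illustration only.
old = [
    {"LoanAmount": "1000", "Zip": "20001", "DateApproved": "04/10/2020", "Lender": "Bank A"},
    {"LoanAmount": "2500", "Zip": "20002", "DateApproved": "04/11/2020", "Lender": "Bank B"},
]
new = [
    {"LoanAmount": "1000", "Zip": "20001", "DateApproved": "04/10/2020", "Lender": "bank a "},
    {"LoanAmount": "3000", "Zip": "20003", "DateApproved": "05/01/2020", "Lender": "Bank C"},
]

removed, added = approximate_diff(old, new)
print("only in old:", dict(removed))
print("only in new:", dict(added))
```

In practice the rows could come from `csv.DictReader` over each file. Which fields go into the key is a judgment call: more fields mean fewer false matches, but more spurious differences when the new release fixes a value.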