When I work with this repository, I want to be able to keep updating it for as long as necessary without hitting the 100 MB per-file limit, so I can keep getting valuable information out of the generated data.
Acceptance
[x] The size of every individual file, especially the larger US files, has been drastically reduced so that it will not reach the file size limit anytime soon and no longer grows as fast as it does today.
Tasks
[x] Strip the fixed data out of those files, such as lat/long and other columns that can live in a single file that never changes.
[x] Update the data package manipulation code that generates datapackage.json so that it no longer tries to include columns that have been removed.
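The second task can be sketched as follows: rebuild each resource's schema in datapackage.json from the CSV's actual header, so that removed columns (e.g. lat/long) never appear in the generated metadata. This is a minimal sketch, not the repository's actual generator; the function name, file layout, and the blanket `string` type are assumptions.

```python
import csv
import json

def rebuild_schema(csv_path, datapackage_path):
    """Hypothetical helper: sync a resource's schema in datapackage.json
    with the columns that actually exist in the CSV header."""
    # Read only the header row of the CSV.
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))

    with open(datapackage_path) as f:
        pkg = json.load(f)

    # Replace the schema of the matching resource with fields built
    # from the real header, so removed columns are not listed.
    for resource in pkg.get("resources", []):
        if resource.get("path") == csv_path:
            resource["schema"] = {
                "fields": [{"name": col, "type": "string"} for col in header]
            }

    with open(datapackage_path, "w") as f:
        json.dump(pkg, f, indent=2)
```

Deriving the schema from the file itself (rather than a hard-coded column list) means the metadata cannot drift out of sync when columns are dropped later.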
Analysis
Copying and pasting the same data points over and over is wasteful, especially since they are not expected to change. Only variable data should appear in the CSV files used to render graphs on datahub.io; the rest belongs in separate files that are not sensitive to the size limit. The growing files take far less space that way.
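The split described above can be sketched as a one-time transformation: write the fixed columns (plus a join key) to a small static file, and keep only the variable columns in the growing series file. This is a minimal sketch under stated assumptions; the column names (`Lat`, `Long`, `Province/State`) and file names are illustrative, not confirmed from the repository.

```python
import csv

STATIC_COLS = {"Lat", "Long"}  # assumed names of the fixed columns

def split_static(src, static_out, series_out, key="Province/State"):
    """Hypothetical one-time split: fixed columns go to a static file,
    everything else stays in the repeatedly updated series file.
    Both outputs keep the join key so rows can be matched later."""
    with open(src, newline="") as f:
        rows = list(csv.DictReader(f))

    static_fields = [key] + sorted(STATIC_COLS)
    series_fields = [c for c in rows[0] if c not in STATIC_COLS]

    # Static file: written once, never grows.
    with open(static_out, "w", newline="") as f:
        w = csv.DictWriter(f, fieldnames=static_fields)
        w.writeheader()
        for r in rows:
            w.writerow({c: r[c] for c in static_fields})

    # Series file: only the variable data, so it grows far more slowly.
    with open(series_out, "w", newline="") as f:
        w = csv.DictWriter(f, fieldnames=series_fields)
        w.writeheader()
        for r in rows:
            w.writerow({c: r[c] for c in series_fields})
```

Consumers that still need coordinates can join the two files on the key column, which is cheap compared to repeating the coordinates on every row of every update.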