uga-libraries / congressional-mail

Providing basic access to metadata from congressional correspondent system exports.
Creative Commons Attribution Share Alike 4.0 International

CMS Data Interchange Format #4

Open amhanson9 opened 1 month ago

amhanson9 commented 1 month ago

Remove columns containing PII (the most granular identifying detail kept should be the zip code). Also make an additional copy of the data, split into one spreadsheet per Congress Year (a two-year span starting on an odd year), so that large data sets can be opened in spreadsheet programs.
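A minimal sketch of the two steps described above, assuming the rows are already loaded as dictionaries. The column names (`first_name`, `last_name`, `address_line_1`, `date_in`) are hypothetical placeholders rather than the actual export columns, and the date is assumed to begin with a four-digit year.

```python
from collections import defaultdict

# Hypothetical PII column names; the real export columns may differ.
PII_COLUMNS = {"first_name", "last_name", "address_line_1"}


def congress_year(year):
    """Return the two-year Congress Year span (starting on an odd year) for a year."""
    start = year if year % 2 == 1 else year - 1
    return f"{start}-{start + 1}"


def remove_pii(rows, pii_columns=PII_COLUMNS):
    """Return copies of the rows without the PII columns (zip code is kept)."""
    return [{k: v for k, v in row.items() if k not in pii_columns} for row in rows]


def split_by_congress_year(rows, date_column="date_in"):
    """Group rows by Congress Year, assuming the date starts with a 4-digit year."""
    groups = defaultdict(list)
    for row in rows:
        try:
            label = congress_year(int(row[date_column][:4]))
        except (KeyError, ValueError):
            # Rows with a missing or unparseable date are kept together.
            label = "undated"
        groups[label].append(row)
    return dict(groups)
```

Each group could then be written to its own spreadsheet, for example with `csv.DictWriter`, using the group label in the file name.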

The CMS Data Interchange Format stores metadata across many tables, with ids that can be used to link each table to the others. The files are tab delimited. See the Word document stored with the files for more details. The format is close to, but not exactly the same as, the CSS Data Interchange Format.
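As an illustration of linking tables by id, here is a sketch that reads tab-delimited text and joins two tables on a shared id column. It assumes each file has a header row, and the key name `record_id` used in the usage note is hypothetical; the Word document is the authority on the actual layout.

```python
import csv
import io


def read_tab_delimited(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def link_tables(left_rows, right_rows, key):
    """Left-join two tables on a shared id column, keeping every left row."""
    lookup = {}
    for row in right_rows:
        lookup.setdefault(row[key], []).append(row)
    linked = []
    for row in left_rows:
        # A left row with no match is kept as-is; multiple matches repeat it.
        for match in lookup.get(row[key], [{}]):
            linked.append({**row, **match})
    return linked
```

For example, `link_tables(read_tab_delimited(open(path_a, newline="")), read_tab_delimited(open(path_b, newline="")), "record_id")` would return one combined row per match, where `path_a` and `path_b` stand in for two of the export files.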

amhanson9 commented 3 weeks ago

The script currently combines select fields from tables 1B (location), 2A (dates), 2B (topics), and 2C (response). The files for these tables should be located in a single folder, and the path to that folder is the script argument. These tables cover the same information as the CSS Data Interchange Format, but some columns are in only one of the exports, and other columns are in both but have different names.
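The folder argument and the differing column names could be handled along these lines. This is a sketch under stated assumptions, not the script's actual code: the function names and the entries in `COLUMN_RENAMES` are hypothetical, and the real mapping would come from comparing the two export formats.

```python
from pathlib import Path

# Hypothetical mapping from CMS column names to the names used for CSS exports.
COLUMN_RENAMES = {"zip_code": "zip", "date_received": "in_date"}


def check_argument(argv):
    """Return (folder_path, error_message) for a script called with one folder path."""
    if len(argv) != 2:
        return None, "Expected one argument: the path to the folder with the export tables"
    path = Path(argv[1])
    if not path.is_dir():
        return None, f"Folder not found: {path}"
    return path, None


def harmonize_columns(row, renames=COLUMN_RENAMES):
    """Rename columns that have a different name in the other export format."""
    return {renames.get(name, name): value for name, value in row.items()}
```

In a script, `check_argument(sys.argv)` would run first, and the script would print the error and exit if one is returned. Columns that exist in only one export pass through `harmonize_columns` unchanged.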