datamade / bga-payroll

💰 How much do your public officials make?

Clean up obsolete responding agencies & reupload affected source files #499

Closed: hancush closed this issue 4 years ago

hancush commented 4 years ago

32 responding agencies have new names in the latest version of the FOIA source list.

['All Elementary/High School Employees',
 'Elbum Countryside FPD',
 'Godley Park District',
 'Fox Valley  Park District',
 'Metropolitan Pier and Exposition Authority',
 'Public Building Commission of Chicago',
 'City of Chicago',
 'DeKalb',
 'DeWitt County',
 'Marywood FPD',
 'DeKalb County',
 'Marengo - Union Library District',
 'Pace Suburban Bus Service',
 'Deerfield-Bannockburn FPD',
 'LaGrange Park District',
 'Algonquin Township',
 'Lake in the Hills',
 'Lan-Oak Park District',
 'Addison Park District',
 'Illinois Community Colleges Board',
 'Homewood Flossmoor Park District',
 'LaGrange Park Library District',
 'Fox River Grove FPD',
 'LaGrange Library District',
 'City Colleges of Chicago',
 'Town-Country Library District',
 'CRETE',
 'McConathy Public Library',
 'PulaskI County',
 'Helen M Plum Memorial Library District',
 'North Aurora FPD',
 'IBHE']

I've removed these from the database.

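>>> import csv
>>> import pprint
>>> # Load the current agency names from the FOIA source list.
>>> with open('data/raw/foia-source-lookup.csv', 'r') as f:
...     reader = csv.reader(f)
...     next(reader)  # skip the header row
...     agencies = [employer for employer, _ in reader]
...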
>>> RespondingAgency.objects.filter(name__in=agencies).count()
835
>>> not_in_list = RespondingAgency.objects.exclude(name__in=agencies)
>>> not_in_list.count()
32
>>> pprint.pprint(list(not_in_list.values_list('name', flat=True)))
['All Elementary/High School Employees',
 'Elbum Countryside FPD',
 'Godley Park District',
 'Fox Valley  Park District',
 'Metropolitan Pier and Exposition Authority',
 'Public Building Commission of Chicago',
 'City of Chicago',
 'DeKalb',
 'DeWitt County',
 'Marywood FPD',
 'DeKalb County',
 'Marengo - Union Library District',
 'Pace Suburban Bus Service',
 'Deerfield-Bannockburn FPD',
 'LaGrange Park District',
 'Algonquin Township',
 'Lake in the Hills',
 'Lan-Oak Park District',
 'Addison Park District',
 'Illinois Community Colleges Board',
 'Homewood Flossmoor Park District',
 'LaGrange Park Library District',
 'Fox River Grove FPD',
 'LaGrange Library District',
 'City Colleges of Chicago',
 'Town-Country Library District',
 'CRETE',
 'McConathy Public Library',
 'PulaskI County',
 'Helen M Plum Memorial Library District',
 'North Aurora FPD',
 'IBHE']
>>> not_in_list.delete()
(225, {'data_import.RespondingAgencyAlias': 32, 'data_import.StandardizedFile_responding_agencies': 50, 'payroll.UnitRespondingAgency': 79, 'data_import.RespondingAgency': 32, 'data_import.SourceFile': 32})
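
For anyone repeating this check outside the Django shell, the comparison can be sketched in plain Python. This is a minimal sketch, not the code that was run: `obsolete_names`, `SAMPLE_CSV`, and the agency names in it are hypothetical, and the two-column CSV layout is assumed from the `employer, _` unpacking in the session above. The last lines check that the per-model counts reported by `delete()` add up to the reported total of 225.

```python
import csv
import io

# Hypothetical stand-in for data/raw/foia-source-lookup.csv. The two-column
# layout is an assumption based on the `employer, _` unpacking in the session.
SAMPLE_CSV = """employer,source
Agency A,file-a.csv
Agency B,file-b.csv
"""


def obsolete_names(csv_file, db_names):
    """Return names that are in the database but not in the source list."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    current = {row[0] for row in reader if row}
    return sorted(set(db_names) - current)


# Hypothetical database contents: one current name and one stale name.
db_names = ["Agency A", "Agency B (old name)"]
stale = obsolete_names(io.StringIO(SAMPLE_CSV), db_names)
print(stale)

# QuerySet.delete() returns (total, per-model counts). These are the values
# reported in the session above; the counts should sum to the total.
total, per_model = (225, {
    'data_import.RespondingAgencyAlias': 32,
    'data_import.StandardizedFile_responding_agencies': 50,
    'payroll.UnitRespondingAgency': 79,
    'data_import.RespondingAgency': 32,
    'data_import.SourceFile': 32,
})
assert total == sum(per_model.values())
```

The per-model breakdown is worth reading before relying on the total: only 32 of the 225 deleted rows are agencies, and the rest are related rows removed by cascade, including the 32 source files that now need to be reuploaded.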

@deraj1013, you'll need to reupload source files for 2017 for these agencies under their updated names. For reference, this is the latest version of the agency list: https://github.com/datamade/bga-payroll/blob/62b0b83beea3df1ffb3fcb724f6882d115ca8933/data/raw/foia-source-lookup.csv
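
If it helps when matching each removed agency to its updated name, the pairing could be roughed out with the standard library's `difflib`. This is only a sketch under assumptions: `suggest_renames` is a hypothetical helper, the names below are made up for illustration and are not taken from the lookup file, and the 0.8 cutoff is an arbitrary starting point. Any suggestion should be checked by hand against the linked list.

```python
import difflib


def suggest_renames(obsolete, current, cutoff=0.8):
    """Map each obsolete name to its closest current name, or None."""
    suggestions = {}
    for name in obsolete:
        # get_close_matches returns up to n candidates scoring >= cutoff,
        # best match first.
        matches = difflib.get_close_matches(name, current, n=1, cutoff=cutoff)
        suggestions[name] = matches[0] if matches else None
    return suggestions


# Hypothetical names, for illustration only.
obsolete = ["Exampleville  Park District", "Sampelton FPD", "QRZ"]
current = [
    "Exampleville Park District",
    "Sampleton FPD",
    "Anytown Library District",
]

suggestions = suggest_renames(obsolete, current)
for old, new in suggestions.items():
    print(f"{old!r} -> {new!r}")
```

Names that were abbreviated or fully renamed will not score above the cutoff and come back as `None`, so those still need a manual lookup.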

Will ping you when I'm ready for you!

hancush commented 4 years ago

Related to #487

hancush commented 4 years ago

Also cleaned up ISBE and ISBE/All Elementary/High School Employees. ISBE corresponds to ISBE itself; ISBE/All Elementary/High School Employees covers school employees. The latter needs a source file for 2017.

deraj1013 commented 4 years ago

Sounds good. I can get the files whenever you're ready.