jsoma / data-studio-projects


Tracking Government Accountability Office Reports [Project] #47

Closed maijaliisa closed 6 years ago

maijaliisa commented 7 years ago

Please complete all of the following sections, or a robot will spookily dance around your issue! A completed version of this template can be found at https://github.com/jsoma/data-studio-projects/issues/1

Pitch

Track GAO reports over the past decade to see if there has been a change in what the 'Congressional Watchdog' group is investigating.

Summary

The US Government Accountability Office (GAO) audits, evaluates and investigates how the federal government spends taxpayer money. While all of the declassified GAO reports are available as PDFs online, summary information about the types of cases is not available. Reports are initiated at the request of members of Congress (mostly committee chairs and ranking minority members). Exploring the types of requests filed with GAO, along with which requests end in recommendations, might provide some interesting new looks at Congress throughout the years. For the purpose of this project, I will only look at Legislative Branch/Agency reports.

Details

Possible headline(s): GAO reports on Legislative Branch show XXXXXX (something found from scraping and analyzing)

Data set(s): https://www.gao.gov/browse/agency/Legislative

Code repository: https://github.com/maijaliisa/studio-projects/blob/master/code/GAO_Reports

Possible problems/fears/questions:

-So many...mostly I am not sure how I am going to visualize the data later on. But I think it could be an interesting story if I am able to create a dataset that includes the following information:
  -What agency GAO was auditing
  -Whether they gave a recommendation or not
  -Potentially, a scraped summary of the report
  -Date the report was filed

-This could show change over time (year-over-year) to see what areas are most looked into by GAO

-I am also worried about being able to tag everything appropriately...but I think I will be able to do this with OpenRefine and/or RegEx
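A minimal sketch of how the dataset and the regex tagging described above might fit together. Everything in it is an assumption for illustration: the `Report` field names, the `TOPIC_PATTERNS` keywords, and the sample rows are made up and do not come from GAO's site or from the project repository.

```python
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Hypothetical topic patterns; a real tag list would come from reading the reports.
TOPIC_PATTERNS = {
    "Cybersecurity": re.compile(r"\b(cyber\w*|information security)\b", re.IGNORECASE),
    "Government Operations": re.compile(r"\b(procurement|workforce|management)\b", re.IGNORECASE),
}


@dataclass
class Report:
    # Field names are hypothetical, chosen to mirror the fields listed above.
    agency: str               # agency GAO was auditing
    has_recommendation: bool  # whether the report made a recommendation
    summary: str              # scraped summary text
    filed: date               # date the report was filed


def tag_topics(text):
    """Return every topic whose pattern matches the text, in sorted order."""
    return sorted(t for t, pattern in TOPIC_PATTERNS.items() if pattern.search(text))


def topics_per_year(reports):
    """Count tagged topics per (year, topic) pair for year-over-year comparison."""
    return Counter((r.filed.year, t) for r in reports for t in tag_topics(r.summary))


# Made-up sample rows, only to show the shape of the data.
sample = [
    Report("Library of Congress", True, "Weak information security controls", date(2015, 3, 1)),
    Report("Library of Congress", False, "Cybersecurity workforce gaps", date(2016, 5, 9)),
    Report("Congressional Budget Office", True, "Cyber incident response", date(2016, 8, 2)),
]

counts = topics_per_year(sample)
# Year-over-year change in reports tagged Cybersecurity.
print(counts[(2016, "Cybersecurity")] - counts[(2015, "Cybersecurity")])
```

Keyword patterns like these will miss reports and produce false matches, so a manual pass (for example in OpenRefine) would probably still be needed on top.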

Work so far

-I've determined that I have to narrow down to just one part of government (since GAO looks at all governmental agencies). I will look at the Legislative Branch to see if there are any interesting findings about investigating Congress and Congressional Agencies.

-I've also done some preliminary research into GAO...its history, purpose, and a bit about the data they collect. This will help create a more robust story and contextualize what I end up finding in the dataset.

I would like to do a very simplified version of this:

[Image: screen shot 2017-07-25 at 1 03 09 am]

where lines represent the types of reports (audit, recommendation, general report). This first graphic could show change over time, with specific delineation based on the Congressional term.

It might also be interesting to chart them based on the four Support Agencies: the Library of Congress, the Congressional Budget Office, the Government Accountability Office (formerly the General Accounting Office), and the Government Publishing Office (formerly the Government Printing Office).
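To delineate a chart by Congressional term, each report date has to be mapped to a Congress. A rough sketch, assuming terms begin on January 3 of odd-numbered years (the rule since 1935) and ignoring the exact noon changeover; the function name `congress_number` is made up for illustration:

```python
from datetime import date


def congress_number(d):
    """Return the number of the Congress whose term covers date d.

    Assumes terms start on January 3 of odd-numbered years, which has been
    the rule since 1935; earlier dates would need different handling.
    """
    year = d.year
    # January 1 and 2 still belong to the Congress that was sitting the prior year.
    if (d.month, d.day) < (1, 3):
        year -= 1
    # The 1st Congress began in 1789 and each Congress lasts two years.
    return (year - 1789) // 2 + 1


print(congress_number(date(2017, 7, 25)))  # 115
```

Grouping reports by this number rather than by calendar year would let the lines on the chart break at each new Congress.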

Checklist

This checklist must be completed before you submit your draft.

wonderfulexperience commented 7 years ago

I think a simple chart would actually be advantageous. Too much different information would make it harder to see underlying trends. A simple story like "Congress cares less about military spending than ever before" would be an interesting headline.

SYChJung commented 7 years ago

Allocation of budget is a major role of the government. I think this project could be useful in making sure the government is properly doing its job.

maijaliisa commented 7 years ago

I spent several days trying to figure out how to appropriately scrape all of the open issues...and realized it will take a lot longer than I anticipated. I've decided to narrow down just to open PRIORITY RECOMMENDATIONS. I've determined open priority areas based on most frequent topics and most frequent agencies with open recommendations.
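For reference, the frequency ranking described above can be sketched with `collections.Counter`. The `(agency, topic)` pairs below are placeholders, not real GAO data, and the variable names are assumptions:

```python
from collections import Counter

# Made-up (agency, topic) pairs standing in for scraped open priority recommendations.
open_recs = [
    ("Agency A", "Cybersecurity"),
    ("Agency A", "Government Operations"),
    ("Agency B", "Cybersecurity"),
    ("Agency A", "Cybersecurity"),
]

# Tally open recommendations per agency and per topic.
by_agency = Counter(agency for agency, _ in open_recs)
by_topic = Counter(topic for _, topic in open_recs)

# most_common(n) returns the n highest counts as (value, count) pairs.
print(by_agency.most_common(1))
print(by_topic.most_common(1))
```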

Open Priorities By Agency.pdf Open Priorities By Topic.pdf

My next step is to break these areas down by date, because I want to chart which departments/topics have the longest-open recommendations and whether there has been any movement on them.
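One way this could work, as a sketch only: compute how many days each recommendation has been open as of a reference date and sort by that. The record labels, dates, and the `days_open` helper are all hypothetical.

```python
from datetime import date


def days_open(issued, as_of):
    """Number of days a recommendation has been open as of a given date."""
    return (as_of - issued).days


# Made-up records: (label, date the recommendation was issued).
open_recs = [
    ("rec-1", date(2014, 6, 1)),
    ("rec-2", date(2016, 1, 15)),
    ("rec-3", date(2012, 9, 30)),
]

as_of = date(2017, 7, 28)
# Longest-open recommendations first.
oldest_first = sorted(open_recs, key=lambda rec: days_open(rec[1], as_of), reverse=True)
print([label for label, _ in oldest_first])
```

Tracking "movement" would need a second snapshot of the same recommendations at a later date, which this sketch does not cover.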

maijaliisa commented 7 years ago

[Image: screen shot 2017-07-28 at 9 16 36 am]

maijaliisa commented 7 years ago

[Image: screen shot 2017-07-28 at 9 17 38 am]

maijaliisa commented 7 years ago

https://github.com/maijaliisa/studio-projects/tree/master/code/GOA_Report

maijaliisa commented 7 years ago

Another update on graphics: [Images: totaltrue, openpriorityitemcount]

sarahslo commented 7 years ago

so i really like the treemap. i think you capture the story very quickly there. there is nothing in the judicial branch? if so, remove it from the key. also, i'd align the dark red of congress on the bottom. that way the color will hang together more and we'll see the data.

i don't understand the dot plot above. when i look at these charts i want to understand them by looking. i should not have to read through the explanation...what should the label be on that chart? what does 100 stand for? i'm curious about the data.

maijaliisa commented 7 years ago

Update

https://maijaliisa.github.io/studio-projects/a-very-cool-project/GitHubFirstGAO/ 

Content

Any changes in direction or topic?

-I am focusing on Government Operations and Cybersecurity topics because they have the most open priority items 

Problems/Questions

Checklist