MuckRock / muckrock

MuckRock's source code - Please report bugs, issues and feature requests to info@muckrock.com
https://www.muckrock.com
GNU Affero General Public License v3.0

Feature request: Tagging of agencies/jurisdictions within tasks #1443

Open red-bin opened 6 years ago

red-bin commented 6 years ago

It would be extremely useful to be able to tag an agency with its agency class while creating it in the "Tasks" tab. For example, I would like to be able to tag an agency with "IT Department" or "Police Department" during the creation of that agency. This would be massively helpful in cases where departments in different states use different names for the same type of agency (e.g., "Department of Innovation and Technology" would get the "IT Department" tag).

Down the line, this would be very useful for submitting bulk requests across many states or cities.

morisy commented 6 years ago

This is a good suggestion; we currently offer the following agency types, which have not been updated in a long time and which I'm not super satisfied with. Would be very curious if there's any sort of standardized way that other groups have used to define agency types we could build off of rather than our own scattershot method.

red-bin commented 6 years ago

This manual looks like exactly what you're looking for. Check out page 31: there's a list of classifiers that the census itself uses. They probably release the full list of agencies somewhere. I'll look around.

https://www2.census.gov/govs/pubs/classification/2006_classification_manual.pdf


red-bin commented 6 years ago

Hm.. That classification manual isn't 100% comprehensive - for example, there isn't a mention of IT in any directly explicit way (maybe that'll change in 2020?).

https://www2.census.gov/programs-surveys/apes/datasets/2016/annual-apes/ has a list of all government agencies that responded to the US census in 2016, along with classification codes from that manual. The files aren't really that human readable, but there are 90k agencies in the list. Probably not super useful for MuckRock in that form, though, since no contact information is included.

red-bin commented 6 years ago

Also, if the names are similar (with the help of some string distance comparisons), it might be easy enough to just do a 1:1 match between MuckRock's agency names and the names in that file, extract the census code for each match, and use that code's alias for tagging.
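A minimal sketch of that matching idea, assuming the census file has already been loaded into `(name, code)` pairs. The function names, the sample rows, the placeholder codes, and the 0.8 cutoff are all hypothetical; `difflib` from the standard library stands in for whatever string distance measure ends up being used.

```python
import difflib


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def match_census_code(agency_name, census_rows, cutoff=0.8):
    """Return (census_name, code) for the closest census name, or None.

    census_rows is an iterable of (name, code) pairs; the cutoff is a guess
    that would need tuning against real data.
    """
    by_norm = {normalize(name): (name, code) for name, code in census_rows}
    hits = difflib.get_close_matches(
        normalize(agency_name), list(by_norm), n=1, cutoff=cutoff
    )
    return by_norm[hits[0]] if hits else None


# Hypothetical rows; real names and codes would come from the census files.
census_rows = [
    ("CHICAGO POLICE DEPARTMENT", "CODE_A"),
    ("SPRINGFIELD PUBLIC LIBRARY", "CODE_B"),
]

print(match_census_code("Chicago Police Dept.", census_rows))
print(match_census_code("Water Reclamation District", census_rows))
```

Anything that falls below the cutoff returns `None`, so those agencies could be queued for manual tagging rather than guessed at.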

morisy commented 6 years ago

Oh, super useful. Yeah, having the census list seems like a good starting point. I'm pretty tied up the next couple of weeks but will try to dig into it; if you (or someone else) want to take a look and see if it's feasible to do that matching, please do. Given how out of date our matching data is, I'd be willing to scrap the categories we have now in favor of moving to this. Even if the annual-apes survey isn't useful, I think there are a few broad filters we could use that would dump many agencies into the correct categories.

Depending on what data is in the survey, there might also be some crossover with #1209.

One thing we would need to think about with tagging is how to handle jurisdictions that centralize the handling of requests, e.g. where requests for information on animal control go through city hall rather than animal control. Some of the original thinking was that you would then tag that city hall with "animal control," but I'm starting to think that we should have a field on agencies allowing them to "delegate" handling their requests to another agency.

red-bin commented 6 years ago

Yep - will let you know how the matching work goes.

For a refined and somewhat utilized spec, this looks pretty useful: http://opencivicdata.readthedocs.io/en/latest/data/index.html, though it's mostly used for bills and legislative staff, rather than something general purpose similar to what that census doc tries to solve. A lot of its work looks like it died down back in 2014 and a lot of its APIs are no longer available. That said, it might still be useful and worthwhile to reach out to them on their slack and possibly talk about extending their schema with a new "OCDEP" to be more accommodating to FOIA.

Here's an example of its use: https://ocd.datamade.us/organizations/ .

Another route would be to adopt some of what Wikipedia does. For example, check out the Wikidata entry for the Chicago Police Department - https://www.wikidata.org/wiki/Q1340186. In particular, the "Instance of" section uses a hierarchical classification system, which includes government agencies. Some clever scraping there might lead to a much fuller classification system.

WRT jurisdictions that centralize request handling... fun problem. I'm almost positive an explicit field for where to proxy requests will be needed, unfortunately. Thankfully, it can be generalized with many departments (eg, where all departments within a city are proxied through a city clerk). I think the only addition would be extra templated request language along the lines of, "Kindly route this request to X agency". I'll put some more thought into that.
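Here is a rough sketch of what an explicit proxy field could look like, written as plain Python rather than against MuckRock's actual models. The `Agency` class, the `delegate` field, and both helper functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agency:
    name: str
    # Hypothetical field: the agency that actually receives this agency's requests.
    delegate: Optional["Agency"] = None


def resolve_handler(agency):
    """Follow the delegate chain to the agency that should receive the request."""
    seen = set()
    current = agency
    while current.delegate is not None:
        if id(current) in seen:
            raise ValueError("delegation cycle involving " + current.name)
        seen.add(id(current))
        current = current.delegate
    return current


def routing_note(agency):
    """Extra templated language to add when the request is sent to a proxy."""
    if resolve_handler(agency) is agency:
        return ""
    return f"Kindly route this request to {agency.name}."


city_clerk = Agency("City Clerk")
animal_control = Agency("Animal Control", delegate=city_clerk)

print(resolve_handler(animal_control).name)
print(routing_note(animal_control))
```

Because many departments can point at the same clerk, the "all departments within a city" case is just several agencies sharing one `delegate`. The cycle check guards against two agencies accidentally delegating to each other.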

red-bin commented 6 years ago

The Wikidata search kind of worked: https://pastebin.com/jyzfj7MM

https://query.wikidata.org

SELECT ?government_agency ?instance_of ?instance_ofLabel WHERE {
  # Q327333 is "government agency"; P31 is "instance of".
  ?government_agency wdt:P31 wd:Q327333.
  # Also pull every other class each agency is an instance of.
  OPTIONAL { ?government_agency wdt:P31 ?instance_of. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100000
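
If the results of a query like this are downloaded in the standard SPARQL JSON results format, the rows could be grouped into one list of classes per agency. This is only a sketch: `group_classes` is a hypothetical helper, and the sample rows and labels below are invented for illustration rather than copied from real query output.

```python
from collections import defaultdict


def group_classes(payload):
    """Map each agency URI to the sorted labels of the classes it is an instance of."""
    classes = defaultdict(set)
    for row in payload["results"]["bindings"]:
        agency = row["government_agency"]["value"]
        classes[agency]  # keep agencies even when the OPTIONAL part is unbound
        label = row.get("instance_ofLabel", {}).get("value")
        if label:
            classes[agency].add(label)
    return {agency: sorted(labels) for agency, labels in classes.items()}


# Invented sample in the SPARQL JSON results shape (not real query output).
sample = {
    "results": {
        "bindings": [
            {
                "government_agency": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1340186"},
                "instance_ofLabel": {"type": "literal", "value": "government agency"},
            },
            {
                "government_agency": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1340186"},
                "instance_ofLabel": {"type": "literal", "value": "police department"},
            },
        ]
    }
}

print(group_classes(sample))
```

The resulting labels could then serve as candidate tags, with the census codes filling in where Wikidata has no entry.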