Chicago / food-inspections-evaluation

This repository contains the code to generate predictions of critical violations at food establishments in Chicago. It also contains the results of an evaluation of the effectiveness of those predictions.
http://chicago.github.io/food-inspections-evaluation/

Inspections are cyclic; how does prioritizing them help? #77

Closed orborde closed 9 years ago

orborde commented 9 years ago

This is a "big picture" question about how this model would be applied to the existing inspections workflow.

From reading the whitepaper, it sounds like each establishment gets inspected once, and no more than once, by the city per inspection time window, with the time window varying by risk class. So, for example, each Risk 2 establishment sees one, and only one, inspection per year. If that's the case, how does moving a high-violation-risk establishment earlier in the inspection cycle actually help, when it also means that there will be a whole year gap until it gets inspected again?

To make this more concrete, imagine we have a Risk 2 business called X. X is poorly managed and therefore very frequently found in violation. Assuming I understood the white paper correctly, X will be inspected once a year. Imagine that we're starting a year of inspections in January.

If a predictive model correctly identifies X as highest risk, then you might schedule the X inspection for January, right at the beginning of the year, and you'd find a violation immediately. Great! But now, what about the rest of the year? What if X develops a problem in February? Sure, next year, X will be at the top of the schedule and get inspected in January again, but what about the intervening 11 months (Feb-Dec) where it isn't getting inspected because it's already been done this cycle? The early detection in one cycle is cancelled out by the delay in detection over the rest of the cycle.

Since inspections are on a rolling cycle like this, the existing whitepaper analysis showing that violating businesses will be moved to the front of the cycle may be less meaningful for the intended use case because early detections will, in statistical aggregate, be balanced out by violations that are left to fester due to the cycle time.
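The cancellation argument can be sketched numerically. This is a minimal illustration under simplifying assumptions that are not taken from the whitepaper: exactly one inspection per 12-month cycle, always in the same month, and a violation that is equally likely to begin in any month. The helper name `mean_detection_delay` is hypothetical.

```python
def mean_detection_delay(inspection_month: int, months_in_cycle: int = 12) -> float:
    """Average months a new violation waits before the next scheduled inspection.

    Assumes (hypothetically) one inspection per cycle, always in the same
    month, and a violation onset that is equally likely in every month.
    """
    delays = [
        # Months from onset until the next time the inspection month comes around.
        (inspection_month - onset_month) % months_in_cycle
        for onset_month in range(months_in_cycle)
    ]
    return sum(delays) / months_in_cycle


# Compare inspecting in January (month 0) with inspecting in July (month 6).
january = mean_detection_delay(0)
july = mean_detection_delay(6)
print(january, july)
```

Under these assumptions the average wait is the same whichever month is chosen, which is the concern being raised: with a strict once-per-cycle rule, moving the inspection earlier would not by itself shorten the average time a new violation goes undetected.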

In fact, it should be noted that this is not a problem with the modeling work; it seems to be a wider problem with the design of the city's inspection regime. Ideally, you'd concentrate more inspection resources on high-risk businesses by inspecting them more often, but the cyclical inspection regime seems to prevent that. I'm not sure how you could achieve real improvements in results (fewer businesses operating with undetected violations) without this concentration.

Am I simply missing something important about how the inspection regime works?

tomschenkjr commented 9 years ago

I think you have a good grasp on the protocol, but yes, there are some loose ends to cover on how the inspection regime works. The primary intent of this repository is to help sort canvas inspections, which is one component of the food inspection manager's job. The predicted values also provide guidance any other time the manager is potentially interested in re-inspecting a restaurant.

The food inspection manager is known to re-inspect restaurants if they've proven to be problematic in the past. By prioritizing riskier restaurants earlier, we can reduce exposure to the riskier restaurants and check them off our list of restaurants that need to be inspected at least once (which has been a problem in the past), but that doesn't preclude them from follow-up visits.

The food inspection manager has discretion about this, but she'll also be able to use the predicted probabilities to help her prioritize follow-up visits. Earlier identification of failures will give ample time for follow-up.

One important piece of information for the manager is resident complaints, which are reported through 311. However, under-reporting has always been an issue, as people don't know how or don't care to report to the city. This is where our Foodborne Chicago program comes in: it is designed to help combat under-reporting through social media text mining. Reports through 311 (or Foodborne) can then be mediated by looking at the predicted probabilities to help prioritize the follow-ups.
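As a rough sketch of how complaints and predicted probabilities might be combined, the snippet below ranks complained-about establishments by predicted risk. Everything in it is hypothetical: the function name `rank_follow_ups`, the establishment names, the probabilities, and the `capacity` limit are made up for illustration and are not part of this repository or of the city's process.

```python
def rank_follow_ups(complaints, predicted_probability, capacity):
    """Return up to `capacity` complained-about establishments, riskiest first.

    complaints: establishment ids, one entry per complaint (duplicates allowed).
    predicted_probability: maps establishment id to a predicted violation probability.
    Establishments without a prediction are treated as 0.0 (an assumption).
    """
    unique_establishments = set(complaints)
    ranked = sorted(
        unique_establishments,
        # Highest probability first; ties broken by id so the order is stable.
        key=lambda est: (-predicted_probability.get(est, 0.0), est),
    )
    return ranked[:capacity]


# Hypothetical data: three establishments with complaints, room for two visits.
complaints = ["cafe_a", "diner_b", "cafe_a", "grill_c"]
predicted_probability = {"cafe_a": 0.12, "diner_b": 0.47, "grill_c": 0.31}
follow_ups = rank_follow_ups(complaints, predicted_probability, capacity=2)
print(follow_ups)
```

The manager keeps discretion in practice; a ranking like this would only be one input to that decision.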

Is it perfect? No, because the scenario you painted could happen. I don't think it will "cancel out", though; it's a scenario that can happen from time to time, and one we may be able to combat through re-inspections and resident complaints. That is, even if it does happen, there are mechanisms to deal with it. The current inspection protocol is designed to cover a minimum base, and then allow follow-ups on top of it as resources and time allow.
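To illustrate why follow-ups on top of the minimum base matter, here is a small sketch under simplifying assumptions that are not the city's actual protocol: a 12-month cycle, inspections in fixed months, and a violation equally likely to begin in any month. The helper name `mean_wait_with_follow_ups` is hypothetical.

```python
def mean_wait_with_follow_ups(inspection_months, months_in_cycle=12):
    """Average months a new violation waits until the nearest upcoming inspection.

    inspection_months: months (0-based) in which the establishment is visited
    each cycle, including any follow-up visits. Onset is assumed uniform.
    """
    waits = []
    for onset_month in range(months_in_cycle):
        # Wait until whichever scheduled visit comes next after the onset.
        wait = min((m - onset_month) % months_in_cycle for m in inspection_months)
        waits.append(wait)
    return sum(waits) / months_in_cycle


# Base inspection in January only, versus January plus a July follow-up.
base_only = mean_wait_with_follow_ups([0])
with_follow_up = mean_wait_with_follow_ups([0, 6])
print(base_only, with_follow_up)
```

In this toy setting a single mid-cycle follow-up more than halves the average wait, which is why concentrating discretionary follow-ups on predicted-high-risk establishments can help even when the base cycle is fixed.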

In time, it will be interesting to see whether our ordinances and protocols begin to evolve as new capabilities like this predictive model become available. Already, it's been interesting that Risk 2's are now considered in the same bucket as Risk 1's, because it's possible for the former to be riskier than the latter. As long as our predictions are accurate, we'll be able to fit into a variety of inspection regimes.

orborde commented 9 years ago

Ah, okay. It sounds like there is a pool of "discretionary" inspection resources beyond those provided for the cyclic (canvas, IIUC) inspections in the form of "follow-up" inspections, and that those can be concentrated on predicted-high-risk businesses. And that (per the Tribune article) the current inspection resources are not sufficient to cover even the canvas schedule, meaning that better prioritization of canvas inspections allows better choices about minimizing the impact of the shortfall.