opensandiego / mealscount-backend

Optimizing a free-meal reimbursement program for K-12 schools
MIT License

Update algorithm to allow for input of entire statewide DB #12

Closed nikolajbaer closed 5 years ago

nikolajbaer commented 5 years ago

The CUPC data for the entire state should be usable for generating optimizations, and the time required to optimize under algo v2 is currently not that large.

Ideally we can update the algorithm so we can run all passes at once and generate an output database of summary statistics (e.g. optimization % of students now eligible vs. baseline, increase in funding). We could then make this available in a statically generated web page that we can direct people to as our partner engages with nutrition directors and CFOs of interested districts.

I think we will need to be able to run the entire dataset and save the results to a data file (maybe a big JSON file) that we can use in an HTML display.
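As a rough sketch of what that data file could look like, the snippet below builds one summary record per district and dumps the list to JSON. The function names (`summarize`, `write_results`), the field names, and the example numbers are all hypothetical and only meant to illustrate the shape of the output, not an agreed format.

```python
import json
from pathlib import Path


def summarize(district, baseline_eligible, optimized_eligible, total_students):
    """Build one summary record for a district.

    All field names are hypothetical; they only illustrate the kind of
    statistics mentioned above (% eligible vs. baseline).
    """
    baseline_pct = round(100 * baseline_eligible / total_students, 1)
    optimized_pct = round(100 * optimized_eligible / total_students, 1)
    return {
        "district": district,
        "total_students": total_students,
        "baseline_pct_eligible": baseline_pct,
        "optimized_pct_eligible": optimized_pct,
        "gain_pct_points": round(optimized_pct - baseline_pct, 1),
    }


def write_results(records, path):
    """Write all district records to a single JSON file."""
    Path(path).write_text(json.dumps({"districts": records}, indent=2))


# Made-up example numbers, not real CUPC data.
records = [summarize("Example Unified", 1200, 1800, 2400)]
```

Calling `write_results(records, "results.json")` would then produce one file that a static HTML page could load directly.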

The other part of this would be a way to include already existing optimizations (some schools have optimized already, especially those with smaller sets, and some districts are small enough for an exhaustive calculation to find an optimal result). We would want to use these as our "baseline" rather than something more naive (like a school-to-group association).
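For the small-district case, an exhaustive calculation could look something like the sketch below: enumerate every way of splitting the schools into groups and keep the best-scoring one. The scoring rule is an assumption that does not come from this thread: a group counts only if its identified-student share reaches a 40% threshold, and coverage is 1.6 times that share, capped at 100%. The dictionary keys (`name`, `enrolled`, `identified`) and the sample schools are made up as well.

```python
def partitions(items):
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Either `first` forms its own group ...
        yield [[first]] + smaller
        # ... or it joins each existing group in turn.
        for i, group in enumerate(smaller):
            yield smaller[:i] + [[first] + group] + smaller[i + 1:]


def covered_students(group, threshold=0.40, multiplier=1.6):
    """Assumed scoring rule (not from the thread): students covered in a group."""
    enrolled = sum(s["enrolled"] for s in group)
    identified = sum(s["identified"] for s in group)
    if enrolled == 0:
        return 0.0
    share = identified / enrolled
    if share < threshold:
        return 0.0
    return min(1.0, multiplier * share) * enrolled


def total_covered(grouping):
    """Sum the assumed coverage over all groups in a grouping."""
    return sum(covered_students(group) for group in grouping)


def best_grouping(schools):
    """Try every grouping and return the one with the highest coverage."""
    return max(partitions(list(schools)), key=total_covered)


# Made-up schools for illustration only.
schools = [
    {"name": "A", "enrolled": 100, "identified": 70},
    {"name": "B", "enrolled": 100, "identified": 30},
    {"name": "C", "enrolled": 100, "identified": 10},
]
best = best_grouping(schools)
```

The number of groupings grows very quickly with the number of schools (5 for three schools, 15 for four, 52 for five), which is why this is only practical for the smallest districts.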

nikolajbaer commented 5 years ago

Statewide DB for 2017/2018 is here:

https://www.cde.ca.gov/ds/sd/sd/filescupc.asp

nikolajbaer commented 5 years ago

The "validate" branch now has a new, simplified setup where you can run the "mc algo v2" (the binning strategy), as well as the more naive "one-group" and "one-to-one" strategies, across the entire CALPADS dataset (2017/2018 as a CSV is bundled in the branch).

In addition, you can run a comparison by specifying a "strategy" and a "baseline". Right now this is a command-line interface; the end goal is to move to a more interactive web format.
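To make the strategy/baseline idea concrete, here is a minimal sketch of the two naive strategies and a comparison between them. It is not the code on the "validate" branch: the function names, the `STRATEGIES` registry, the school fields, and the scoring rule (a 40% threshold with a 1.6 multiplier capped at 100%) are all assumptions for illustration.

```python
def one_group(schools):
    """Naive strategy: put every school in a single group."""
    return [list(schools)]


def one_to_one(schools):
    """Naive strategy: give every school its own group."""
    return [[school] for school in schools]


# Hypothetical registry so a strategy can be picked by name.
STRATEGIES = {"one-group": one_group, "one-to-one": one_to_one}


def covered_students(group, threshold=0.40, multiplier=1.6):
    """Assumed scoring rule (not from the thread): students covered in a group."""
    enrolled = sum(s["enrolled"] for s in group)
    identified = sum(s["identified"] for s in group)
    if enrolled == 0:
        return 0.0
    share = identified / enrolled
    if share < threshold:
        return 0.0
    return min(1.0, multiplier * share) * enrolled


def compare(schools, strategy, baseline):
    """Score the named strategy and baseline and report the difference."""
    result = {}
    for role, name in (("strategy", strategy), ("baseline", baseline)):
        grouping = STRATEGIES[name](schools)
        result[role] = sum(covered_students(group) for group in grouping)
    result["gain"] = result["strategy"] - result["baseline"]
    return result


# Made-up schools for illustration only.
schools = [
    {"name": "A", "enrolled": 100, "identified": 70},
    {"name": "B", "enrolled": 100, "identified": 50},
]
report = compare(schools, strategy="one-group", baseline="one-to-one")
print(report)
```

Keeping strategies behind a name lookup like this means a binning strategy or a previously computed optimization could be registered the same way and used as either side of the comparison.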