opensandiego / mealscount-backend

Optimizing a free-meal reimbursement program for K-12 schools
MIT License

Improve Description of Problem (and maybe additional parameters) #77

Open themagicbean opened 3 years ago

themagicbean commented 3 years ago

I think we could improve the description of the problem and the goals of the algorithms. I have read the readme, but I should disclaim that I have only skimmed the actual algorithms, so if some of this is already addressed, please forgive that (but we could still improve the description in the readme for newcomers IMHO):

https://frac.org/wp-content/uploads/making-cep-work-with-lower-isps.pdf states that: (1) the ISP of a group (multiple schools) is a weighted average of the schools' ISPs, not just a simple average (and a group can be >2 schools); (2) reimbursement is on a sliding scale between ISP 40 and 65 percent, based on 1.6 × ISP (which reaches 100% at an ISP of 62.5%); it also implies (see the calculation under "grouping" on page 3) that (3) schools below 40 can be pulled above 40 by grouping.
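To make points (1) and (2) concrete, here is a minimal Python sketch. It is not taken from this repo: the names `group_isp` and `free_rate_share` are hypothetical, and the model is deliberately simplified (it only computes the share of meals claimed at the free rate, and ignores actual dollar rates and any other eligibility rules).

```python
from typing import List, Tuple

CEP_THRESHOLD = 0.40   # minimum ISP for eligibility, as described above
CEP_MULTIPLIER = 1.6   # multiplier applied to the ISP


def group_isp(schools: List[Tuple[int, int]]) -> float:
    """Enrollment-weighted ISP of a group.

    Each school is (enrollment, identified_students). Summing both columns
    first is the same as weighting each school's ISP by its enrollment.
    """
    total_enrolled = sum(enrolled for enrolled, _ in schools)
    total_identified = sum(identified for _, identified in schools)
    return total_identified / total_enrolled


def free_rate_share(isp: float) -> float:
    """Share of meals claimed at the free rate under the simplified model."""
    if isp < CEP_THRESHOLD:
        return 0.0  # assumption: below the threshold nothing is claimed via CEP
    return min(CEP_MULTIPLIER * isp, 1.0)  # capped at 100% (ISP >= 62.5%)


# A 100-student school at 30% and a 300-student school at 50%:
# the simple average of the ISPs is 40%, but the weighted group ISP is 45%.
example_group = [(100, 30), (300, 150)]
print(group_isp(example_group))
print(free_rate_share(group_isp(example_group)))
```

With unequal school sizes the weighted and simple averages can land on different sides of a threshold, which is why the weighting matters for grouping decisions.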

What about the effect of pulling a below-40 school above 40? (The following math assumes all schools are the same size, to keep the weighted averages simple in the examples. Consider every school to have 100 students and disregard the percent signs if you please.) Would that not yield greater gains than pulling 40%<x<=62.5% schools above 62.5? A 40% school pulled to 65% gains 36 paid meals (from 64 to 100). A below-40% school pulled into a group averaging just above 40% gains 64 paid meals (from 0 to 64). In the latter scenario, a >62.5% school pulled all the way down to a group at 40% would lose 36 meals, for a net gain of 28 meals across the group. This is worse, but so long as the group ISP average is 45% or above, the net gain is 36 meals or more, based on the assumption that the previously higher school was at >62.5% and therefore 100% free. If a 43% school and a 37% school combine, the gain is 64 - 4.8, or 59.2. The only better case would be 65.5% and 37%, where the 64 meals are all gain.
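The arithmetic above can be checked with a short Python sketch. This is only an illustration under the same simplification (100 students per school, meals counted rather than dollars); `free_rate_meals` and `grouping_gain` are hypothetical names, not functions from this repo.

```python
from typing import List, Tuple


def free_rate_share(isp: float) -> float:
    """Simplified model: 0 below 40% ISP, otherwise 1.6 x ISP capped at 100%."""
    if isp < 0.40:
        return 0.0
    return min(1.6 * isp, 1.0)


def free_rate_meals(enrolled: int, identified: float) -> float:
    """Meals claimed at the free rate, assuming one meal per enrolled student."""
    return enrolled * free_rate_share(identified / enrolled)


def grouping_gain(schools: List[Tuple[int, float]]) -> float:
    """Net change in free-rate meals when the schools are grouped.

    Each school is (enrollment, identified_students). The group ISP is the
    enrollment-weighted average, i.e. total identified / total enrolled.
    """
    standalone = sum(free_rate_meals(e, i) for e, i in schools)
    total_enrolled = sum(e for e, _ in schools)
    total_identified = sum(i for _, i in schools)
    grouped = free_rate_meals(total_enrolled, total_identified)
    return grouped - standalone


# 43% + 37%: the 37% school gains 64 meals, the 43% school loses 4.8.
print(round(grouping_gain([(100, 37), (100, 43)]), 1))

# A >62.5% school pulled down to a 40% group: gain 64, lose 36.
print(round(grouping_gain([(100, 15), (100, 65)]), 1))
```

Under these assumptions the two calls give roughly 59.2 and 28 meals of net gain, matching the examples in the paragraph above.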

Hence: Should we not also look at algorithms focused on the below- (and especially near-) 40 and how to "bring them in" as well? At least, we don't need to pull a school to >62.5 or ignore it. Any increase is beneficial. The question is one of magnitude. (That said, there could be cooperation problems among the schools, essentially giving one budget to another. I think larger bodies like School Districts and County Offices could resolve those, though.)

(Also, at a glance, it seems like any combination of schools between 40% and 62.5% with resulting average ISP in that range would have no net effect. I.e., merging 50% and 40% to result in 45% (assuming same size for weighting) just redistributes and does not result in gain. So those cases could be eliminated.)
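A short derivation of why such merges are neutral, and it does not need the equal-size assumption. The symbols are mine, not from the readme: $n_i$ is the enrollment of school $i$, $p_i$ its ISP, and $P$ the weighted group ISP. If every $p_i$ and $P$ lie between 40% and 62.5%, neither the threshold nor the 100% cap applies, so:

$$
P = \frac{\sum_i n_i p_i}{\sum_i n_i}
\quad\Longrightarrow\quad
1.6\,P \sum_i n_i \;=\; 1.6 \sum_i n_i p_i \;=\; \sum_i n_i \,(1.6\,p_i)
$$

The left-hand side is the number of free-rate meals for the group and the right-hand side is the sum over the schools taken separately, so the two are equal. A gain or loss only appears when a member school is below 40% or above 62.5% on its own.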

I think this is particularly true based on a survey of the sample data. Only some of the poorest counties (e.g., Fresno) have several schools in the 60s and 70s and could realistically target maximizing 100% free. Scanning LA, I didn't see a lot of schools in the 60s, but a lot more fall into the mid range. So targeting those may be better.

In the spirit of this, I think we could describe the regulatory and algorithmic problem more clearly and maybe broaden the horizon for algorithmic goals?

(Thank you for working on this and I hope to be more useful in the future.)