Actually, I misspoke. We can't compute an average access value unless we know all of the travel times, because of the partial accessibility problem. When a location is completely unreachable for some portion of the time window, we can't generate an average access value from the average travel time, because the average travel time can't include those completely unreachable minutes (if you include infinity in an average, you won't like the results).
Oh wait, though. I'll bet we can just store a separate array, parallel to the surface, recording what proportion of the time each destination is reachable, and downweight by that as needed.

And we need to do that anyway, because our averages are currently wrong when a destination is unreachable for part of the time window. We currently divide by surface.nMinutes, but that's only valid if a destination was reachable 100% of the time.
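To make that concrete, here's a minimal sketch of the corrected average plus the parallel reachability fraction (the names `travelTimes` and `UNREACHABLE` are hypothetical, not the actual surface code):

```java
// Minimal sketch of the proposed fix, not the actual Analyst surface code.
// Assumption: travelTimes[minute] holds the travel time to one destination at
// each departure minute of the window, with UNREACHABLE marking minutes where
// the destination cannot be reached at all.
public class PartialAccessibility {
    static final int UNREACHABLE = Integer.MAX_VALUE;

    public static void main(String[] args) {
        // 5-minute window: unreachable in 2 of 5 minutes.
        int[] travelTimes = { 50, UNREACHABLE, 55, UNREACHABLE, 45 };

        int sum = 0, reachableMinutes = 0;
        for (int t : travelTimes) {
            if (t != UNREACHABLE) { sum += t; reachableMinutes++; }
        }

        // Wrong: dividing by the full window length (surface.nMinutes) silently
        // pretends the unreachable minutes contributed travel time.
        // Right: divide only by the minutes the destination was reachable, and
        // keep the fraction reachable in a parallel array for downweighting.
        double avgTravelTime = (double) sum / reachableMinutes;                    // 50.0
        double fractionReachable = (double) reachableMinutes / travelTimes.length; // 0.6

        System.out.printf("average travel time %.1f min, reachable %.0f%% of the time%n",
                avgTravelTime, fractionReachable * 100);
    }
}
```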
Hmm, I'm actually not sure this solves the accessibility problem, though. Consider a case where the average travel time is 59 minutes and the destination is completely unreachable 20% of the time. Is 80% of the jobs at this destination the same as the average accessibility you'd get by computing it for every minute? It can't be, because some of the times when the destination was reachable are over 60 minutes, so they would contribute 0 to the true average accessibility calculation.

Of course the expectation may still be correct, because there is likely another destination that is sometimes reachable within 60 minutes but is on average reachable in 62, and that one won't be included at all if we compute average travel time and use it to derive average accessibility, weighting or no weighting.
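Here's a worked version of that 59-minute case with synthetic travel times, just as a sketch (a 10-minute window, one destination with 100,000 jobs):

```java
// Sketch of the 59-minute / 20%-unreachable counterexample with made-up
// numbers: the downweighted average-travel-time accessibility disagrees with
// the true minute-by-minute average accessibility.
public class DownweightCounterexample {
    static final int UNREACHABLE = Integer.MAX_VALUE;

    public static void main(String[] args) {
        int jobs = 100_000, cutoff = 60;
        // 10-minute window: unreachable 20% of the time, the reachable times
        // average exactly 59 minutes, but three of them exceed the cutoff.
        int[] tt = { 40, 45, 50, 55, 58, 62, 70, 92, UNREACHABLE, UNREACHABLE };

        // True average accessibility: apply the cutoff at every minute.
        int minutesWithin = 0;
        for (int t : tt) if (t <= cutoff) minutesWithin++;
        double trueAvg = (double) minutesWithin / tt.length * jobs;  // 50,000

        // Downweighted approach: apply the cutoff to the average travel time,
        // weighted by the fraction of time the destination is reachable at all.
        int sum = 0, reachable = 0;
        for (int t : tt) if (t != UNREACHABLE) { sum += t; reachable++; }
        double avgTT = (double) sum / reachable;                     // 59.0
        double weight = (double) reachable / tt.length;              // 0.8
        double downweighted = avgTT <= cutoff ? weight * jobs : 0;   // 80,000

        System.out.printf("true average: %.0f, downweighted: %.0f%n",
                trueAvg, downweighted);
    }
}
```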
One other point is more epistemological and relates to what we're trying to measure. Even though we're reporting the average number of jobs accessible, that doesn't really capture the true picture. No individual cares about how many total jobs they can reach; they care about whether they can reach their own job(s), which are few relative to the total number of jobs in the city. Consider these three scenarios, each with 100,000 jobs per destination and a 60-minute cutoff:

1. A destination that is reachable within 60 minutes at every departure minute.
2. Two destinations, each reachable within 60 minutes half the time, at complementary times, so that at any given minute exactly one of them is reachable.
3. A destination reachable in 45 minutes half the time and in 90 minutes the other half.
If we compute average accessibility by computing accessibility at every minute, the first two scenarios both show that 100,000 jobs are reachable (and the third shows 50,000, since the destination is within the cutoff half the time). If we compute it using the average travel time, downweighted by the percent of time the destination is accessible at all, the first is still 100,000, the second is still 100,000 (0.5 * 100,000 + 0.5 * 100,000), and the third is 0 (because the average travel time is now 67.5 minutes).
Well, that kind of calls into question the whole idea of using average travel time to derive accessibility (which is also what we've been doing in Analyst up to this point; see opentripplanner/OpenTripPlanner#2148). Adding lines should never cause decreases in accessibility, but here it can: in the third scenario, the slow 90-minute service is exactly what pushes the average past the cutoff and zeroes out the measured accessibility. So I guess that's out.
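To see that failure concretely, here's a sketch using the third scenario's numbers (synthetic values, hypothetical helper):

```java
// Sketch: adding service can *decrease* accessibility under the
// downweighted-average method. 60-minute cutoff, synthetic numbers.
public class MonotonicityFailure {
    // Downweighted accessibility for one destination: apply the cutoff to the
    // average reachable travel time, weighted by the fraction of time reachable.
    static double downweighted(double avgTravelTime, double fractionReachable,
                               int jobs, int cutoff) {
        return avgTravelTime <= cutoff ? fractionReachable * jobs : 0;
    }

    public static void main(String[] args) {
        int jobs = 100_000, cutoff = 60;

        // Before: reachable in 45 minutes half the time, unreachable otherwise.
        double before = downweighted(45.0, 0.5, jobs, cutoff);  // 50,000

        // After adding a slow line: also reachable in 90 minutes the other half,
        // so the average rises to (45 + 90) / 2 = 67.5 and the destination is
        // now nominally reachable 100% of the time.
        double after = downweighted(67.5, 1.0, jobs, cutoff);   // 0

        System.out.printf("before: %.0f, after adding service: %.0f%n",
                before, after);
    }
}
```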
I'll have to think about this more when I'm not on an airplane.
Ah, what we should do is just store a weight per pixel for a single cutoff (e.g. 60 minutes) and then apply that to the grids as needed. This means that changing the cutoff requires calculating a new surface to get accessibility numbers (though the isochrones don't need one, so those jaw-dropping smooth animations will still work). The complexity of adding additional grids is then roughly O(1).
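Roughly what I mean, as a sketch with hypothetical names (not the actual implementation):

```java
// Sketch of the weight-per-pixel idea. travelTimes[pixel][minute] holds the
// travel time to each pixel at each departure minute; the weights are computed
// once per cutoff, then applied to any number of opportunity grids.
public class PixelWeights {
    static final int UNREACHABLE = Integer.MAX_VALUE;

    // Fraction of departure minutes each pixel is reachable within the cutoff.
    // Changing the cutoff means recomputing this surface.
    static double[] computeWeights(int[][] travelTimes, int cutoff) {
        double[] weights = new double[travelTimes.length];
        for (int p = 0; p < travelTimes.length; p++) {
            int within = 0;
            for (int t : travelTimes[p]) if (t <= cutoff) within++;
            weights[p] = (double) within / travelTimes[p].length;
        }
        return weights;
    }

    // Each additional grid is just one dot product over the pixels, with no
    // per-minute recomputation; that's what makes adding grids so cheap.
    static double accessibility(double[] weights, int[] grid) {
        double total = 0;
        for (int p = 0; p < weights.length; p++) total += weights[p] * grid[p];
        return total;
    }

    public static void main(String[] args) {
        int[][] travelTimes = {
            { 30, 40, 50 },          // always reachable within 60 minutes
            { 45, 90, UNREACHABLE }, // within the cutoff 1/3 of the time
        };
        double[] weights = computeWeights(travelTimes, 60);
        int[] jobs = { 100_000, 100_000 };
        // 1.0 * 100,000 + 0.333 * 100,000 = about 133,333
        System.out.printf("accessibility: %.0f%n", accessibility(weights, jobs));
    }
}
```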
Fixed in a188788225ed8147ce47090701a8ceac1b59a555
So we can switch grids easily.
It should be really fast to compute access values; it's less math than doing the isochrones. Whatever cost remains is probably all the floating-point math.