Open MaxGhenis opened 1 year ago
We are trying to display a nonlinear function by just discretizing its domain. This is bound to produce undesirable artifacts. As the discretization gets finer, the displayed function approximates the true function more closely. For instance, by increasing the density of points 10-fold, one can eliminate the "odd" corners:
To see that the discretization issue is present in other charts too, observe the same net income variation function under normal density:
and under increased density:
The second chart has a steeper slope at the cliff and more closely approximates the function.
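The effect of sampling density on the apparent slope at a cliff can be reproduced with a toy example. The sketch below assumes a hypothetical `net_income` function (a flat 20% tax plus a $5,000 benefit that is lost entirely at $30,000 of earnings); neither the function nor the numbers come from PolicyEngine, and `sample` and `steepest_drop` are illustrative helpers.

```python
def net_income(earnings: float) -> float:
    """Hypothetical toy schedule: 20% tax plus a $5,000 benefit lost at $30,000."""
    benefit = 5_000.0 if earnings < 30_000 else 0.0
    return 0.8 * earnings + benefit


def sample(func, start: float, stop: float, count: int):
    """Evaluate func at `count` evenly spaced points on [start, stop]."""
    step = (stop - start) / (count - 1)
    xs = [start + i * step for i in range(count)]
    return xs, [func(x) for x in xs]


def steepest_drop(xs, ys) -> float:
    """Most negative slope between any two adjacent samples."""
    return min(
        (y1 - y0) / (x1 - x0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


# Same function, same domain, two sampling densities.
coarse_slope = steepest_drop(*sample(net_income, 0, 100_000, 101))
fine_slope = steepest_drop(*sample(net_income, 0, 100_000, 1001))
print(coarse_slope, fine_slope)
```

The true function drops vertically at the cliff, so the drawn slope there is only as steep as the sample spacing allows: the finer grid reports a much steeper drop than the coarse one.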
Given these facts, there are a few options to consider:
Find a value of the `count` parameter where the chart is close enough to the actual function without slowing the chart load excessively (we may already be in the sweet spot right now). We can try larger values like 800, 1600, etc., and see if the performance deterioration is acceptable.
Analytically determining the functional forms would likely involve significant work, given the chain of computations that drive many variables. Net income, for example, often involves hundreds of nonlinear computations.
Corners and discontinuities can be detected numerically, much as we detect cliffs right now; the difference is that we need to check whether the change in slope exceeds some threshold. Having detected the corners, we can add more samples in these regions, improving the piecewise linear approximation. I will try implementing something like this in the front end and see if it works okay. Resampling will involve more calls to the back end, which will induce a significant performance penalty. This penalty can be avoided by shifting this functionality to the back end (see the design below).
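The corner-detection idea could be sketched roughly as follows. This is a minimal illustration, not the front-end implementation: `refine_at_corners`, `toy_net_income`, the threshold, and the number of rounds are all assumptions chosen for the example. Each extra sample stands in for one more back-end evaluation, so values are cached to avoid recomputing known points.

```python
def toy_net_income(earnings: float) -> float:
    """Hypothetical schedule: a $5,000 benefit phased out at 50% above $10,250."""
    benefit = max(0.0, 5_000 - 0.5 * max(0.0, earnings - 10_250))
    return earnings + benefit


def refine_at_corners(func, xs, threshold: float, rounds: int = 3):
    """Add midpoints next to samples where the slope changes by more than `threshold`."""
    xs = sorted(xs)
    cache = {}  # x -> func(x), so each point is evaluated only once

    def value(x):
        if x not in cache:
            cache[x] = func(x)
        return cache[x]

    for _ in range(rounds):
        slopes = [
            (value(xs[i + 1]) - value(xs[i])) / (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)
        ]
        new_points = set()
        for i in range(1, len(xs) - 1):
            # Compare the slope into point i with the slope out of it.
            if abs(slopes[i] - slopes[i - 1]) > threshold:
                new_points.add((xs[i - 1] + xs[i]) / 2)
                new_points.add((xs[i] + xs[i + 1]) / 2)
        if not new_points:
            break
        xs = sorted(set(xs) | new_points)
    return xs, [value(x) for x in xs]


# Start from a uniform $500 grid and refine near the detected corners.
grid = [i * 500 for i in range(81)]
refined_xs, refined_ys = refine_at_corners(toy_net_income, grid, threshold=0.1)
added = [x for x in refined_xs if x not in grid]
```

A true corner keeps exceeding the threshold however fine the grid gets, so the cap on `rounds` (or a minimum spacing) is what stops the refinement. Cliffs also produce large slope changes and would be refined the same way.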
A better design may look like this:
We do not detect cliffs numerically; we mark them as any range of consecutive points ($500 apart) where net income falls.
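For reference, the rule described above could look something like this. It is a sketch of the stated rule only, not the actual PolicyEngine code; `find_cliffs` and the sample data are hypothetical.

```python
def find_cliffs(earnings, net_incomes):
    """Return (start, end) earnings ranges over which net income falls
    between consecutive sample points."""
    cliffs = []
    start = None
    for i in range(1, len(earnings)):
        falling = net_incomes[i] < net_incomes[i - 1]
        if falling and start is None:
            # A falling run begins at the previous sample.
            start = earnings[i - 1]
        elif not falling and start is not None:
            # The run ended at the previous sample.
            cliffs.append((start, earnings[i - 1]))
            start = None
    if start is not None:
        # The run continues through the last sample.
        cliffs.append((start, earnings[-1]))
    return cliffs


# Points $500 apart; net income falls only between $1,000 and $1,500.
example = find_cliffs([0, 500, 1000, 1500, 2000], [0, 450, 900, 700, 1100])
print(example)  # [(1000, 1500)]
```

Because the rule only compares adjacent samples, the reported cliff is never narrower than the sample spacing, which is why the cliff looks sloped rather than vertical on the chart.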
Constructing functional forms could improve both detail and performance, but we should not only approximate them; any point on the plot should be exact (from which we can interpolate as we do today). I think that would go in policyengine-core; could you open an issue there to discuss further? Then we can keep this issue to the narrower scope limited to front-end changes.
We often look at low-income households, where the $500 increments produce lumpy charts. I want a sharper trapezoid.