PolicyEngine / policyengine-app

PolicyEngine's free web app for computing the impact of public policy.
GNU Affero General Public License v3.0
35 stars 89 forks source link

Restrict the y-axis of the Marginal Tax Rates chart to `[-200%, 200%]`. #977

Closed abhcs closed 5 months ago

abhcs commented 5 months ago

Fix #66 by filtering function values that exceed 200% or go under -200%.

Tested against this household.

View when current income is 0:

Screenshot 2023-12-18 at 10 25 23 AM

View when current income is 14,000:

Screenshot 2023-12-18 at 10 26 52 AM

Thus, when the value corresponding to the current income does not lie in [-200%, 200%] the range on the y-axis is extended to include the value.

anth-volk commented 5 months ago

Thanks for your work on this, @abhcs. Any idea why the chart in your provided example cuts off around 100%? Is it because there's technically only one value above that point, and it's arbitrarily close to infinity? I'm just wondering if it'd be valuable, when the chart's values exceed 100%, to ensure that 100% is displayed.

abhcs commented 5 months ago

There are two peaks above 200 but we ignore them, so the range of y-values is [-90, 70]. We add a padding of 0.1 x (70 - (-90)) = 16. The padded range of [-106, 86] is displayed. The mentioned calculations are approximations.

anth-volk commented 5 months ago

Gotcha. @MaxGhenis do you feel the 100% axis should be displayed if there are any values above it? I would personally lean toward it.

abhcs commented 5 months ago

Just thinking about it a bit, if most of the values are within say [0,10] and there is one outlier of 700, then the current design would keep the axis range around [0,10] with some padding, which is desirable, I think. In such a case, having a forced range of [0,100] would shrink the features.

The MTR function displayed in the chart is multidimensional with a vast domain. Still, we are making decisions about how to display the function based on a few samples from the domain, i.e., viewing the website for a particular household. This is a bit dangerous and the magic constants like 200 and 100 may not work for other households.

A more robust solution would be to find the outliers in the back end without relying on constants (I have code for this by using a Python lib for 1-D clustering). However, this would require changes to the API. Therefore, I have gone ahead with the current PR.

All that being said, I am happy to add the 100 constant too.

MaxGhenis commented 5 months ago

Hi sorry I'm getting to this late. Can we instead set the y axis limits, rather than the values? So take the bounds of the data, add e.g. 10% on either side, and then clip those bounds at +/-200% for setting y axis limits, but leave the values as they are.

abhcs commented 5 months ago

We are changing axis limits and not values in this PR. In the two examples posted in the initial comment, do you expect to see something else? Please clarify. If you are looking at the code then the term yaxisValues is only fed as an argument to getPlotlyAxisFormat.

anth-volk commented 5 months ago

He had been thinking a modification to the code that, instead of merely filtering values outside of the bounds, would determine if values exist above and/or below, and then set the upper and lower bounds at +/- 2 accordingly, as merely filtering dynamically sets the chart range based upon the remaining values

abhcs commented 5 months ago

Screenshots after the latest changes for the same household with different incomes:

For income = 0:

0

For income = 14,000:

14k

For income = 55,000:

55k
MaxGhenis commented 5 months ago

can we trim the 735% example too?

abhcs commented 5 months ago

We can but the current behavior is by design! If we were to trim the 735% case, then

I have kept the current values visible in the chart to avoid this behavior. An alternative could be to not display the current value and the legend and still trim! Please let me know if you want this. I thought that it would be confusing for the user to see the current value legend/point on plot for some income values and not for others.

MaxGhenis commented 5 months ago

Ok sounds good let's try this and we can come back to it if it's confusing