PolicyEngine / policyengine-us

The PolicyEngine US Python package contains a rules engine of the US tax-benefit system, and microdata generation for microsimulation analysis.
https://policyengine.org/us
GNU Affero General Public License v3.0
100 stars 174 forks source link

NYC CDCC divide warning #2494

Closed martinholmer closed 1 year ago

martinholmer commented 1 year ago

@MaxGhenis, You changed line 27 in the nyc_cdcc_age_restricted_expenses.py module just recently to avoid a divide-by-zero situation.

Why does this line generate the following warning when the tests are run on GitHub?

What is the "invalid value"?

policyengine_us/tests/microsimulation/test_microsim.py::test_microsim_runs_cps
  /home/runner/work/policyengine-us/policyengine-us/policyengine_us/variables/gov/local/ny/nyc/tax/income/credits/cdcc/nyc_cdcc_age_restricted_expenses.py:27: RuntimeWarning: invalid value encountered in true_divide
    children > 0, qualifying_children / children, 0
MaxGhenis commented 1 year ago

np.where computes all three array args, so even though it doesn't return any /0 nans, it will still warn about them:

https://github.com/PolicyEngine/policyengine-us/blob/3afcd285ca7bb11680c64440da4610de645cbde9/policyengine_us/variables/gov/local/ny/nyc/tax/income/credits/cdcc/nyc_cdcc_age_restricted_expenses.py#L26-L28

Here's a potential solution:

mask = children > 0
qualifying_child_share = np.zeros_like(children)
qualifying_child_share[mask] = qualifying_children[mask] / children[mask]

Given this occurs in several places, we could also make a new function in policyengine-core to divide in a way safe for zeros:

def safe_divide(numerator, denominator):
    mask = denominator != 0
    res = np.zeros_like(denominator)
    res[mask] = numerator[mask] / denominator[mask]
    return res

I think we should prefer this to suppressing the warning.

martinholmer commented 1 year ago

@MaxGhenis said in issue #2494:

Given this occurs in several places, we could also make a new function in policyengine-core to divide in a way safe for zeros:

def safe_divide(numerator, denominator):
   mask = denominator != 0
   res = np.zeros_like(denominator)
   res[mask] = numerator[mask] / denominator[mask]
   return res

I think we should prefer this to suppressing the warning.

I agree that "suppressing the warning" is not a good option.

My only concern with the safe_divide function is that I'm not sure zero is always going to be the non-masked quotient.

martinholmer commented 1 year ago

Perhaps the safe_divide function could have a third optional argument nan_value (need a better name) that has a default value of zero.

martinholmer commented 1 year ago

@MaxGhenis, Here's an example of a where() with the NaN value being one (instead of the more typical zero).

policyengine_us/tests/microsimulation/test_microsim.py::test_microsim_runs_cps
  /home/runner/work/policyengine-us/policyengine-us/policyengine_us/variables/
    gov/states/md/tax/income/agi/subtractions/md_two_income_subtraction.py:30:
  RuntimeWarning: invalid value encountered in true_divide
    couple_gross_income > 0, head_gross_income / couple_gross_income, 1
martinholmer commented 1 year ago

Issue #2494 has been resolved by the merge of pull request #2549.