PSLmodels / Business-Taxation

USA Corporate and Pass-Through Business Tax Model
11 stars 10 forks source link

Add new tests of BusinessModel.update_btax_params method #84

Closed martinholmer closed 5 years ago

martinholmer commented 5 years ago

Based on work investigating the test failures in PR #82, this pull request adds tests that show that the BusinessModel.update_btax_params method is not working correctly for at least 19 parameters that end in _bonus or _hc.

Executing the following command in the top-level directory of the repo, runs 19 tests and they all fail.

cd biztax
pytest -m one
cd ..
codykallen commented 5 years ago

@martinholmer, something seems odd. I just tried implementing the reform in example.py, which includes changes to the parameters ending in _bonus and _hc, and the changes worked fine. Furthermore, looking at your test, it should fail for ['depr_3yr_bonus', 'depr_5yr_bonus', 'depr_7yr_bonus', 'depr_10yr_bonus', 'depr_15yr_bonus', 'depr_20yr_bonus'] because the default baseline values for these are not all zeros.

martinholmer commented 5 years ago

@codykallen, Thanks for the feedback. Perhaps the problem is with this idiom, which is used in the tests/conftest.py file:

bizmod = BusinessModel(btax_dict1, {})
reform_dict[1] = bizmod.btax_params_ref

Using the existing code (which does not have a Policy class), I thought that was the way to transform btax_dict (a reform dictionary) into bizmod.btax_params_ref (a 2014-2027 DataFrame containing all the policy parameter values).

And that's the idiom being tested in the proposed tests/test_update.py in PR #84.

Maybe the idiom is the problem, not the BusinessModel.update_btax_params method.

What's the correct way to generate a policy parameters DataFrame from a reform dictionary using the existing code?

Using the new Policy class, it is straightforward and self-contained:

policy = Policy()
policy.implement_reform(reform_dict)
policy_params = policy.parameters_dataframe()
martinholmer commented 5 years ago

@codykallen, I fixed the mini_params_btax.csv file so that real parameters are in fact stored in the Pandas DataFrame as floats (rather than as integers). I also, fixed the new test_update so that the _bonus parameters that are not all zeros are excluded from the test. With those two changes, all 13 of the test_update tests pass.

My working hypothesis is that running the other tests using the new mini_params_btax.csv file will cause many fails and that the actual results generated under PR #84 will be exactly the same actual results generated under PR #82, which uses the new Policy class and the new policy_current_law.json file.

codykallen commented 5 years ago

@martinholmer said:

Using the existing code (which does not have a Policy class), I thought that was the way to transform btax_dict (a reform dictionary) into bizmod.btax_params_ref (a 2014-2027 DataFrame containing all the policy parameter values).

Yes, that's correct. That's why I'm having trouble understanding why you would have a lot of test failures for the _hc parameters when eyeballing the resulting DataFrames showed that it was fine.

martinholmer commented 5 years ago

@codykallen, Thanks for confirming that I'm using the correct idiom for converting a reform dictionary into a dataframe of policy parameters.

And I agree with you that the logic of the BusinessModel.update_btax_params method looks fine.

The problem is with Pandas doing something you didn't expect (as I have often found to be the case in my own work). The Pandas read_csv method, if given just the name of the file to read, will infer the datatype of each column in the returned dataframe by looking at the values in that column of the CSV-formatted file. So, the _hc values in mini_params_btax.csv file (and some of the _bonus values) were 0 on every data row, which Pandas interprets as the integer zero (not the floating-point value 0.0). So, when the BusinessModel.update_btax_params method attempted to insert a fractional haircut value (like 0.5), the code saw that the intended variable had a datatype of integer, so it truncated the fractional value to the integer zero.

I found that by adding the decimal points in the mini_params_btax.csv file, I was able to give Pandas guidance on the appropriate datatype (float) for all the _hc parameters and the _bonus parameters that were all zeros. After that change, many of the test results changed. And they changed in exactly the same way as they change when introducing the new Policy class in PR #82. The Policy class is a specialized taxcalc.Parameter class, and that Parameter class makes use of the datatype information in the JSON file it reads (in this case the new policy_current_law.json file in PR #82) to ensure that each parameter has the intended datatype.

Another way to think about all this, is that you made the same mistakes in using Pandas as I did in my early work on Tax-Calculator. By reusing the policy-related Tax-Calculator capabilities, we have found those mistakes.

martinholmer commented 5 years ago

@codykallen, If my latest comment makes sense, the course of action is pretty simple:

  1. You merge PR #84 into the master branch
  2. I'll incorporate the new master branch into PR #82
  3. I'll confirm that the tests all pass in PR #82
  4. You merge PR #82 into the master branch

Does this make sense to you?

codykallen commented 5 years ago

@martinholmer, thanks for the clarification and explanation of the incorrect type parameter reading. I'll follow your recommended changes on this.