TheCGO / fiscalsim-us

FiscalSim US is a microsimulation model of the US federal and state tax and benefit system relating to households and individuals.
https://thecgo.github.io/fiscalsim-us/
GNU Affero General Public License v3.0

FiscalSim-US not passing tests #10

Closed rickecon closed 1 year ago

rickecon commented 1 year ago

The goal of this issue is to identify why the FiscalSim-US test results differ from the OpenFisca-US test results, and to fix FiscalSim-US so that its tests give the same output as the OpenFisca-US tests.

## The Problem

If you do the following steps:

You should be able to run the full set of tests on the repository using the `make test` command. When I do these same steps with the OpenFisca-US repository (fork, clone, create the `openfisca-us-test` conda env, install the `openfisca_us` package), everything passes (plus 2 appropriately skipped tests in `test_acs.py`), except for the 5 tests in `test_against_taxsim.py`. These fail for a known reason: the current tests only work on the Linux operating system.

(openfisca-us-test) richardevans@Richards-MacBook-Pro openfisca-us % make test
pytest openfisca_us/tests/ --maxfail=0
============================= test session starts ==============================
platform darwin -- Python 3.7.13, pytest-5.4.3, py-1.11.0, pluggy-0.13.1
rootdir: /Users/richardevans/Docs/Economics/OSE/openfisca-us, inifile: setup.cfg
plugins: anyio-3.6.1, dependency-0.5.1
collected 1060 items                                                           

openfisca_us/tests/test_variables.py ................................... [  3%]
........................................................................ [ 10%]
........................................................................ [ 16%]
........................................................................ [ 23%]
........................................................................ [ 30%]
........................................................................ [ 37%]
........................................................................ [ 44%]
........................................................................ [ 50%]
........................................................................ [ 57%]
........................................................................ [ 64%]
........................................................................ [ 71%]
........................................................................ [ 78%]
........................................................................ [ 84%]
........................................................................ [ 91%]
........................................................................ [ 98%]
...                                                                      [ 98%]
openfisca_us/tests/code_health/parameters.py .                           [ 98%]
openfisca_us/tests/code_health/variable_names.py .                       [ 98%]
openfisca_us/tests/microsimulation/test_against_taxsim.py FFFFF          [ 99%]
openfisca_us/tests/microsimulation/test_microsim.py .                    [ 99%]
openfisca_us/tests/microsimulation/data/test_imports.py ..               [ 99%]
openfisca_us/tests/microsimulation/data/acs/test_acs.py ss               [ 99%]
openfisca_us/tests/microsimulation/data/cps/test_cps.py ..               [100%]

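When I run the tests in the fiscalsim-us-dev environment, I get the output below. Compared with the OpenFisca-US run, the differences are that the 5 tests in `test_against_taxsim.py` raise errors (`E`) instead of failing (`F`), the test in `test_microsim.py` fails, and in `test_cps.py` one test fails and the other is skipped.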

fiscalsim_us/tests/test_variables.py ................................... [  3%]
........................................................................ [ 10%]
........................................................................ [ 16%]
........................................................................ [ 23%]
........................................................................ [ 30%]
........................................................................ [ 37%]
........................................................................ [ 44%]
........................................................................ [ 50%]
........................................................................ [ 57%]
........................................................................ [ 64%]
........................................................................ [ 71%]
........................................................................ [ 78%]
........................................................................ [ 84%]
........................................................................ [ 91%]
........................................................................ [ 98%]
..                                                                       [ 98%]
fiscalsim_us/tests/code_health/parameters.py .                           [ 98%]
fiscalsim_us/tests/code_health/variable_names.py .                       [ 98%]
fiscalsim_us/tests/microsimulation/test_against_taxsim.py EEEEE          [ 99%]
fiscalsim_us/tests/microsimulation/test_microsim.py F                    [ 99%]
fiscalsim_us/tests/microsimulation/data/test_imports.py ..               [ 99%]
fiscalsim_us/tests/microsimulation/data/acs/test_acs.py ss               [ 99%]
fiscalsim_us/tests/microsimulation/data/cps/test_cps.py Fs               [100%]
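To compare two runs like these without reading the dots by eye, the pytest progress lines can be tallied per test file. The following is only a rough sketch: the function names `summarize_progress` and `non_passing` are made up for illustration, and the regular expression assumes the default pytest progress format shown in this issue (outcome characters followed by a `[ NN%]` marker).

```python
import re
from collections import Counter

# A progress line is either "path/to/test_file.py <outcomes> [ NN%]" or a
# continuation line that holds only outcome characters and the percentage.
PROGRESS_LINE = re.compile(
    r"^(?:(?P<path>\S+\.py)\s+)?(?P<marks>[.FEsxX]+)\s+\[\s*\d+%\]\s*$"
)


def summarize_progress(output):
    """Count outcome characters (., F, E, s, ...) per test file."""
    counts = {}
    current = None
    for line in output.splitlines():
        match = PROGRESS_LINE.match(line.strip())
        if not match:
            continue  # header lines, blank lines, tracebacks
        if match.group("path"):
            current = match.group("path")
        if current is None:
            continue  # continuation line with no file seen yet
        counts.setdefault(current, Counter()).update(match.group("marks"))
    return counts


def non_passing(counts):
    """Keep only files that have at least one outcome other than a pass."""
    return {
        path: {mark: n for mark, n in tally.items() if mark != "."}
        for path, tally in counts.items()
        if set(tally) - {"."}
    }
```

Running `non_passing(summarize_progress(text))` on the pasted output of each repository gives two small dictionaries that can be compared directly.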


The problem seems to lie in the HDF5 data from the CPS and its interface with the `openfisca-tools` package, which is a dependency of this project (for now). I get the following error message in each case.

/opt/anaconda3/envs/fiscalsim-us-dev/lib/python3.7/site-packages/openfisca_tools/microsimulation.py: ...
...
E   KeyError: "Unable to open object (object 'person_id' doesn't exist)"
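One way to narrow down this `KeyError` is to check which dataset names the generated file actually contains. The sketch below is an illustration, not code from `openfisca-tools`: the helper name `describe_store` is hypothetical, and `person_id` is the only dataset name taken from the error message. It works on any mapping-like object, and an open `h5py.File` supports the same `in` and `.keys()` operations, so it can be passed in directly.

```python
def describe_store(store, required=("person_id",)):
    """Report which required dataset names are present in or missing from a store.

    `store` can be any mapping-like object, for example a dict or an
    h5py.File opened in read mode.
    """
    return {
        "available": sorted(store.keys()),
        "missing": [name for name in required if name not in store],
    }


# Stand-in for an HDF5 file that lacks the dataset named in the error.
# "household_id" is a placeholder key used only for this demonstration.
fake_file = {"household_id": [1, 2, 3]}
report = describe_store(fake_file)
print(report["missing"])
```

With a real file, the call would look like `with h5py.File(path, "r") as f: describe_store(f)`, where `path` is wherever the CPS dataset was written on your machine.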


Another error seems to be in the interface between the `openfisca-tools/data/dataset.py` file and the `fiscalsim_us/data/datasets/cps/cps.py` file and its HDF5 implementation. The error traceback is the following.

/opt/anaconda3/envs/fiscalsim-us-dev/lib/python3.7/site-packages/openfisca_tools/data/dataset.py:69: in wrapper
...
fiscalsim_us/data/datasets/cps/cps.py:36: in generate
    cps = h5py.File(self.file(year), mode="w")
raw_data = <HDF5 file "raw_cps_2021.h5" (mode r)>
self = <fiscalsim_us.data.datasets.cps.cps.CPS object at 0x7f9c2de63650>
...
/opt/anaconda3/envs/fiscalsim-us-dev/lib/python3.7/site-packages/h5py/_hl/files.py:533: in __init__
...
E   OSError: Unable to create file (unable to truncate a file which is already open)
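HDF5 reports "unable to truncate a file which is already open" when a file is opened with `mode="w"` while a handle to that same file is still open in the process. One thing worth ruling out, and this is an assumption rather than a confirmed cause, is that the raw input path and the output path resolve to the same file on disk. A minimal check with the standard library might look like this (the helper name `same_file` is made up for illustration):

```python
import os
from pathlib import Path


def same_file(path_a, path_b):
    """Return True if both paths refer to the same file on disk."""
    a, b = Path(path_a), Path(path_b)
    if a.exists() and b.exists():
        # Compares device and inode, so it also catches links.
        return os.path.samefile(a, b)
    # If either file does not exist yet, fall back to comparing
    # the normalized absolute paths.
    return a.resolve() == b.resolve()
```

If the two paths turn out to be different, the next suspect would be an earlier handle to the output file that was never closed.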



## What has been changed in FiscalSim-US
`FiscalSim-US` has had [5 PRs](https://github.com/TheCGO/fiscalsim-us/pulls?q=is%3Apr+is%3Aclosed) up to now, which mainly replace `openfisca-us` with `fiscalsim-us` in hundreds of files. I suspect that something in that process broke the interface between the `openfisca-tools` package, the data requests in `cps.py`, and the `h5py` package.
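Since the rename touched hundreds of files, a quick scan for leftover references to the old package name may help locate the break. This is a rough sketch under stated assumptions: the function name `find_leftovers` and the list of file suffixes are my own choices, and the pattern deliberately matches only `openfisca-us` and `openfisca_us`, not `openfisca_tools`, which is still a legitimate dependency.

```python
import re
from pathlib import Path

# Matches "openfisca-us" or "openfisca_us" but not "openfisca_tools".
OLD_NAME = re.compile(r"openfisca[-_]us\b")


def find_leftovers(root, suffixes=(".py", ".yaml", ".cfg", ".md")):
    """Return (relative path, line number, line) for each old-name reference."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if OLD_NAME.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

Calling `find_leftovers(".")` from the repository root lists every remaining occurrence. An empty list would suggest the problem is in how names were changed rather than in names that were missed.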

cc: @austinperryfrancis @ss7886