PolicyEngine / policyengine-us

The PolicyEngine US Python package contains a rules engine of the US tax-benefit system, and microdata generation for microsimulation analysis.
https://policyengine.org/us
GNU Affero General Public License v3.0

CPS and microsim tests crash on master branch 0.254.1 version #2005

Closed martinholmer closed 1 year ago

martinholmer commented 1 year ago

I can't run make test on my computer using the current version of PolicyEngineUS. @nikhilwoodruff, it looks like you need to do more testing of your data changes.

======================= 1525 passed in 87.17s (0:01:27) ========================
coverage xml -i
Wrote XML report to coverage.xml
pytest policyengine_us/tests/ --maxfail=0
============================= test session starts ==============================
platform darwin -- Python 3.9.13, pytest-5.4.3, py-1.11.0, pluggy-0.13.1
rootdir: /Users/mrh/work/policyengine-us
plugins: dependency-0.5.1
collected 1265 items                                                           

policyengine_us/tests/test_variables.py ................................ [  2%]
........................................................................ [  8%]
........................................................................ [ 13%]
........................................................................ [ 19%]
........................................................................ [ 25%]
........................................................................ [ 30%]
........................................................................ [ 36%]
........................................................................ [ 42%]
........................................................................ [ 48%]
........................................................................ [ 53%]
........................................................................ [ 59%]
........................................................................ [ 65%]
........................................................................ [ 70%]
........................................................................ [ 76%]
........................................................................ [ 82%]
........................................................................ [ 87%]
........................................................................ [ 93%]
...................................................................      [ 98%]
policyengine_us/tests/microsimulation/test_against_taxsim.py sssss       [ 99%]
policyengine_us/tests/microsimulation/test_microsim.py F                 [ 99%]
policyengine_us/tests/microsimulation/data/test_imports.py ..            [ 99%]
policyengine_us/tests/microsimulation/data/cps/test_cps.py FFFsss        [100%]

=================================== FAILURES ===================================
____________________________ test_microsim_runs_cps ____________________________

    def test_microsim_runs_cps():
        from policyengine_us import Microsimulation

>       sim = Microsimulation()

policyengine_us/tests/microsimulation/test_microsim.py:4: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/simulations/simulation.py:131: in __init__
    self.build_from_dataset()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <policyengine_us.system.Microsimulation object at 0x7f8f7dd63fa0>

    def build_from_dataset(self) -> None:
        """Build a simulation from a dataset."""
        self.build_from_populations(
            self.tax_benefit_system.instantiate_entities()
        )
        from policyengine_core.simulations.simulation_builder import (
            SimulationBuilder,
        )  # Import here to avoid circular dependency

        builder = SimulationBuilder()
        builder.populations = self.populations
        try:
>           data = self.dataset.load(self.dataset_year)
E           TypeError: load() missing 1 required positional argument: 'year'

//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/simulations/simulation.py:181: TypeError
_____________________ test_cps_dataset_generates[CPS_2021] _____________________

year = <class 'policyengine_us.data.datasets.cps.cps.CPS_2021'>

    @pytest.mark.dependency(name="cps")
    @pytest.mark.parametrize("year", CPS_YEARS)
    def test_cps_dataset_generates(year):
>       year()

policyengine_us/tests/microsimulation/data/cps/test_cps.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <policyengine_us.data.datasets.cps.cps.CPS_2021 object at 0x7f8f7dd0aa30>

    def __init__(self):
        # Setup dataset
        if self.folder_path is None:
>           raise ValueError(
                "Dataset folder_path must be specified in the dataset class definition."
E               ValueError: Dataset folder_path must be specified in the dataset class definition.

//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py:40: ValueError
______________ test_cps_dataset_generates[UpratedCPSFromDataset0] ______________

year = <class 'policyengine_us.data.datasets.cps.uprated_cps.UpratedCPS.from_dataset.<locals>.UpratedCPSFromDataset'>

    @pytest.mark.dependency(name="cps")
    @pytest.mark.parametrize("year", CPS_YEARS)
    def test_cps_dataset_generates(year):
>       year()

policyengine_us/tests/microsimulation/data/cps/test_cps.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <policyengine_us.data.datasets.cps.uprated_cps.UpratedCPS.from_dataset.<locals>.UpratedCPSFromDataset object at 0x7f8f7dd52eb0>

    def __init__(self):
        # Setup dataset
        if self.folder_path is None:
>           raise ValueError(
                "Dataset folder_path must be specified in the dataset class definition."
E               ValueError: Dataset folder_path must be specified in the dataset class definition.

//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py:40: ValueError
______________ test_cps_dataset_generates[UpratedCPSFromDataset1] ______________

year = <class 'policyengine_us.data.datasets.cps.uprated_cps.UpratedCPS.from_dataset.<locals>.UpratedCPSFromDataset'>

    @pytest.mark.dependency(name="cps")
    @pytest.mark.parametrize("year", CPS_YEARS)
    def test_cps_dataset_generates(year):
>       year()

policyengine_us/tests/microsimulation/data/cps/test_cps.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <policyengine_us.data.datasets.cps.uprated_cps.UpratedCPS.from_dataset.<locals>.UpratedCPSFromDataset object at 0x7f8f7decb7f0>

    def __init__(self):
        # Setup dataset
        if self.folder_path is None:
>           raise ValueError(
                "Dataset folder_path must be specified in the dataset class definition."
E               ValueError: Dataset folder_path must be specified in the dataset class definition.

//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py:40: ValueError
=========================== short test summary info ============================
FAILED policyengine_us/tests/microsimulation/test_microsim.py::test_microsim_runs_cps
FAILED policyengine_us/tests/microsimulation/data/cps/test_cps.py::test_cps_dataset_generates[CPS_2021]
FAILED policyengine_us/tests/microsimulation/data/cps/test_cps.py::test_cps_dataset_generates[UpratedCPSFromDataset0]
FAILED policyengine_us/tests/microsimulation/data/cps/test_cps.py::test_cps_dataset_generates[UpratedCPSFromDataset1]
================== 4 failed, 1253 passed, 8 skipped in 25.29s ==================
make: *** [test] Error 1

martinholmer commented 1 year ago

Data tests also crash on GitHub as can be seen in pull request #2000.

martinholmer commented 1 year ago

Whatever recent data enhancement you have implemented has broken my long-working TAXSIM35 testing framework. Now I can't do any local validation testing work. Here is the error I'm getting:

(policyengine-us) WA% YEAR=21 ./tests.sh ; say done
Traceback (most recent call last):
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 504, in <module>
    sys.exit(main())
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 395, in main
    vdset = VAL()
  File "//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py", line 39, in __init__
    raise ValueError(
ValueError: Dataset file_path must be specified in the dataset class definition.
Traceback (most recent call last):
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 504, in <module>
    sys.exit(main())
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 395, in main
    vdset = VAL()
  File "//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py", line 39, in __init__
    raise ValueError(
ValueError: Dataset file_path must be specified in the dataset class definition.
Traceback (most recent call last):
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 504, in <module>
    sys.exit(main())
  File "/Users/mrh/work/Policy-Engine-US/WA/../execute_test.py", line 395, in main
    vdset = VAL()
  File "//Users/mrh/anaconda3/envs/policyengine-us/lib/python3.9/site-packages/policyengine_core/data/dataset.py", line 39, in __init__
    raise ValueError(
ValueError: Dataset file_path must be specified in the dataset class definition.

nikhilwoodruff commented 1 year ago

Thanks for identifying this, Martin. Taking a look now.

nikhilwoodruff commented 1 year ago

Just FYI I don't think the GitHub actions are failing because of this (e.g. https://github.com/PolicyEngine/policyengine-us/pull/2001 passes, and is up-to-date with the master branch). I think it's more likely to be the Oklahoma formula (which is where that PR fails)

nikhilwoodruff commented 1 year ago

@martinholmer is your PolicyEngine-Core definitely up-to-date? If not, could you re-run pip install -e . in policyengine-us?

martinholmer commented 1 year ago

@nikhilwoodruff asked:

is your PolicyEngine-Core definitely up-to-date? If not, could you re-run pip install -e . in policyengine-us?

I have 2.0.1 installed. Isn't that up-to-date?

martinholmer commented 1 year ago

@nikhilwoodruff said:

FYI I don't think the GitHub actions are failing because of this (e.g. https://github.com/PolicyEngine/policyengine-us/pull/2001 passes, and is up-to-date with the master branch). I think it's more likely to be the Oklahoma formula (which is where that PR fails)

I get the reported errors on the master branch.

nikhilwoodruff commented 1 year ago

@martinholmer: OK, but I don't think there's an issue with the GH tests because the current tests pass.

I think your Core install is outdated, because if you look at your error message, it prints out the contents of dataset.py, line 40, in PolicyEngine-Core:

    def __init__(self):
        # Setup dataset
        if self.folder_path is None:
>           raise ValueError(
                "Dataset folder_path must be specified in the dataset class definition."
E               ValueError: Dataset folder_path must be specified in the dataset class definition.

If you look at that segment in the Core code, it's different (it now refers to file_path).

So I think we just need to figure out why your install didn't work?
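
One way to check which Core install Python is actually picking up is sketched below. It is only an illustration: describe_install is a hypothetical helper, and the distribution name "policyengine-core" is an assumption (the module name policyengine_core is the one shown in the traceback). It reports the version recorded in the package metadata and the file the module would be imported from, so the path can be compared with the one in the traceback.

```python
from importlib import metadata, util


def describe_install(dist_name, module_name):
    """Report the installed version of a distribution and where its module is found.

    dist_name is the pip distribution name and module_name is the import name;
    both are supplied by the caller (hypothetical helper, not part of any package).
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # No metadata for this distribution in the active environment.
        version = None
    try:
        spec = util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    # spec.origin is the file the module would be loaded from, if it can be found.
    location = spec.origin if spec is not None else None
    return {"version": version, "location": location}


if __name__ == "__main__":
    # "policyengine-core" is an assumed distribution name.
    print(describe_install("policyengine-core", "policyengine_core"))
```

If the reported location is not inside the environment you expect, or the version does not match what pip reports, the environment is probably resolving a different install than the one that was upgraded.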

martinholmer commented 1 year ago

@nikhilwoodruff said:

So I think we just need to figure out why your [PolicyEngineCore] install didn't work?

OK. I updated my local master branch and then merged the master branch into my ok-itax branch. This did improve things, but I still got one error (with the four test_cps errors now gone). The one remaining error does seem to be caused by my OK code. I'll try to correct that. Thanks for the help, @nikhilwoodruff.