cognizant-ai-labs / covid-xprize

Open-source repository containing examples and documentation for the Cognizant XPRIZE Pandemic Response Challenge

Scenario generator: missing dates #111

Closed ofrancon closed 4 years ago

ofrancon commented 4 years ago

When countries don't have the same number of dates in the Oxford dataset, some countries may be missing days in their generated scenarios. Because of that, 2 unit tests failed:

======================================================================
FAIL: test_generate_scenario_all_countries_future_from_last_known_date_freeze (covid_xprize.validation.tests.test_scenario_generator.TestScenarioGenerator)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/codefresh/volume/covid-xprize/covid_xprize/validation/tests/test_scenario_generator.py", line 281, in test_generate_scenario_all_countries_future_from_last_known_date_freeze
    region=region)
  File "/codefresh/volume/covid-xprize/covid_xprize/validation/tests/test_scenario_generator.py", line 304, in _check_future
    pd.testing.assert_frame_equal(historical_df, past_df, "Not the expected past NPIs")
  File "/usr/local/lib/python3.6/site-packages/pandas/_testing.py", line 1562, in assert_frame_equal
    obj, f"{obj} shape mismatch", f"{repr(left.shape)}", f"{repr(right.shape)}",
  File "/usr/local/lib/python3.6/site-packages/pandas/_testing.py", line 1036, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame are different

DataFrame shape mismatch
[left]:  (309, 12)
[right]: (308, 12)

======================================================================
FAIL: test_generate_scenario_mind_the_gap_freeze_all_countries (covid_xprize.validation.tests.test_scenario_generator.TestScenarioGenerator)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/codefresh/volume/covid-xprize/covid_xprize/validation/tests/test_scenario_generator.py", line 406, in test_generate_scenario_mind_the_gap_freeze_all_countries
    f"Not the expected number of rows in the generated scenario:"
AssertionError: 120528 != 120527 : Not the expected number of rows in the generated scenario: 243 geos times 496 days

----------------------------------------------------------------------
Ran 34 tests in 294.714s

FAILED (failures=2)

ofrancon commented 4 years ago

The issue is that we assumed each country has the same number of days in the Oxford dataset. This is not true, especially when brand-new data has been added for some countries but is not yet available for others. Dates must therefore be handled per country/region.
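As a rough illustration of what handling dates per country/region could look like, here is a minimal pandas sketch. It is not the fix that was applied in the repository. The column names `GeoID` and `Date` and both helper functions are assumptions made for this example. The idea is to look up the last known date for each geo separately, then "freeze" that geo's last row forward to a common end date, so every geo ends up with the same number of days.

```python
import pandas as pd


def last_known_date_by_geo(df, geo_col="GeoID", date_col="Date"):
    """Return a Series mapping each geo to its own last known date."""
    return df.groupby(geo_col)[date_col].max()


def freeze_to_end_date(df, end_date, geo_col="GeoID", date_col="Date"):
    """Extend each geo to end_date by repeating its own last known row.

    Hypothetical helper: the start of the gap is computed per geo
    instead of from a single global last date.
    """
    end_date = pd.Timestamp(end_date)
    frames = []
    for _, geo_df in df.groupby(geo_col, sort=False):
        geo_df = geo_df.sort_values(date_col)
        last_date = geo_df[date_col].max()
        # Empty when this geo already reaches end_date
        missing = pd.date_range(last_date + pd.Timedelta(days=1), end_date, freq="D")
        if len(missing) > 0:
            last_row = geo_df.iloc[[-1]]
            filler = pd.concat([last_row] * len(missing), ignore_index=True)
            filler[date_col] = missing
            geo_df = pd.concat([geo_df, filler], ignore_index=True)
        frames.append(geo_df)
    return pd.concat(frames, ignore_index=True)
```

With this approach, a geo whose data stops one day earlier than the others gets one extra frozen row, rather than ending up one row short of the expected total.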