catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Review XPASSing tests #3251

Open zaneselvans opened 5 months ago

zaneselvans commented 5 months ago

We have a number of tests and data validations that are reporting XPASS in the nightly builds, which we might want to review in case they should no longer be marked XFAIL:

@katie-lamb are the XFAIL markers for these FERC-EIA test/train tests still required/appropriate?

@zschira or @katie-lamb, what do you think about the test for duplicate FERC years within a plant ID? Should it be passing now?
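
For reference, strict xfail would make these XPASSes fail loudly instead of lingering. A minimal sketch (not necessarily how our markers are configured today; the reason string is illustrative):

```python
import pytest


# Sketch only: with strict=True, an unexpected pass (XPASS) is reported as a
# real test failure rather than being silently tolerated, so stale markers
# like these would surface immediately in the nightly builds.
@pytest.mark.xfail(
    strict=True,
    reason="Duplicate FERC years within a plant ID; remove once fixed.",
)
def test_dupe_years_in_plant_id_ferc1():
    ...
```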

katie-lamb commented 5 months ago

@zaneselvans A couple of things re test/integration/ferc1_eia_train_test.py:

katie-lamb commented 5 months ago

I'll let @zschira chime in on this as well, but I think the test_dupe_years_in_plant_id_ferc1 test should now legitimately be passing.

zaneselvans commented 5 months ago

@katie-lamb I don't understand what you're saying about the first XPASS. It's a real bug... but the test is passing?

It seems like these tests depend on substantial amounts of external real data right now. Are they intended only to test the functionality of restrict_train_connections_on_date_range() and validate_override_fixes()? If so, they could use entirely synthetic data and be turned into unit tests.

Are some of these xfail markers really checking for expected failures because we're giving them known-bad inputs that should fail? If so, we might want to change them into tests that pass when the expected exception is raised or the failure condition is observed. I couldn't tell from the messages what we actually expected to happen.
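
For example, something like this sketch. The import path, function signature, column names, and ValueError are all my assumptions, not the real API or behavior:

```python
import pandas as pd
import pytest

# Hypothetical import path, assumed for illustration only.
from pudl.analysis.ferc1_eia_train import validate_override_fixes


def test_validate_override_fixes_rejects_bad_input():
    # A deliberately bad override: the EIA record's year doesn't match the
    # report year, which the validation should catch and reject.
    bad_overrides = pd.DataFrame(
        {
            "record_id_eia_override_1": ["eia_plant_2020"],
            "record_id_ferc1": ["f1_plant_2019"],
            "report_year": [2019],
            "verified": [True],
        }
    )
    # Instead of xfail-ing the whole test on known-bad input, assert that
    # the validation raises the failure we expect.
    with pytest.raises(ValueError):
        validate_override_fixes(bad_overrides)
```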

zschira commented 5 months ago

test_dupe_years_in_plant_id_ferc1 should definitely pass now. I'll remove the xfail marker in my open PR.

katie-lamb commented 5 months ago

> It seems like these tests depend on substantial amounts of external real data right now. Are they intended only to test the functionality of restrict_train_connections_on_date_range() and validate_override_fixes()? If so, they could use entirely synthetic data and be turned into unit tests.

I didn't write these tests, so this is only my guess, but yes, I think they test the functionality of those functions and ideally would use entirely synthetic data and be turned into unit tests. This is what @aesharpe was referencing in the module docstring. The external real data they depend on is the training data (contained in the package_data), so they would need to be refactored so that synthetic training data could be passed in.
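
Roughly what I'd imagine a refactored unit test looking like. This is purely a sketch: the import path, function signature, and column names are my guesses at what an injectable version might look like, not the current API:

```python
import pandas as pd

# Hypothetical import path, assumed for illustration only.
from pudl.analysis.ferc1_eia_train import restrict_train_connections_on_date_range


def test_restrict_train_connections_on_date_range_synthetic():
    # Tiny synthetic training table standing in for the package_data.
    train_df = pd.DataFrame(
        {
            "record_id_ferc1": ["f1_a_2019", "f1_b_2021"],
            "record_id_eia": ["eia_a_2019", "eia_b_2021"],
            "report_year": [2019, 2021],
        }
    )
    out = restrict_train_connections_on_date_range(
        train_df, start_date="2020-01-01", end_date="2021-12-31"
    )
    # Only training connections inside the requested range should survive.
    assert out["report_year"].between(2020, 2021).all()
```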

zaneselvans commented 5 months ago

If these are really meant to be unit tests, that would be great!

Are there additional integration tests that should be run when all the data is processed? Or are there already defensive checks in the EIA-FERC entity matching code that make sure things aren't out of whack?

katie-lamb commented 5 months ago

Probably all the consistency checks (implemented as assertions in the FERC-to-EIA module) should be moved into a model validation module or integration tests, but I think we were planning to do that after the experiment tracking goes in.
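
Something shaped like this, maybe (all names here are hypothetical placeholders, not our actual code):

```python
import pandas as pd


# Placeholder sketch: pull an inline `assert` out of the FERC-EIA matching
# code into a named check that a model-validation module or an integration
# test can call after the full build.
def check_ferc1_matched_at_most_once(matches: pd.DataFrame) -> None:
    """Raise if any FERC 1 record was matched to more than one EIA record."""
    dupes = matches[matches.duplicated("record_id_ferc1", keep=False)]
    if not dupes.empty:
        raise AssertionError(
            f"{dupes['record_id_ferc1'].nunique()} FERC 1 records have "
            "multiple EIA matches."
        )
```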

katie-lamb commented 5 months ago

OK, I put the changes that need to be made to address the larger problem with this test into #3256.

If we feel strongly about making these tests XFAIL right now, then #3257 is a patch to fix the following:

test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified4-report_year4-record_id_eia_override_14-record_id_ferc14-utility_id_pudl_ferc14]
test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified5-report_year5-record_id_eia_override_15-record_id_ferc15-utility_id_pudl_ferc15]
test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified6-report_year6-record_id_eia_override_16-record_id_ferc16-utility_id_pudl_ferc16]