catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Review XPASSing tests #3251

Open zaneselvans opened 5 months ago

zaneselvans commented 5 months ago

We have a number of tests and data validations that are reporting XPASS in the nightly builds, which we might want to review in case they should no longer be marked XFAIL:

@katie-lamb are the XFAIL markers for these FERC-EIA test/train tests still required/appropriate?

@zschira or @katie-lamb, what do you think about the test for duplicate FERC years within a plant ID? Should it be passing now?
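
For reference, strict xfail would make these XPASSes fail loudly instead of lingering. A minimal sketch (not necessarily how our markers are configured today; the reason string is illustrative):

```python
import pytest


# Sketch only: with strict=True, an unexpected pass (XPASS) is reported as a
# real test failure rather than being silently tolerated, so stale markers
# like these would surface immediately in the nightly builds.
@pytest.mark.xfail(
    strict=True,
    reason="Duplicate FERC years within a plant ID; remove once fixed.",
)
def test_dupe_years_in_plant_id_ferc1():
    ...
```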

katie-lamb commented 5 months ago

@zaneselvans A couple of things re test/integration/ferc1_eia_train_test.py:

katie-lamb commented 5 months ago

I'll let @zschira chime in on this as well, but I think the test_dupe_years_in_plant_id_ferc1 test should now legitimately be passing.

zaneselvans commented 5 months ago

@katie-lamb I don't understand what you're saying about the first XPASS. It's a real bug... but the test is passing?

It seems like these tests depend on substantial amounts of external real data right now. Are they intended only to test the functionality of restrict_train_connections_on_date_range() and validate_override_fixes()? If so, they could use entirely synthetic data and be turned into unit tests.

Are some of these xfail markers really checking for expected failures because we're giving them known-bad inputs that should fail? If so, we might want to change them into tests that pass when the expected exception is raised or the failure condition is observed. I couldn't tell from the messages what we actually expected to happen.
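
For example, something like this sketch. The import path, function signature, column names, and ValueError are all my assumptions, not the real API or behavior:

```python
import pandas as pd
import pytest

# Hypothetical import path, assumed for illustration only.
from pudl.analysis.ferc1_eia_train import validate_override_fixes


def test_validate_override_fixes_rejects_bad_input():
    # A deliberately bad override: the EIA record's year doesn't match the
    # report year, which the validation should catch and reject.
    bad_overrides = pd.DataFrame(
        {
            "record_id_eia_override_1": ["eia_plant_2020"],
            "record_id_ferc1": ["f1_plant_2019"],
            "report_year": [2019],
            "verified": [True],
        }
    )
    # Instead of xfail-ing the whole test on known-bad input, assert that
    # the validation raises the failure we expect.
    with pytest.raises(ValueError):
        validate_override_fixes(bad_overrides)
```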

zschira commented 5 months ago

test_dupe_years_in_plant_id_ferc1 should definitely pass now. I'll remove the xfail marker in my open PR.

katie-lamb commented 5 months ago

> It seems like these tests depend on substantial amounts of external real data right now. Are they intended only to test the functionality of restrict_train_connections_on_date_range() and validate_override_fixes()? If so, they could use entirely synthetic data and be turned into unit tests.

I didn't write these tests, so this is only my guess, but yes, I think they test the functionality of those functions and ideally would use entirely synthetic data and be turned into unit tests. This is what @aesharpe was referencing in the module docstring. The external real data they depend on is the training data (contained in the package_data), so they would need to be refactored so that synthetic training data could be passed in.
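
Roughly what I'd imagine a refactored unit test looking like. This is purely a sketch: the import path, function signature, and column names are my guesses at what an injectable version might look like, not the current API:

```python
import pandas as pd

# Hypothetical import path, assumed for illustration only.
from pudl.analysis.ferc1_eia_train import restrict_train_connections_on_date_range


def test_restrict_train_connections_on_date_range_synthetic():
    # Tiny synthetic training table standing in for the package_data.
    train_df = pd.DataFrame(
        {
            "record_id_ferc1": ["f1_a_2019", "f1_b_2021"],
            "record_id_eia": ["eia_a_2019", "eia_b_2021"],
            "report_year": [2019, 2021],
        }
    )
    out = restrict_train_connections_on_date_range(
        train_df, start_date="2020-01-01", end_date="2021-12-31"
    )
    # Only training connections inside the requested range should survive.
    assert out["report_year"].between(2020, 2021).all()
```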

zaneselvans commented 5 months ago

If these are really meant to be unit tests, that would be great!

Are there additional integration tests that should be run when all the data is processed? Or are there already defensive checks in the EIA-FERC entity matching code that make sure things aren't out of whack?

katie-lamb commented 5 months ago

Probably all the consistency checks (implemented as assertions in the FERC-to-EIA module) should be moved into a model validation module or integration tests, but I think we were planning to do that after the experiment tracking goes in.
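
Something shaped like this, maybe (all names here are hypothetical placeholders, not our actual code):

```python
import pandas as pd


# Placeholder sketch: pull an inline `assert` out of the FERC-EIA matching
# code into a named check that a model-validation module or an integration
# test can call after the full build.
def check_ferc1_matched_at_most_once(matches: pd.DataFrame) -> None:
    """Raise if any FERC 1 record was matched to more than one EIA record."""
    dupes = matches[matches.duplicated("record_id_ferc1", keep=False)]
    if not dupes.empty:
        raise AssertionError(
            f"{dupes['record_id_ferc1'].nunique()} FERC 1 records have "
            "multiple EIA matches."
        )
```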

katie-lamb commented 5 months ago

OK, I put the changes that need to be made to address the larger problem with this test into #3256.

If we feel strongly about making these tests XFAIL right now, then #3257 is a patch to fix the following:

test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified4-report_year4-record_id_eia_override_14-record_id_ferc14-utility_id_pudl_ferc14]
test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified5-report_year5-record_id_eia_override_15-record_id_ferc15-utility_id_pudl_ferc15]
test/integration/ferc1_eia_train_test.py::test_validate_override_fixes[verified6-report_year6-record_id_eia_override_16-record_id_ferc16-utility_id_pudl_ferc16]