Closed khaeru closed 7 months ago
Attention: Patch coverage is 99.53052%, with 1 line in your changes missing coverage. Please review.
Project coverage is 95.4%. Comparing base (0fb01b4) to head (edfb57d).
Thanks for opening this PR :) I don't immediately have time to address these, but here are the flaky tests I found since 11.01.2024:
[ ] FAILED message_ix/tests/test_feature_bound_activity_shares.py::test_add_bound_activity_up - assert False
For details, see the terminal output above, plus:
Listing : /Users/runner/work/message_ix/message_ix/message_ix/model/MESSAGE_run.lst
Input data: /Users/runner/work/message_ix/message_ix/message_ix/model /data/MsgData_Canningproblem(MESSAGE_scheme)_test_add_bound_activity_up_all_modes_cloned.gdx
On: macos-latest-py3.8, macos-latest-py3.11
[ ] FAILED message_ix/tests/test_core.py::test_excel_read_write - assert False
[ ] FAILED message_ix/tests/test_integration.py::test_run_clone - RuntimeError: unhandled Java exception: The index set 'year' does not have an element '2010'! On: macos-latest-py3.11 (x2)
[ ] FAILED message_ix/tests/test_legacy_version.py::test_solve_legacy_scenario - RuntimeError: unhandled Java exception: The index set 'year' does not have an element '1963'! On: windows-latest-py3.11, windows-latest-py3.10
[ ] ERROR message_ix/tests/test_macro.py::test_multiregion_derive_data_2 - RuntimeError: unhandled Java exception: The index set 'emission' does not have an element 'CO2'! On: macos-latest-py3.9
Great. We can add those to the PR checklist up top!
I kept them down here because I recently had to rework a PR and found editing the description tedious since it included a lot of outdated material, so wanted to avoid that here. But they are added now.
Once we address this, we should do something similar for ixmp. Today, the CI there is flaky enough to require four additional runs.
We can also look at the tests we marked as flaky in https://github.com/iiasa/message_ix/issues/731 and try to sustainably 'un-flake' them here.
The latest failures seem to be about these lines: https://github.com/iiasa/message_ix/blob/be4dda40691ee74c358c0cc35f0e7939dda8a42a/message_ix/tests/test_feature_bound_activity_shares.py#L75-L76 and https://github.com/iiasa/message_ix/blob/be4dda40691ee74c358c0cc35f0e7939dda8a42a/message_ix/tests/test_feature_bound_activity_shares.py#L98-L99
When adapting the tests, I had locally at one point scen.solve(quiet=True, lpmethod=2), which gave me the exact same error that we're seeing now here. However, on my local system the current version works without error.
The error marked as "Job 1" in the PR description also still appears here, but the other flakiness has not occurred again on the most recent runs.
We still get the same errors with the latest commit as described above, but in addition, we have
FAILED message_ix/tests/test_feature_bound_activity_shares.py::test_commodity_share_lo - assert False
+ where False = <function isclose at 0x000001C6C61DB130>(0.15384615384615385, 0.0)
+ where <function isclose at 0x000001C6C61DB130> = np.isclose
in windows-latest-py3.10 and a tutorial error I've never seen before.
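For reference, a minimal sketch of why this assertion cannot pass by loosening tolerances slightly. It assumes the test uses np.isclose with its default tolerances; the helper name below is illustrative, and the function is a scalar restatement of the comparison documented for np.isclose, not code from the test suite:

```python
def isclose_like_numpy(
    a: float, b: float, rtol: float = 1e-05, atol: float = 1e-08
) -> bool:
    """Scalar form of the check documented for np.isclose (finite inputs)."""
    return abs(a - b) <= atol + rtol * abs(b)


# Values copied from the failure message above
observed, expected = 0.15384615384615385, 0.0

# With expected == 0.0 the relative term vanishes, so only atol applies
result_default = isclose_like_numpy(observed, expected)

# Absolute tolerance that would be needed for the check to pass
needed_atol = abs(observed - expected)
```

The gap is several orders of magnitude above the default absolute tolerance, which suggests a genuinely different solution on that runner rather than numerical noise.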
Context for my choice of 'default' lpmethod value: https://www.gams.com/latest/docs/S_CPLEX.html#CPLEXlpmethod. Though I'm beginning to think it's not actually about this parameter.
One puzzle I ran into here: tests were failing on macOS runners due to a GAMS error at this line: https://github.com/iiasa/message_ix/blob/0fb01b485f8d054c9fe15eda23eee6c07ba2383a/message_ix/model/MESSAGE/data_load.gms#L6
The use of temporary paths and fixtures resulted in a long path %in%, such that the total length of the string passed to put_utility was more than 255 characters. This resulted in a GAMS compilation error and overall failure of the test.
The work-around was to shorten the test name.
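A rough sketch of how one might check in advance whether a test name risks hitting this limit. The 255-character figure comes from the comment above; the helper name, the stand-in temporary directory, and the short file stem are hypothetical, and the real string passed to put_utility may contain more than the bare path:

```python
from pathlib import PurePosixPath

# Limit reported in the comment above
GAMS_STRING_LIMIT = 255


def exceeds_limit(base_dir: PurePosixPath, file_stem: str, suffix: str = ".gdx") -> bool:
    """Return True if the full data-file path is longer than the limit."""
    return len(str(base_dir / f"{file_stem}{suffix}")) > GAMS_STRING_LIMIT


# Stand-in for a long temporary directory created by a pytest fixture
tmp_dir = PurePosixPath("/tmp") / ("d" * 200)

# File stem as it appears in the failure output, and a hypothetical shorter one
long_stem = (
    "MsgData_Canningproblem(MESSAGE_scheme)"
    "_test_add_bound_activity_up_all_modes_cloned"
)
short_stem = "MsgData_short"

long_is_too_long = exceeds_limit(tmp_dir, long_stem)
short_is_too_long = exceeds_limit(tmp_dir, short_stem)
```

Since the temporary directory is outside our control on the runners, the test name is the only part of the path we can shorten, which matches the work-around taken here.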
FAILED message_ix/tests/test_integration.py::test_run_remove_solution - assert False
+ where False = <function isclose at 0x106197490>(nan, 153.675)
+ where <function isclose at 0x106197490> = np.isclose
On: macos-latest-py3.10
This one hasn't been seen recently, but I note that we didn't make any particular change to address it. We can keep an eye open for it recurring in the future.
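If it does recur, one option is to make the failure message distinguish "no value at all" from "wrong value". NaN never compares as close to anything, so the assertion above fails regardless of tolerance. A minimal sketch using only the standard library; the helper name and messages are hypothetical and not part of the test suite:

```python
import math


def compare_objective(observed: float, expected: float, rel_tol: float = 1e-05) -> str:
    """Describe how an observed value relates to the expected one."""
    if math.isnan(observed):
        # NaN means there was no usable value to compare in the first place
        return "observed value is NaN"
    if math.isclose(observed, expected, rel_tol=rel_tol):
        return "ok"
    return f"mismatch: {observed} != {expected}"


# Values copied from the failure message above
nan_case = compare_objective(float("nan"), 153.675)
ok_case = compare_objective(153.675, 153.675)
```

A message like the first one would point towards the solution not being available when the value was read, rather than towards the solver returning a different optimum.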
This PR is to collect changes related to #776. As we notice tests "flaking", we can adjust them on this branch, and then merge once we feel we've addressed all observed flaky tests. The PR will thus remain open for a while until we're convinced we've reached that point.
test_add_bound_activity_up_all_modes fails with assert np.isclose(300.0, 332.5).
How to review
PR checklist
[x] Continuous integration checks all ✅
[x] Update release notes.
[x] FAILED message_ix/tests/test_feature_bound_activity_shares.py::test_add_bound_activity_up_all_modes - ixmp.model.base.ModelError: GAMS errored with return code 3: There was an execution error
For details, see the terminal output above, plus:
Listing : /Users/runner/work/message_ix/message_ix/message_ix/model/MESSAGE_run.lst
Input data: /Users/runner/work/message_ix/message_ix/message_ix/model /data/MsgData_Canningproblem(MESSAGE_scheme)_test_add_bound_activity_up_all_modes_cloned.gdx
On: macos-latest-py3.8, macos-latest-py3.11, macos-latest-py3.12