Closed PGijsbers closed 3 weeks ago
edit: no longer relevant
Test failures seem to be caused by exceeding the 10-minute time limit we have set. I am not sure why these tests take so long, but it looks like they fail and are then retried. Their failure is likely deterministic, however, so I will go ahead and disable retries so that the tests hard-fail instead, hopefully with a more meaningful error message.
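Assuming the retry and timeout behaviour come from the pytest-rerunfailures and pytest-timeout plugins (hypothetical; the PR's actual CI configuration may differ), disabling retries while keeping the 10-minute limit could be a config change like:

```toml
# Hypothetical pyproject.toml fragment: keep the 10-minute per-test limit
# but drop the "--reruns N" option so a failing test hard-fails immediately.
[tool.pytest.ini_options]
addopts = "--timeout=600"
```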
Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
Project coverage is 83.95%. Comparing base (923b49d) to head (2c161e4).
Files | Patch % | Lines |
---|---|---|
openml/extensions/sklearn/extension.py | 60.00% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
Tests are incredibly slow, and I am not sure why. This also happens locally, so it's not just the runners; I suspect it's the server.
edit: it looks like the culprit is mostly the production server :( (most, if not all, of these long tests use prod)
edit: message outdated, pr now fully supports numpy 2.0.
old message:
On NumPy 2.0 support/scikit-learn support (from #1341):
NumPy 2.0 cleaned up its namespace, so we have to alias removed names when running on the new version. The problem is that this is not covered in CI (the tested versions do not support NumPy 2.0), and in general the tests need to be updated to support more recent scikit-learn versions that support 2.0 (1.5, maybe 1.4?). I don't know if I really have time for that right now. Options:
- Put a pin in the dependencies
- Accept this as a best-effort, and hope that the tests that don't fail due to scikit-learn incompatibility sufficiently cover the code to raise any numpy1/2 issues
- Wait until I find time to make tests support scikit-learn 1.5
But I am currently actually leaning towards putting in a bit of extra time to make sure we can test on later scikit-learn versions.
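To illustrate the kind of aliasing the namespace cleanup requires: a minimal sketch (not the PR's actual code; the constant names here are made up for illustration) that defines one spelling for names NumPy 2.0 removed, so the same code runs on both major versions:

```python
# Hypothetical compatibility shim for names removed in NumPy 2.0.
# NP_FLOAT and NP_NAN are illustrative names, not openml-python API.
import numpy as np
from numpy.lib import NumpyVersion

if NumpyVersion(np.__version__) >= "2.0.0":
    # NumPy 2.0 dropped these aliases from the main namespace.
    NP_FLOAT = np.float64  # np.float_ was removed
    NP_NAN = np.nan        # np.NaN was removed
else:
    NP_FLOAT = np.float_
    NP_NAN = np.NaN
```

Code that previously referenced `np.float_` or `np.NaN` directly can then use the shimmed names regardless of the installed NumPy major version.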
@LennartPurucker I pinged you for a review since at this point I am unsure what to do. The work is done, kind of. Sometimes the tests still fail sporadically because the pytest workers get killed on GitHub Actions; I have not had this happen locally, and it's not consistent, as they can also pass. At this point there are already large improvements to the tests and CI, so it's probably worth merging. From there, I can work on new features/updates as well as debugging, but then at least that can happen in parallel and be discussed in separate, smaller PRs. What do you think?
edit: I suspect this may also be due to the timeout, though previously I have had clearer messages when the timeout functionality was triggered. I'll remove the timeouts so we can compare CI results.
edit2: The one (sporadically) failing test now fails due to an internal timeout. I think it is fairly safe to say this has to be a server-side issue, since the same test passes on other instances.
This PR updates the tests to be compatible with newer versions of scikit-learn. In the process, it fixes some bugs and adds NumPy 2.0 support. I updated the CI matrix to include newer versions of scikit-learn. For the most part, openml-python works fine on newer versions; it was just that the unit tests themselves had hard dependencies on version specifics. I extended/updated the tests in most cases, but in a few it was a little unreasonable (they need to be refactored to be less sensitive to small scikit-learn API changes). I have already spent quite a bit of time on this, so I hope the proposed changes are good enough for now so that we can work on a 0.15.0 release.
The docs fail in the check-links step. I propose to fix this in a separate PR.