scikit-learn-contrib / scikit-matter

A collection of scikit-learn compatible utilities that implement methods born out of the materials science and chemistry communities
https://scikit-matter.readthedocs.io/en/v0.2.0/
BSD 3-Clause "New" or "Revised" License

Fix import errors introduced by sklearn 1.5 #229

Closed. agoscinski closed this 4 months ago.

agoscinski commented 4 months ago

With sklearn version 1.5 the check_pandas_support function has been moved to utils._optional_dependencies. This PR adapts the import accordingly. I would still like to stay backwards compatible with 1.x, so I put an if/else around the import.
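A minimal sketch of the version-guarded import described above, assuming the pre-1.5 location was `sklearn.utils.check_pandas_support` and the new one is `sklearn.utils._optional_dependencies.check_pandas_support`; the exact fallback used in the PR may differ:

```python
try:
    # sklearn >= 1.5: function lives in the _optional_dependencies module
    from sklearn.utils._optional_dependencies import check_pandas_support
except ImportError:
    # sklearn < 1.5: function was exposed from sklearn.utils directly
    from sklearn.utils import check_pandas_support
```

A try/except on the import is the usual way to express this kind of compatibility shim, since it avoids parsing version strings.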



📚 Documentation preview 📚: https://scikit-matter--229.org.readthedocs.build/en/229/

PicoCentauri commented 4 months ago

Some issues seem to be unrelated. Do you have an idea, @agoscinski?

agoscinski commented 4 months ago

There is a test error in the DCH test. I assume we are testing something that is not well defined, like an orthogonal vector that can have sign flips, because the difference in the regression test is a sign flip. With a change in the numpy/scipy version the sign can then easily change. A projection of the function that needs to be interpolated already looks numerically unstable (see the attached debug_dch plot).

But honestly I don't know what exactly is happening. We could replace these random points with a smooth high-dimensional function and make this regression test a bit more extensive (testing all points, not only one), so we can be sure we did not fix this just by picking a point that works.

If we need a fast fix (to not block other PRs), I would put a warning into the code for now, because if something was broken, it was already broken before the sklearn 1.5.0 release.
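The sign ambiguity described above can be illustrated with plain numpy, independent of any scikit-matter code: singular (and hence principal) vectors are only defined up to sign, so different numpy/scipy versions are free to return either convention.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))

# Thin SVD: X == U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Flip the sign of one paired left/right singular vector.
U2, Vt2 = U.copy(), Vt.copy()
U2[:, 0] *= -1
Vt2[0, :] *= -1

# The reconstruction is unchanged, so both sign conventions are
# equally valid decompositions of the same data.
assert np.allclose(U @ np.diag(S) @ Vt, U2 @ np.diag(S) @ Vt2)
```

This is why a regression test that compares projected coordinates (or residuals derived from them) against stored reference values can break on a numpy/scipy upgrade even though nothing is mathematically wrong.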

ceriottm commented 4 months ago

> There is a test error in the DCH test. I assume we are testing something that is not well defined, like an orthogonal vector that can have sign flips, because the difference in the regression test is a sign flip. With a change in the numpy/scipy version the sign can then easily change. A projection of the function that needs to be interpolated already looks numerically unstable (see the attached debug_dch plot).
>
> But honestly I don't know what exactly is happening. We could replace these random points with a smooth high-dimensional function and make this regression test a bit more extensive (testing all points, not only one), so we can be sure we did not fix this just by picking a point that works.
>
> If we need a fast fix (to not block other PRs), I would put a warning into the code for now, because if something was broken, it was already broken before the sklearn 1.5.0 release.

Let me test this to see if I can understand what the problem is. I don't think we should merge when tests are broken; if there's an issue with the test logic, we should change the logic.

agoscinski commented 4 months ago

> Let me test this to see if I can understand what the problem is. I don't think we should merge when tests are broken; if there's an issue with the test logic, we should change the logic.

My point is that this PR has nothing to do with this issue; it just made it visible. But if you can fix it fast, sure, go ahead. I would just not wait another week, since other PRs are in the pipeline.

ceriottm commented 4 months ago

Fixed the test.

agoscinski commented 4 months ago

I am missing something. I can understand that PCA has a sign flip because of the orthogonalization, but how can the residual be negative if it was fitted on the same data (self.T), even if PCA has a sign flip?

agoscinski commented 4 months ago

I think my misunderstanding came from ignoring the fact that the convex hull is created on self.T and self.y, so the residuals in a subspace (of self.T) can be nonzero. A flip in the sign of the principal components can then actually affect the residuals.

agoscinski commented 4 months ago

Thanks @ceriottm for fixing it! I just rephrased the commit message a bit. I will merge as soon as CI passes.