Teichlab / bbknn

Batch balanced KNN
MIT License
149 stars 25 forks source link

`ImportError` for `scikit_learn` >= 1.0.0 #54

Closed matchy233 closed 1 year ago

matchy233 commented 1 year ago

We are currently trying to import DistanceMetrics from sklearn.neighbors in bbknn/matrix.py (link to code) but this is only possible for scikit_learn<1.0.0. DistanceMetrics is moved to sklearn.metric in the later versions of sklearn.

I will raise a PR for fixing this via handling the ImportError -- but would like to hear opinions from the maintainers for sure!

ryan-williams commented 1 year ago

sklearn.neighbors.DistanceMetric was removed in sklearn 1.3.0 (released 2023-06-30; here it is in 1.2.2, gone in 1.3.0, removal commit).

Seems like you're supposed to use sklearn.metrics.DistanceMetric, which is present since sklearn 1.0.2.

A fix here could be to update the import, and constrain the sklearn requirement to >=1.0.2.

ryan-williams commented 1 year ago

For easier searching, the error that led me here was:

from sklearn.neighbors import DistanceMetric
# ImportError: cannot import name 'DistanceMetric' from 'sklearn.neighbors' (…/lib/python3.9/site-packages/sklearn/neighbors/__init__.py)

You either have to pin scikit-learn<1.3, or change the import:

-from sklearn.neighbors import DistanceMetric
+from sklearn.metrics import DistanceMetric
ktpolanski commented 1 year ago

Thank you both for bringing this to my attention. I've merged Ryan's pull request (sklearn 1.0.2 is over a year old at this point so it feels okay to require it), but will credit you both in the change log.

bio-la commented 1 year ago

Hi, thanks for the timely fix! please note that this is still an issue if installing bbknn from conda or pip.

ktpolanski commented 1 year ago

I think I pretty much just have docstring updates left in terms of tying up loose ends.

ktpolanski commented 1 year ago

Alright, 1.6.0 is now live on pip!

bio-la commented 1 year ago

it seems something is not right cause trying to run it after installation from pip results in segmentation fault. no other error messages are issued and I'm working on a subset of ~2000 cells so i don't really think it's a memory issue. any idea what to check? thanks @ktpolanski !!

ktpolanski commented 1 year ago

It's probably annoy acting up. Try to run with computation="cKDTree" and/or install a different annoy version

bio-la commented 1 year ago

yes i somehow was breaking some dependencies, i installed all in a new conda env and it works. WARNING: consider updating your call to make use ofcomputation` is indeed sorted with "cKDTree" (but not if calling bbknn fromscanpy.external` so that might need updating) thank you!