lmcinnes / pynndescent

A Python nearest neighbor descent for approximate nearest neighbors
BSD 2-Clause "Simplified" License

DistributionNotFound: The 'pynndescent' distribution was not found and is required by the application #144

Open yjkweon24 opened 3 years ago

yjkweon24 commented 3 years ago

I am using Python 3.8.10 with umap-learn 0.5.0, scanpy 1.8.1, pynndescent 0.5.2+computecanada, numba 0.51.2, and llvmlite 0.34.0+computecanada. I am running sc.pp.neighbors() from the scanpy module on a Compute Canada server. By the way, I am using macOS. This is the error I get:


DistributionNotFound                      Traceback (most recent call last)
/tmp/ipykernel_30555/956378397.py in <module>
----> 1 sc.pp.neighbors(aAMLallsc, n_neighbors = 6, n_pcs =3)

~/.local/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in neighbors(adata, n_neighbors, n_pcs, use_rep, knn, random_state, method, metric, metric_kwds, key_added, copy)
    137         adata._init_as_actual(adata.copy())
    138     neighbors = Neighbors(adata)
--> 139     neighbors.compute_neighbors(
    140         n_neighbors=n_neighbors,
    141         knn=knn,

~/.local/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in compute_neighbors(self, n_neighbors, knn, n_pcs, use_rep, method, random_state, write_knn_indices, metric, metric_kwds)
    806             # we need self._distances also for method == 'gauss' if we didn't
    807             # use dense distances
--> 808             self._distances, self._connectivities = _compute_connectivities_umap(
    809                 knn_indices,
    810                 knn_distances,

~/.local/lib/python3.8/site-packages/scanpy/neighbors/__init__.py in _compute_connectivities_umap(knn_indices, knn_dists, n_obs, n_neighbors, set_op_mix_ratio, local_connectivity)
    385     # umap 0.5.0
    386     warnings.filterwarnings("ignore", message=r"Tensorflow not installed")
--> 387     from umap.umap_ import fuzzy_simplicial_set
    388
    389     X = coo_matrix(([], ([], [])), shape=(n_obs, 1))

~/.local/lib/python3.8/site-packages/umap/__init__.py in <module>
      1 from warnings import warn, catch_warnings, simplefilter
----> 2 from .umap_ import UMAP
      3
      4 try:
      5     with catch_warnings():

~/.local/lib/python3.8/site-packages/umap/umap_.py in <module>
     45 )
     46
---> 47 from pynndescent import NNDescent
     48 from pynndescent.distances import named_distances as pynn_named_distances
     49 from pynndescent.sparse import sparse_named_distances as pynn_sparse_named_distances

~/.local/lib/python3.8/site-packages/pynndescent/__init__.py in <module>
      6 numba.config.THREADING_LAYER = "workqueue"
      7
----> 8 __version__ = pkg_resources.get_distribution("pynndescent").version

~/jupyter_py3/lib/python3.8/site-packages/pkg_resources/__init__.py in get_distribution(dist)
    480         dist = Requirement.parse(dist)
    481     if isinstance(dist, Requirement):
--> 482         dist = get_provider(dist)
    483     if not isinstance(dist, Distribution):
    484         raise TypeError("Expected string, Requirement, or Distribution", dist)

~/jupyter_py3/lib/python3.8/site-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    356     """Return an IResourceProvider for the named module or requirement"""
    357     if isinstance(moduleOrReq, Requirement):
--> 358         return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
    359     try:
    360         module = sys.modules[moduleOrReq]

~/jupyter_py3/lib/python3.8/site-packages/pkg_resources/__init__.py in require(self, *requirements)
    899         included, even if they were already activated in this working set.
    900         """
--> 901         needed = self.resolve(parse_requirements(requirements))
    902
    903         for dist in needed:

~/jupyter_py3/lib/python3.8/site-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    785                 if dist is None:
    786                     requirers = required_by.get(req, None)
--> 787                 raise DistributionNotFound(req, requirers)
    788             to_activate.append(dist)
    789             if dist not in req:

DistributionNotFound: The 'pynndescent' distribution was not found and is required by the application

jamestwebber commented 2 years ago

~/.local/lib/python3.8/site-packages/pynndescent/__init__.py in <module>
      6 numba.config.THREADING_LAYER = "workqueue"
      7
----> 8 __version__ = pkg_resources.get_distribution("pynndescent").version

~/jupyter_py3/lib/python3.8/site-packages/pkg_resources/__init__.py in get_distribution(dist)

Looking at these two lines, it appears that Python suddenly found a different site-packages directory to look in, and pynndescent is not installed in that directory. I'm not sure how that could happen, but it seems like an environment issue.
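One way to check the environment theory is to compare where the importable module lives with where its installed metadata is found. The sketch below is only a diagnostic aid: `locate_package` is a hypothetical helper name, and it uses the standard-library `importlib.metadata` (Python 3.8+) rather than `pkg_resources`, so its answer may differ from what `pkg_resources` sees in the same session.

```python
import importlib.util
import sys
from importlib import metadata  # standard library on Python 3.8+


def locate_package(name):
    """Report where the module and the installed metadata for `name` live.

    Hypothetical helper for debugging; not part of pynndescent or scanpy.
    """
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(name)
    module_origin = spec.origin if spec is not None else None

    try:
        dist = metadata.distribution(name)
        version = dist.version
        # Directory that contains the distribution's installed files.
        metadata_root = str(dist.locate_file(""))
    except metadata.PackageNotFoundError:
        # The code may be importable even though no metadata is visible.
        version = None
        metadata_root = None

    return {"module": module_origin, "version": version, "metadata_root": metadata_root}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for entry in sys.path:
        print("sys.path entry:", entry)
    print(locate_package("pynndescent"))
```

If `module` points into one site-packages directory (for example `~/.local/...`) while `metadata_root` is `None` or points somewhere else, the module and its metadata are being resolved from different places, which would be consistent with a mixed environment.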

adamgayoso commented 2 years ago

Something similar is happening in Google Colab:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _dep_map(self)
   3015     PKG_INFO = 'METADATA'
-> 3016     EQEQ = re.compile(r"([\(,])\s*(\d.*?)\s*([,\)])")
   3017

19 frames
/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in __getattr__(self, attr)
   2812         if self.location:
-> 2813             return "%s (%s)" % (self, self.location)
   2814         else:

AttributeError: _DistInfoDistribution__dep_map

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _parsed_pkg_info(self)
   3006             self._version = md_version
-> 3007             return self
   3008

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in __getattr__(self, attr)
   2812         if self.location:
-> 2813             return "%s (%s)" % (self, self.location)
   2814         else:

AttributeError: _pkg_info

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-…> in <module>()
      1 # run PCA then generate UMAP plots
      2 sc.tl.pca(adata, svd_solver='arpack')
----> 3 sc.pp.neighbors(adata, n_pcs=30, n_neighbors=20)
      4 sc.tl.umap(adata, min_dist=0.3)

/usr/local/lib/python3.7/dist-packages/scanpy/neighbors/__init__.py in neighbors(adata, n_neighbors, n_pcs, use_rep, knn, random_state, method, metric, metric_kwds, key_added, copy)
    145         metric=metric,
    146         metric_kwds=metric_kwds,
--> 147         random_state=random_state,
    148     )
    149

/usr/local/lib/python3.7/dist-packages/scanpy/neighbors/__init__.py in compute_neighbors(self, n_neighbors, knn, n_pcs, use_rep, method, random_state, write_knn_indices, metric, metric_kwds)
    790                 metric = 'precomputed'
    791             knn_indices, knn_distances, forest = compute_neighbors_umap(
--> 792                 X, n_neighbors, random_state, metric=metric, metric_kwds=metric_kwds
    793             )
    794             # very cautious here

/usr/local/lib/python3.7/dist-packages/scanpy/neighbors/__init__.py in compute_neighbors_umap(X, n_neighbors, random_state, metric, metric_kwds, angular, verbose)
    299         # umap 0.5.0
    300         warnings.filterwarnings("ignore", message=r"Tensorflow not installed")
--> 301         from umap.umap_ import nearest_neighbors
    302
    303     random_state = check_random_state(random_state)

/usr/local/lib/python3.7/dist-packages/umap/__init__.py in <module>()
      1 from warnings import warn, catch_warnings, simplefilter
----> 2 from .umap_ import UMAP
      3
      4 try:
      5     with catch_warnings():

/usr/local/lib/python3.7/dist-packages/umap/umap_.py in <module>()
     45 )
     46
---> 47 from pynndescent import NNDescent
     48 from pynndescent.distances import named_distances as pynn_named_distances
     49 from pynndescent.sparse import sparse_named_distances as pynn_sparse_named_distances

/usr/local/lib/python3.7/dist-packages/pynndescent/__init__.py in <module>()
     13 numba.config.THREADING_LAYER = "workqueue"
     14
---> 15 __version__ = pkg_resources.get_distribution("pynndescent").version

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in get_distribution(dist)
    464
    465
--> 466 def get_distribution(dist):
    467     """Return a current distribution object for a Requirement or string"""
    468     if isinstance(dist, str):

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
    340     """
    341     _provider_factories[loader_type] = provider_factory
--> 342
    343
    344 def get_provider(moduleOrReq):

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in require(self, *requirements)
    884
    885         `requirements` must be a string or a (possibly-nested) sequence
--> 886         thereof, specifying the distributions and versions required. The
    887         return value is a sequence of the distributions that needed to be
    888         activated to fulfill the requirements; all relevant distributions are

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
    778             to_activate.append(dist)
    779             if dist not in req:
--> 780                 # Oops, the "best" so far conflicts with a dependency
    781                 dependent_req = required_by[req]
    782                 raise VersionConflict(dist, req).with_context(dependent_req)

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in requires(self, extras)
   2732                 )
   2733                 if fails_marker:
-> 2734                     reqs = []
   2735                 new_extra = safe_extra(new_extra) or None
   2736

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _dep_map(self)
   3016     EQEQ = re.compile(r"([\(,])\s*(\d.*?)\s*([,\)])")
   3017
-> 3018     @property
   3019     def _parsed_pkg_info(self):
   3020         """Parse and cache metadata"""

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _compute_dependencies(self)
   3025             self._pkg_info = email.parser.Parser().parsestr(metadata)
   3026             return self._pkg_info
-> 3027
   3028     @property
   3029     def _dep_map(self):

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _parsed_pkg_info(self)
   3007             return self
   3008
-> 3009
   3010 class DistInfoDistribution(Distribution):
   3011     """

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in get_metadata(self, name)
   1405         path = self._get_metadata_path(name)
   1406         return self._has(path)
-> 1407
   1408     def get_metadata(self, name):
   1409         if not self.egg_info:

/usr/local/lib/python3.7/dist-packages/pkg_resources/__init__.py in _get(self, path)
   1609     def _listdir(self, path):
   1610         return os.listdir(path)
-> 1611
   1612     def get_resource_stream(self, manager, resource_name):
   1613         return open(self._fn(self.module_path, resource_name), 'rb')

FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.7/dist-packages/setuptools-57.4.0.dist-info/METADATA'
```

jamestwebber commented 2 years ago

Huh, I've never seen dist-packages before; I guess that's a Debian thing. At least in this case it's only looking in one location. Perhaps pynndescent was installed to site-packages?

adamgayoso commented 2 years ago

The strange thing is that if you run it again and again, it eventually works...

Maybe the version lookup can be switched to something like this?

https://github.com/YosefLab/scvi-tools/blob/188de4b35f8b9c12dbfa52de65773463c1ba7048/scvi/__init__.py#L14-L19

jamestwebber commented 2 years ago

Ah, something like #137 ? 🙂

adamgayoso commented 2 years ago

@jamestwebber YES! Would strongly favor this to get in + release :)

jamestwebber commented 2 years ago

Sadly I do not have the power and @lmcinnes is a busy guy, but hopefully he can merge it before the next release.