KrishnaswamyLab / PHATE

PHATE (Potential of Heat-diffusion for Affinity-based Transition Embedding) is a tool for visualizing high dimensional data.
http://phate.readthedocs.io

Error instantiating PHATE #111

Closed wumirose closed 2 years ago

wumirose commented 2 years ago

Describe the bug
I tried to instantiate and use a PHATE object with:
data_phate = phate.PHATE().fit_transform(bmmsc_data)

To Reproduce
https://colab.research.google.com/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb#scrollTo=398dus7w7tc9

Expected behavior
Calculating PHATE...
Running PHATE on 2416 cells and 10782 genes.
Calculating graph and diffusion operator...
Calculating PCA...
Calculated PCA in 5.65 seconds.
Calculating KNN search...
Calculated KNN search in 0.81 seconds.
Calculating affinities...
Calculated affinities in 0.03 seconds.
Calculated graph and diffusion operator in 6.66 seconds.
Calculating landmark operator...
Calculating SVD...
Calculated SVD in 0.30 seconds.
Calculating KMeans...
Calculated KMeans in 24.72 seconds.
Calculated landmark operator in 26.40 seconds.
Calculating optimal t...
Calculated optimal t in 6.88 seconds.
Calculating diffusion potential...
Calculated diffusion potential in 2.86 seconds.
Calculating metric MDS...
Calculated metric MDS in 37.48 seconds.
Calculated PHATE in 80.29 seconds.

Actual behavior
Calculating PHATE...
Running PHATE on 2416 observations and 10782 variables.
Calculating graph and diffusion operator...
Calculating PCA...
Calculated PCA in 2.50 seconds.
Calculating KNN search...
Calculated KNN search in 0.50 seconds.
Calculating affinities...
Calculated affinities in 0.11 seconds.
Calculated graph and diffusion operator in 3.17 seconds.
Calculating landmark operator...
Calculating SVD...
Calculated SVD in 0.12 seconds.
Calculating KMeans...
Calculated KMeans in 0.81 seconds.
Calculated landmark operator in 0.93 seconds.
Calculated PHATE in 4.10 seconds.

AttributeError Traceback (most recent call last)
~/miniforge3/lib/python3.9/site-packages/graphtools/graphs.py in landmark_op(self)
    590         try:
--> 591             return self._landmark_op
    592         except AttributeError:

AttributeError: 'kNNLandmarkGraph' object has no attribute '_landmark_op'

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
/var/folders/wp/mg6w1m053n32d4ln0wv61n6w0000gn/T/ipykernel_40966/2632628218.py in <module>
      1 phate_op = phate.PHATE()
----> 2 data_phate = phate_op.fit_transform(bmmsc_data)

~/.local/lib/python3.9/site-packages/phate/phate.py in fit_transform(self, X, **kwargs)
    959         """
    960         with _logger.task("PHATE"):
--> 961             self.fit(X)
    962             embedding = self.transform(**kwargs)
    963         return embedding

~/.local/lib/python3.9/site-packages/phate/phate.py in fit(self, X)
    855
    856         # landmark op doesn't build unless forced
--> 857         self.diff_op
    858         return self
    859

~/.local/lib/python3.9/site-packages/phate/phate.py in diff_op(self)
    279         if self.graph is not None:
    280             if isinstance(self.graph, graphtools.graphs.LandmarkGraph):
--> 281                 diff_op = self.graph.landmark_op
    282             else:
    283                 diff_op = self.graph.diff_op

~/miniforge3/lib/python3.9/site-packages/graphtools/graphs.py in landmark_op(self)
    591             return self._landmark_op
    592         except AttributeError:
--> 593             self.build_landmark_op()
    594             return self._landmark_op
    595

~/miniforge3/lib/python3.9/site-packages/graphtools/graphs.py in build_landmark_op(self)
    670                 random_state=self.random_state,
    671             )
--> 672             self._clusters = kmeans.fit_predict(self.diff_op.dot(VT.T))
    673
    674             # transition matrices

~/miniforge3/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in fit_predict(self, X, y, sample_weight)
   1253             Index of the cluster each sample belongs to.
   1254         """
-> 1255         return self.fit(X, sample_weight=sample_weight).labels_
   1256
   1257     def fit_transform(self, X, y=None, sample_weight=None):

~/miniforge3/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in fit(self, X, y, sample_weight)
   1940
   1941             # Compute inertia on a validation set.
-> 1942             _, inertia = _labels_inertia_threadpool_limit(
   1943                 X_valid,
   1944                 sample_weight_valid,

~/miniforge3/lib/python3.9/site-packages/sklearn/cluster/_kmeans.py in _labels_inertia_threadpool_limit(X, sample_weight, x_squared_norms, centers, n_threads)
    753 ):
    754     """Same as _labels_inertia but in a threadpool_limits context."""
--> 755     with threadpool_limits(limits=1, user_api="blas"):
    756         labels, inertia = _labels_inertia(
    757             X, sample_weight, x_squared_norms, centers, n_threads

~/miniforge3/lib/python3.9/site-packages/sklearn/utils/fixes.py in threadpool_limits(limits, user_api)
    312         return controller.limit(limits=limits, user_api=user_api)
    313     else:
--> 314         return threadpoolctl.threadpool_limits(limits=limits, user_api=user_api)
    315
    316

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in __init__(self, limits, user_api)
    169             self._check_params(limits, user_api)
    170
--> 171         self._original_info = self._set_threadpool_limits()
    172
    173     def __enter__(self):

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in _set_threadpool_limits(self)
    266             return None
    267
--> 268         modules = _ThreadpoolInfo(prefixes=self._prefixes,
    269                                   user_api=self._user_api)
    270         for module in modules:

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in __init__(self, user_api, prefixes, modules)
    338
    339             self.modules = []
--> 340             self._load_modules()
    341             self._warn_if_incompatible_openmp()
    342         else:

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in _load_modules(self)
    369         """Loop through loaded libraries and store supported ones"""
    370         if sys.platform == "darwin":
--> 371             self._find_modules_with_dyld()
    372         elif sys.platform == "win32":
    373             self._find_modules_with_enum_process_module_ex()

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in _find_modules_with_dyld(self)
    426
    427             # Store the module if it is supported and selected
--> 428             self._make_module_from_path(filepath)
    429
    430     def _find_modules_with_enum_process_module_ex(self):

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in _make_module_from_path(self, filepath)
    513                 if prefix in self.prefixes or user_api in self.user_api:
    514                     module_class = globals()[module_class]
--> 515                     module = module_class(filepath, prefix, user_api, internal_api)
    516                     self.modules.append(module)
    517

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in __init__(self, filepath, prefix, user_api, internal_api)
    604         self.internal_api = internal_api
    605         self._dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
--> 606         self.version = self.get_version()
    607         self.num_threads = self.get_num_threads()
    608         self._get_extra_info()

~/miniforge3/lib/python3.9/site-packages/threadpoolctl.py in get_version(self)
    644                              lambda: None)
    645         get_config.restype = ctypes.c_char_p
--> 646         config = get_config().split()
    647         if config[0] == b"OpenBLAS":
    648             return config[1].decode("utf-8")

AttributeError: 'NoneType' object has no attribute 'split'
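
The final frame shows where the chain ends: `get_config()` returned `None`, and calling `.split()` on that value raised the `AttributeError`. The snippet below is only an illustrative sketch of that failure mode with a defensive guard. `parse_openblas_version` is a hypothetical helper and the sample config string is made up; this is not threadpoolctl's actual code, nor a claim about how the library later fixed the problem.

```python
from typing import Optional


def parse_openblas_version(config: Optional[bytes]) -> Optional[str]:
    """Extract the version from an OpenBLAS config string, if there is one.

    Hypothetical helper for illustration; not part of threadpoolctl.
    """
    if config is None:
        # The unguarded call in the traceback, `get_config().split()`,
        # fails at exactly this point when no config string is returned.
        return None
    parts = config.split()
    if len(parts) >= 2 and parts[0] == b"OpenBLAS":
        return parts[1].decode("utf-8")
    return None


# Example config string, made up for this sketch.
print(parse_openblas_version(b"OpenBLAS 0.3.18 NO_AFFINITY"))
print(parse_openblas_version(None))
```

With a guard like this, a missing config string yields `None` for the version instead of an exception.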

System information:

Output of phate.__version__: '1.0.7'

Output of pd.show_versions():

INSTALLED VERSIONS

commit : 5f648bf1706dd75a9ca0d29f26eadfbb595fe52b
python : 3.9.7.final.0
python-bits : 64
OS : Darwin
OS-release : 21.2.0
Version : Darwin Kernel Version 21.2.0: Sun Nov 28 20:29:10 PST 2021; root:xnu-8019.61.5~1/RELEASE_ARM64_T8101
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.2
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 60.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : 4.4.0
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.6.5
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : 7.29.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2022.01.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : 2.8.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 7.0.0
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : 0.53.0

Additional context
Any help will be greatly appreciated.

scottgigante commented 2 years ago

I just ran this notebook and it works perfectly. It appears this was a sklearn or threadpoolctl bug that has since been fixed.

wumirose commented 2 years ago

I ran these two commands

!pip install threadpoolctl
!pip install sklearn

...and all requirements were already satisfied. However, the error persists.

AttributeError: 'NoneType' object has no attribute 'split'

I have just started my journey into bioinformatics and computational biology, and PHATE is one of the amazing tools my professor recently introduced. I would appreciate any help getting it running on my MacBook.

scottgigante commented 2 years ago

Try running pip install --upgrade threadpoolctl scikit-learn instead. If that doesn't work, please write a minimum reproducible example.
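
Two details in the thread make this suggestion worth spelling out. A plain `pip install` does nothing when some version of a package is already present, which matches the "requirements already satisfied" result reported above. The traceback also shows packages loaded from both `~/.local/lib/python3.9/site-packages` and `~/miniforge3/lib/python3.9/site-packages`. The sketch below assumes it is run in the same notebook kernel that raises the error: it reports the installed versions and builds an upgrade command bound to that kernel's own interpreter. The helper names are hypothetical, and whether an interpreter mismatch played any role here is not established in the thread.

```python
import sys
from importlib import metadata
from typing import List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def upgrade_command(*dist_names: str) -> List[str]:
    """Build a pip upgrade command for the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", *dist_names]


# Distribution names as used in the suggested command above.
packages = ("threadpoolctl", "scikit-learn")
for name in packages:
    print(name, installed_version(name))

# Pass this list to subprocess.run(...) to perform the upgrade.
print(upgrade_command(*packages))
```

Comparing the printed versions before and after the upgrade (and after restarting the kernel) shows whether the upgrade actually reached the environment the notebook uses.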

wumirose commented 2 years ago

> Try running pip install --upgrade threadpoolctl scikit-learn instead. If that doesn't work, please write a minimum reproducible example.

Thanks.

I did data_phate = phate_op.fit_transform(bmmsc_data) in the tutorial notebook, yet I got the same error

AttributeError: 'NoneType' object has no attribute 'split'.

I used another dataset and surprisingly it worked!