ShobiStassen / PARC

Error when running UMAP #16

Open rjesud opened 3 years ago

rjesud commented 3 years ago

Hi, great package! I am trying to run PARC's UMAP implementation, but I receive the following error: [image]

It appears that this issue has popped up here: https://github.com/theislab/scanpy/issues/1579

I tried rolling back to umap-learn==0.4.5, but then I get an error when importing PARC.

Any help would be greatly appreciated.

My venv:

_libgcc_mutex 0.1 main
backcall 0.2.0
blas 1.0 mkl
ca-certificates 2021.5.25 h06a4308_1
certifi 2021.5.30 py37h06a4308_0
cycler 0.10.0 py37_0
dbus 1.13.18 hb2f20db_0
decorator 5.0.9
expat 2.4.1 h2531618_2
fontconfig 2.13.1 h6c09931_0
freetype 2.10.4 h5ab3b9f_0
glib 2.68.2 h36276a3_0
gst-plugins-base 1.14.0 h8213a91_2
gstreamer 1.14.0 h28cd5cc_2
hnswlib 0.5.1
icu 58.2 he6710b0_3
intel-openmp 2021.2.0 h06a4308_610
ipykernel 5.5.5
ipython 7.24.1
ipython-genutils 0.2.0
jedi 0.18.0
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9b h024ee3a_2
jupyter-client 6.1.12
jupyter-core 4.7.1
kiwisolver 1.3.1 py37h2531618_0
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.33.1 h53a641e_7
leidenalg 0.8.4
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libpng 1.6.37 hbc83047_0
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.2.0 h85742a9_0
libuuid 1.0.3 h1bed415_2
libwebp-base 1.2.0 h27cfd23_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.10 hb55368b_3
llvmlite 0.36.0
lz4-c 1.9.3 h2531618_0
matplotlib 3.3.4 py37h06a4308_0
matplotlib-base 3.3.4 py37h62a2d02_0
matplotlib-inline 0.1.2
mkl 2021.2.0 h06a4308_296
mkl-service 2.3.0 py37h27cfd23_1
mkl_fft 1.3.0 py37h42c9631_2
mkl_random 1.2.1 py37ha9443f7_2
ncurses 6.2 he6710b0_1
numba 0.53.1
numpy 1.20.2 py37h2d18471_0
numpy-base 1.20.2 py37hfae3a4d_0
olefile 0.46 py_0
openssl 1.1.1k h27cfd23_0
pandas 1.2.4 py37h2531618_0
parc 0.31
parso 0.8.2
pcre 8.44 he6710b0_0
pexpect 4.8.0
pickleshare 0.7.5
pillow 8.2.0 py37he98fc37_0
pip 21.1.1 py37h06a4308_0
prompt-toolkit 3.0.18
ptyprocess 0.7.0
pybind11 2.6.2 py37hff7bd54_1
Pygments 2.9.0
pynndescent 0.5.2
pyparsing 2.4.7 pyhd3eb1b0_0
pyqt 5.9.2 py37h05f1152_2
python 3.7.10 hdb3f193_0
python-dateutil 2.8.1 pyhd3eb1b0_0
python-igraph 0.9.4
pytz 2021.1 pyhd3eb1b0_0
pyzmq 22.1.0
qt 5.9.7 h5867ecd_1
readline 8.1 h27cfd23_0
scikit-learn 0.24.2 py37ha9443f7_0
scipy 1.6.2 py37had2a1c9_1
setuptools 52.0.0 py37h06a4308_0
sip 4.19.8 py37hf484d3e_0
six 1.15.0 pyhd3eb1b0_0
sqlite 3.35.4 hdfb4753_0
texttable 1.6.3
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 hbc83047_0
tornado 6.1 py37h27cfd23_0
traitlets 5.0.5
umap-learn 0.5.1
wcwidth 0.2.5
wheel 0.36.2 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
zstd 1.4.9 haebb681_0

ShobiStassen commented 3 years ago

Hi, I have the following suggestion until I get around to a proper fix. I just installed and ran PARC with the hybrid UMAP construction in a clean environment, and got it to work by ensuring that 1) numba is version 0.49.1 and 2) umap-learn is version 0.4.3.

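import parc
import matplotlib.pyplot as plt

from sklearn import datasets

# Load the iris data set as a small test case
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Run PARC clustering and print the result summary
p1 = parc.PARC(X, true_label=y, too_big_factor=0.3, resolution_parameter=1, keep_all_local_dist=True)
p1.run_PARC()
print(type(p1.labels), p1.stats_df)

# Build the full kNN graph and compute the UMAP embedding from it
graph = p1.knngraph_full()
X_umap = p1.run_umap_hnsw(X, graph)

# Plot the embedding coloured by PARC labels, then by the true labels
plt.scatter(X_umap[:, 0], X_umap[:, 1], c=p1.labels)
plt.show()
plt.scatter(X_umap[:, 0], X_umap[:, 1], c=y)
plt.show()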

(ParcEnv2021) shobi@shobi:~$ conda list
# packages in environment at /home/shobi/anaconda3/envs/ParcEnv2021:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
ca-certificates 2021.5.30 ha878542_0 conda-forge
certifi 2021.5.30 py37h89c1867_0 conda-forge
cycler 0.10.0 pypi_0 pypi
hnswlib 0.5.1 pypi_0 pypi
joblib 1.0.1 pypi_0 pypi
kiwisolver 1.3.1 pypi_0 pypi
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
leidenalg 0.7.0 pypi_0 pypi
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_19 conda-forge
libgomp 9.3.0 h2828fa1_19 conda-forge
libstdcxx-ng 9.3.0 h6de172a_19 conda-forge
llvmlite 0.32.1 pypi_0 pypi
matplotlib 3.4.2 pypi_0 pypi
ncurses 6.2 h58526e2_4 conda-forge
numba 0.49.1 pypi_0 pypi
numpy 1.20.3 pypi_0 pypi
openssl 1.1.1k h7f98852_0 conda-forge
pandas 1.2.4 pypi_0 pypi
parc 0.31 pypi_0 pypi
pillow 8.2.0 pypi_0 pypi
pip 21.1.2 pyhd8ed1ab_0 conda-forge
pybind11 2.6.2 pypi_0 pypi
pyparsing 2.4.7 pypi_0 pypi
python 3.7.10 hffdb5ce_100_cpython conda-forge
python-dateutil 2.8.1 pypi_0 pypi
python-igraph 0.9.4 pypi_0 pypi
python_abi 3.7 1_cp37m conda-forge
pytz 2021.1 pypi_0 pypi
readline 8.1 h46c0cb4_0 conda-forge
scikit-learn 0.24.2 pypi_0 pypi
scipy 1.6.3 pypi_0 pypi
setuptools 49.6.0 py37h89c1867_3 conda-forge
six 1.16.0 pypi_0 pypi
sqlite 3.35.5 h74cdb3f_0 conda-forge
tbb 2021.2.0 pypi_0 pypi
texttable 1.6.3 pypi_0 pypi
threadpoolctl 2.1.0 pypi_0 pypi
tk 8.6.10 h21135ba_1 conda-forge
umap-learn 0.4.3 pypi_0 pypi
wheel 0.36.2 pyhd3deb0d_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge

ShobiStassen commented 3 years ago

These are the commands in a new venv:

conda create --name ParcEnv2021 python=3.7
conda install pip
pip install matplotlib
pip install umap-learn==0.4.3
pip install numba==0.49.1
pip install leidenalg  # 0.7.0 is faster, so consider using version 0.7.0 for large data
pip install hnswlib
pip install python-igraph
pip install parc
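Before running PARC's UMAP step, it may help to confirm that the active environment really has the two pinned versions suggested in this thread (numba 0.49.1 and umap-learn 0.4.3). The following is only a rough sketch, not part of PARC: the names `PINNED`, `installed_version` and `find_mismatches` are made up for illustration, and the fallback to `pkg_resources` assumes setuptools is available on Python 3.7.

```python
# Pins taken from the workaround in this thread; adjust if the fix changes.
PINNED = {"numba": "0.49.1", "umap-learn": "0.4.3"}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        # Standard library on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python 3.7 (as used in this thread) lacks importlib.metadata,
        # so fall back to pkg_resources from setuptools.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied.

    `lookup` maps a distribution name to a version string or None; it is a
    parameter so the comparison can be exercised without touching the
    real environment.
    """
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("numba and umap-learn match the pinned versions")
```

Running the script inside the activated environment prints one line per package whose version differs from the pin, which makes it easy to spot a stray numba or umap-learn upgrade pulled in by another install.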

rjesud commented 3 years ago

Thank you, @ShobiStassen! All is well now.

I received a compilation error for "pip install leidenalg==0.7.0", but "pip install leidenalg" worked fine and gave me 0.8.4. I did see your inquiry here: https://github.com/vtraag/leidenalg/issues/35

It seems they fixed the slowdown issue?