scikit-learn-contrib / hdbscan

A high performance implementation of HDBSCAN clustering.
http://hdbscan.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Error while importing hdbscan #15

Open s0j0urn opened 8 years ago

s0j0urn commented 8 years ago

I am getting the following error when I run: import hdbscan

I am on a Windows 7 machine with a 64-bit Python installation, using conda.

Error :

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-12-3f1c79fb1b69> in <module>()
----> 1 import hdbscan

c:\anaconda2\lib\site-packages\hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg\hdbscan\__init__.py in <module>()

c:\anaconda2\lib\site-packages\hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg\hdbscan\hdbscan_.py in <module>()

c:\anaconda2\lib\site-packages\hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg\hdbscan\_hdbscan_linkage.py in <module>()

c:\anaconda2\lib\site-packages\hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg\hdbscan\_hdbscan_linkage.py in __bootstrap__()

ImportError: DLL load failed: %1 is not a valid Win32 application.

lmcinnes commented 8 years ago

I'm a little new to building conda packages; unfortunately it looks like I still have some learning to do. This is a fault of the conda packages I built. If you get your requirements installed via conda (scikit-learn and cython primarily) and then use pip install or install from source things should work out. Thanks for the report. I'll have to look into the conda packaging issues and see if I can understand what's going wrong there.
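One detail in the traceback above supports this: the egg directory is named hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg even though the machine runs Windows, which is consistent with a binary built for another platform having been installed. Below is a minimal sketch, using only the standard library, that reports the running interpreter's OS and pointer width and flags a path whose platform tag points at a different OS. The helper names `interpreter_summary` and `platform_tag_mismatch` are hypothetical and not part of hdbscan or conda, and the tag table covers only a few common tags.

```python
import platform
import struct


def interpreter_summary():
    """Return (operating system name, pointer width in bits) for the running Python."""
    bits = struct.calcsize("P") * 8  # size of a C pointer: 32 or 64
    return platform.system(), bits


def platform_tag_mismatch(path, system):
    """Hypothetical helper: True if `path` carries a platform tag for a different OS.

    Only a handful of common egg/wheel platform tags are recognised (an assumption).
    """
    tags = {
        "macosx": "Darwin",
        "win32": "Windows",
        "win-amd64": "Windows",
        "linux": "Linux",
    }
    lowered = path.lower()
    for tag, tag_system in tags.items():
        if "-" + tag in lowered:
            return tag_system != system
    return False  # no recognisable tag, nothing to report


if __name__ == "__main__":
    system, bits = interpreter_summary()
    print("Running on %s, %d-bit Python" % (system, bits))
    egg = r"c:\anaconda2\lib\site-packages\hdbscan-0.6.2-py2.7-macosx-10.5-x86_64.egg"
    print("Platform tag mismatch:", platform_tag_mismatch(egg, system))
```

If the check reports a mismatch, removing the mismatched package before reinstalling is probably the first thing to try.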

s0j0urn commented 8 years ago

Thank you for such a quick reply.

The install still did not work. I have tried it with both the 32-bit and the 64-bit conda packages. Is there any other way I can get it to work?

In the case of pip install, the error is:

[Anaconda2] C:\Users\user_name>pip install hdbscan
Collecting hdbscan
  Using cached hdbscan-0.6.2.tar.gz
Requirement already satisfied (use --upgrade to upgrade): scikit-learn>=0.16 in
c:\anaconda2\lib\site-packages (from hdbscan)
Requirement already satisfied (use --upgrade to upgrade): cython>=0.17 in c:\ana
conda2\lib\site-packages (from hdbscan)
Building wheels for collected packages: hdbscan
  Running setup.py bdist_wheel for hdbscan
  Complete output from command C:\Anaconda2\python.exe -c "import setuptools;__f
ile__='c:\\users\\user_name\\appdata\\local\\temp\\pip-build-yf7q3s\\hdbscan\\setup.
py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))
" bdist_wheel -d c:\users\user_name\appdata\local\temp\tmpbmg3edpip-wheel-:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-2.7
  creating build\lib.win-amd64-2.7\hdbscan
  copying hdbscan\hdbscan_.py -> build\lib.win-amd64-2.7\hdbscan
  copying hdbscan\plots.py -> build\lib.win-amd64-2.7\hdbscan
  copying hdbscan\robust_single_linkage_.py -> build\lib.win-amd64-2.7\hdbscan
  copying hdbscan\__init__.py -> build\lib.win-amd64-2.7\hdbscan
  running build_ext
  skipping 'hdbscan\_hdbscan_tree.c' Cython extension (up-to-date)
  building 'hdbscan._hdbscan_tree' extension
  creating build\temp.win-amd64-2.7
  creating build\temp.win-amd64-2.7\Release
  creating build\temp.win-amd64-2.7\Release\hdbscan
  C:\Users\user_name\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9
.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IC:\Anaconda2\lib\s
ite-packages\numpy\core\include -IC:\Anaconda2\include -IC:\Anaconda2\PC /Tchdbs
can\_hdbscan_tree.c /Fobuild\temp.win-amd64-2.7\Release\hdbscan\_hdbscan_tree.ob
j
  _hdbscan_tree.c
  c:\anaconda2\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api
.h(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO
_DEPRECATED_API NPY_1_7_API_VERSION
  hdbscan\_hdbscan_tree.c(2560) : warning C4244: 'function' : conversion from '_
_pyx_t_5numpy_intp_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(2787) : warning C4244: '=' : conversion from '__pyx_t_
5numpy_intp_t' to 'int', possible loss of data
  hdbscan\_hdbscan_tree.c(2820) : warning C4244: '+=' : conversion from 'Py_ssiz
e_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(2831) : warning C4244: '+=' : conversion from 'Py_ssiz
e_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(2842) : warning C4244: '+=' : conversion from 'Py_ssiz
e_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(2854) : warning C4244: '+=' : conversion from 'Py_ssiz
e_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(2899) : warning C4244: '=' : conversion from '__pyx_t_
5numpy_intp_t' to 'int', possible loss of data
  hdbscan\_hdbscan_tree.c(2940) : warning C4244: '=' : conversion from '__pyx_t_
5numpy_intp_t' to 'int', possible loss of data
  hdbscan\_hdbscan_tree.c(4122) : warning C4244: 'function' : conversion from '_
_pyx_t_5numpy_intp_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(5242) : warning C4244: 'function' : conversion from '_
_pyx_t_5numpy_intp_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(6278) : warning C4244: 'function' : conversion from '_
_pyx_t_5numpy_intp_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(6351) : warning C4244: 'function' : conversion from '_
_pyx_t_5numpy_intp_t' to 'long', possible loss of data
  hdbscan\_hdbscan_tree.c(21049) : error C2275: 'PyGILState_STATE' : illegal use
 of this type as an expression
          C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_S
TATE'
  hdbscan\_hdbscan_tree.c(21049) : error C2146: syntax error : missing ';' befor
e identifier '__pyx_gilstate_save'
  hdbscan\_hdbscan_tree.c(21049) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(21120) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(21146) : error C2275: 'PyGILState_STATE' : illegal use
 of this type as an expression
          C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_S
TATE'
  hdbscan\_hdbscan_tree.c(21146) : error C2146: syntax error : missing ';' befor
e identifier '__pyx_gilstate_save'
  hdbscan\_hdbscan_tree.c(21146) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(21219) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(21246) : error C2275: 'PyGILState_STATE' : illegal use
 of this type as an expression
          C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_S
TATE'
  hdbscan\_hdbscan_tree.c(21246) : error C2146: syntax error : missing ';' befor
e identifier '__pyx_gilstate_save'
  hdbscan\_hdbscan_tree.c(21246) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(21336) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(22003) : error C2275: 'PyGILState_STATE' : illegal use
 of this type as an expression
          C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_S
TATE'
  hdbscan\_hdbscan_tree.c(22003) : error C2146: syntax error : missing ';' befor
e identifier '__pyx_gilstate_save'
  hdbscan\_hdbscan_tree.c(22003) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  hdbscan\_hdbscan_tree.c(22029) : error C2065: '__pyx_gilstate_save' : undeclar
ed identifier
  error: command 'C:\\Users\\user_name\\AppData\\Local\\Programs\\Common\\Microsoft\
\Visual C++ for Python\\9.0\\VC\\Bin\\amd64\\cl.exe' failed with exit status 2

  ----------------------------------------
  Failed building wheel for hdbscan
Failed to build hdbscan
Installing collected packages: hdbscan
  Running setup.py install for hdbscan
    Complete output from command C:\Anaconda2\python.exe -c "import setuptools,
tokenize;__file__='c:\\users\\user_name\\appdata\\local\\temp\\pip-build-yf7q3s\\hdb
scan\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().re
place('\r\n', '\n'), __file__, 'exec'))" install --record c:\users\user_name\appdata
\local\temp\pip-a6ueto-record\install-record.txt --single-version-externally-man
aged --compile:
    running install
    running build
    running build_py
    running build_ext
    skipping 'hdbscan\_hdbscan_tree.c' Cython extension (up-to-date)
    building 'hdbscan._hdbscan_tree' extension
    C:\Users\user_name\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python
\9.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IC:\Anaconda2\lib
\site-packages\numpy\core\include -IC:\Anaconda2\include -IC:\Anaconda2\PC /Tchd
bscan\_hdbscan_tree.c /Fobuild\temp.win-amd64-2.7\Release\hdbscan\_hdbscan_tree.
obj
    _hdbscan_tree.c
    c:\anaconda2\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_a
pi.h(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_
NO_DEPRECATED_API NPY_1_7_API_VERSION
    hdbscan\_hdbscan_tree.c(2560) : warning C4244: 'function' : conversion from
'__pyx_t_5numpy_intp_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(2787) : warning C4244: '=' : conversion from '__pyx_
t_5numpy_intp_t' to 'int', possible loss of data
    hdbscan\_hdbscan_tree.c(2820) : warning C4244: '+=' : conversion from 'Py_ss
ize_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(2831) : warning C4244: '+=' : conversion from 'Py_ss
ize_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(2842) : warning C4244: '+=' : conversion from 'Py_ss
ize_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(2854) : warning C4244: '+=' : conversion from 'Py_ss
ize_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(2899) : warning C4244: '=' : conversion from '__pyx_
t_5numpy_intp_t' to 'int', possible loss of data
    hdbscan\_hdbscan_tree.c(2940) : warning C4244: '=' : conversion from '__pyx_
t_5numpy_intp_t' to 'int', possible loss of data
    hdbscan\_hdbscan_tree.c(4122) : warning C4244: 'function' : conversion from
'__pyx_t_5numpy_intp_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(5242) : warning C4244: 'function' : conversion from
'__pyx_t_5numpy_intp_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(6278) : warning C4244: 'function' : conversion from
'__pyx_t_5numpy_intp_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(6351) : warning C4244: 'function' : conversion from
'__pyx_t_5numpy_intp_t' to 'long', possible loss of data
    hdbscan\_hdbscan_tree.c(21049) : error C2275: 'PyGILState_STATE' : illegal u
se of this type as an expression
            C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState
_STATE'
    hdbscan\_hdbscan_tree.c(21049) : error C2146: syntax error : missing ';' bef
ore identifier '__pyx_gilstate_save'
    hdbscan\_hdbscan_tree.c(21049) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(21120) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(21146) : error C2275: 'PyGILState_STATE' : illegal u
se of this type as an expression
            C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState
_STATE'
    hdbscan\_hdbscan_tree.c(21146) : error C2146: syntax error : missing ';' bef
ore identifier '__pyx_gilstate_save'
    hdbscan\_hdbscan_tree.c(21146) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(21219) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(21246) : error C2275: 'PyGILState_STATE' : illegal u
se of this type as an expression
            C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState
_STATE'
    hdbscan\_hdbscan_tree.c(21246) : error C2146: syntax error : missing ';' bef
ore identifier '__pyx_gilstate_save'
    hdbscan\_hdbscan_tree.c(21246) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(21336) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(22003) : error C2275: 'PyGILState_STATE' : illegal u
se of this type as an expression
            C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState
_STATE'
    hdbscan\_hdbscan_tree.c(22003) : error C2146: syntax error : missing ';' bef
ore identifier '__pyx_gilstate_save'
    hdbscan\_hdbscan_tree.c(22003) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    hdbscan\_hdbscan_tree.c(22029) : error C2065: '__pyx_gilstate_save' : undecl
ared identifier
    error: command 'C:\\Users\\user_name\\AppData\\Local\\Programs\\Common\\Microsof
t\\Visual C++ for Python\\9.0\\VC\\Bin\\amd64\\cl.exe' failed with exit status 2

    ----------------------------------------
Command "C:\Anaconda2\python.exe -c "import setuptools, tokenize;__file__='c:\\u
sers\\user_name\\appdata\\local\\temp\\pip-build-yf7q3s\\hdbscan\\setup.py';exec(com
pile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __f
ile__, 'exec'))" install --record c:\users\user_name\appdata\local\temp\pip-a6ueto-r
ecord\install-record.txt --single-version-externally-managed --compile" failed w
ith error code 1 in c:\users\user_name\appdata\local\temp\pip-build-yf7q3s\hdbscan

And in the case of a manual install:

F:\downloads\hdbscan-master\hdbscan-master>python setup.py install
running install
running bdist_egg
running egg_info
writing requirements to hdbscan.egg-info\requires.txt
writing hdbscan.egg-info\PKG-INFO
writing top-level names to hdbscan.egg-info\top_level.txt
writing dependency_links to hdbscan.egg-info\dependency_links.txt
reading manifest file 'hdbscan.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'hdbscan.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_py
creating build\lib.win-amd64-2.7
creating build\lib.win-amd64-2.7\hdbscan
copying hdbscan\hdbscan_.py -> build\lib.win-amd64-2.7\hdbscan
copying hdbscan\plots.py -> build\lib.win-amd64-2.7\hdbscan
copying hdbscan\robust_single_linkage_.py -> build\lib.win-amd64-2.7\hdbscan
copying hdbscan\__init__.py -> build\lib.win-amd64-2.7\hdbscan
running build_ext
skipping 'hdbscan\_hdbscan_tree.c' Cython extension (up-to-date)
building 'hdbscan._hdbscan_tree' extension
creating build\temp.win-amd64-2.7
creating build\temp.win-amd64-2.7\Release
creating build\temp.win-amd64-2.7\Release\hdbscan
C:\Users\Intel\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0
\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -IC:\Anaconda2\lib\sit
e-packages\numpy\core\include -IC:\Anaconda2\include -IC:\Anaconda2\PC /Tchdbsca
n\_hdbscan_tree.c /Fobuild\temp.win-amd64-2.7\Release\hdbscan\_hdbscan_tree.obj
_hdbscan_tree.c
c:\anaconda2\lib\site-packages\numpy\core\include\numpy\npy_1_7_deprecated_api.h
(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO_D
EPRECATED_API NPY_1_7_API_VERSION
hdbscan\_hdbscan_tree.c(2560) : warning C4244: 'function' : conversion from '__p
yx_t_5numpy_intp_t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(2787) : warning C4244: '=' : conversion from '__pyx_t_5n
umpy_intp_t' to 'int', possible loss of data
hdbscan\_hdbscan_tree.c(2820) : warning C4244: '+=' : conversion from 'Py_ssize_
t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(2831) : warning C4244: '+=' : conversion from 'Py_ssize_
t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(2842) : warning C4244: '+=' : conversion from 'Py_ssize_
t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(2854) : warning C4244: '+=' : conversion from 'Py_ssize_
t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(2899) : warning C4244: '=' : conversion from '__pyx_t_5n
umpy_intp_t' to 'int', possible loss of data
hdbscan\_hdbscan_tree.c(2940) : warning C4244: '=' : conversion from '__pyx_t_5n
umpy_intp_t' to 'int', possible loss of data
hdbscan\_hdbscan_tree.c(4122) : warning C4244: 'function' : conversion from '__p
yx_t_5numpy_intp_t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(5242) : warning C4244: 'function' : conversion from '__p
yx_t_5numpy_intp_t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(6278) : warning C4244: 'function' : conversion from '__p
yx_t_5numpy_intp_t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(6351) : warning C4244: 'function' : conversion from '__p
yx_t_5numpy_intp_t' to 'long', possible loss of data
hdbscan\_hdbscan_tree.c(21049) : error C2275: 'PyGILState_STATE' : illegal use o
f this type as an expression
        C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_STA
TE'
hdbscan\_hdbscan_tree.c(21049) : error C2146: syntax error : missing ';' before
identifier '__pyx_gilstate_save'
hdbscan\_hdbscan_tree.c(21049) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(21120) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(21146) : error C2275: 'PyGILState_STATE' : illegal use o
f this type as an expression
        C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_STA
TE'
hdbscan\_hdbscan_tree.c(21146) : error C2146: syntax error : missing ';' before
identifier '__pyx_gilstate_save'
hdbscan\_hdbscan_tree.c(21146) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(21219) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(21246) : error C2275: 'PyGILState_STATE' : illegal use o
f this type as an expression
        C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_STA
TE'
hdbscan\_hdbscan_tree.c(21246) : error C2146: syntax error : missing ';' before
identifier '__pyx_gilstate_save'
hdbscan\_hdbscan_tree.c(21246) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(21336) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(22003) : error C2275: 'PyGILState_STATE' : illegal use o
f this type as an expression
        C:\Anaconda2\include\pystate.h(137) : see declaration of 'PyGILState_STA
TE'
hdbscan\_hdbscan_tree.c(22003) : error C2146: syntax error : missing ';' before
identifier '__pyx_gilstate_save'
hdbscan\_hdbscan_tree.c(22003) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
hdbscan\_hdbscan_tree.c(22029) : error C2065: '__pyx_gilstate_save' : undeclared
 identifier
error: command 'C:\\Users\\user_name\\AppData\\Local\\Programs\\Common\\Microsoft\\V
isual C++ for Python\\9.0\\VC\\Bin\\amd64\\cl.exe' failed with exit status 2
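The build logs above are long, and almost all of the lines are repeated diagnostics. Error C2275 followed by C2146 and C2065 on the same source line is the pattern the Visual C++ 9.0 C compiler typically produces when a variable declaration appears after a statement inside a block, which C89 does not allow; whether that is the actual cause here is an assumption. Below is a rough sketch for condensing such a log into counts per diagnostic code. The function name `summarize_msvc_log` is hypothetical, and the regular expression assumes the `file(line) : error Cxxxx` layout seen in this thread.

```python
import re
from collections import Counter

# Matches MSVC diagnostics such as "foo.c(21049) : error C2275: ..."
DIAGNOSTIC = re.compile(r"\((\d+)\) : (error|warning) (C\d{4})")


def summarize_msvc_log(log_text):
    """Count diagnostics by (severity, code) and list the source lines that raised errors."""
    counts = Counter()
    error_lines = set()
    for match in DIAGNOSTIC.finditer(log_text):
        line_no, severity, code = match.groups()
        counts[(severity, code)] += 1
        if severity == "error":
            error_lines.add(int(line_no))
    return counts, sorted(error_lines)


if __name__ == "__main__":
    sample = (
        "hdbscan\\_hdbscan_tree.c(2560) : warning C4244: 'function' : conversion\n"
        "hdbscan\\_hdbscan_tree.c(21049) : error C2275: 'PyGILState_STATE' : illegal use\n"
        "hdbscan\\_hdbscan_tree.c(21049) : error C2146: syntax error : missing ';'\n"
    )
    counts, lines = summarize_msvc_log(sample)
    for (severity, code), n in sorted(counts.items()):
        print(severity, code, n)
    print("source lines with errors:", lines)
```

Run over the full log, this should make it easier to see that the failures cluster on a handful of source lines in _hdbscan_tree.c.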

lmcinnes commented 8 years ago

That looks bad. I'm really not quite sure what has gone wrong here -- I have little experience with installing on Windows. I recommend using the manual install and deleting all the .c files in the hdbscan directory, i.e.

hdbscan/_hdbscan_tree.c hdbscan/_hdbscan_boruvka.c hdbscan/_hdbscan_linkage.c hdbscan/_hdbscan_reachability.c

And then try building again; there seems to be something with the C files that Windows is not liking.
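The clean-up suggested above can be sketched in a few lines of Python. This assumes that every generated .c file sits next to a .pyx file of the same name (an assumption about the source layout, not something confirmed in this thread) and that Cython is installed so the files can be regenerated. The function name `remove_generated_c_files` is hypothetical.

```python
import os


def remove_generated_c_files(package_dir, dry_run=True):
    """Delete .c files that sit next to a .pyx file of the same name.

    With dry_run=True (the default) nothing is deleted; candidates are only listed.
    """
    removed = []
    for name in sorted(os.listdir(package_dir)):
        base, ext = os.path.splitext(name)
        if ext != ".c":
            continue
        if not os.path.exists(os.path.join(package_dir, base + ".pyx")):
            continue  # no Cython source to regenerate from, so leave it alone
        path = os.path.join(package_dir, name)
        if not dry_run:
            os.remove(path)
        removed.append(path)
    return removed


if __name__ == "__main__":
    # Run from the root of the source checkout; prints what would be deleted.
    if os.path.isdir("hdbscan"):
        for path in remove_generated_c_files("hdbscan"):
            print("would remove", path)
```

After a real run with dry_run=False, running python setup.py install again should make Cython regenerate the C sources, provided Cython is available.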

s0j0urn commented 8 years ago

I tried on Ubuntu, and it worked fine there.

lmcinnes commented 8 years ago

I'm glad you managed to get it working. Sorry about the issues on Windows (Windows seems to be my largest source of problems). I'll leave the issue open for now; some of the other people on the team have access to Windows systems they can test on, and we'll try to get the Windows problems resolved. Thanks for the issue report!

h-krishna commented 8 years ago

lmcinnes, unfortunately I am getting the same error messages as above in a Win7 64-bit environment.

Regards,
Hari

lmcinnes commented 8 years ago

At this point I have to ask for help from anyone with Windows experience/knowledge. It seems that everything works fine on Linux and MacOS X (the environments I have access to), and at least sometimes works on Windows ... any help from anyone with experience compiling on Windows would be greatly appreciated!

lmcinnes commented 8 years ago

As a side note: is this a recent pip install that is failing? I attempted to add a wheel package for win64 just the other day, supplied by a helpful individual -- perhaps that has caused the problem?

h-krishna commented 8 years ago

It is indeed pip. But I think the problem was due to numpy, which required Cython for versions 1.10 and above. It installed fine once Cython was installed. Thanks for the quick reply.

tejas-schdv commented 7 years ago

Can confirm: I had to install Cython first and then install hdbscan on Ubuntu 16.04.
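Since the fix reported in the last two comments was installing Cython before hdbscan, a quick pre-flight check may save a failed build. Below is a minimal sketch for Python 3, assuming the build needs Cython, numpy and scikit-learn to be importable. The list of names is taken from the requirements and include paths shown in the logs above, and the function name `missing_build_requirements` is hypothetical.

```python
import importlib.util


def missing_build_requirements(names=("Cython", "numpy", "sklearn")):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_build_requirements()
    if missing:
        print("Install these before hdbscan:", ", ".join(missing))
    else:
        print("Build requirements look present.")
```

This only checks that the packages can be found, not that their versions satisfy what setup.py asks for.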