scikit-learn-contrib / hdbscan

A high performance implementation of HDBSCAN clustering.
http://hdbscan.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

AttributeError: flags not found #220

Open moinnadeem opened 6 years ago

moinnadeem commented 6 years ago

When running hdbscan with prims_kdtree, I get the following traceback. Please note my matrix is sparse.

Traceback (most recent call last):
  File "Clustering - West Coast - Teachers.py", line 125, in <module>
    df['clusters'] = model.fit_predict(X_test)
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/hdbscan/hdbscan_.py", line 873, in fit_predict
    self.fit(X)
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/hdbscan/hdbscan_.py", line 851, in fit
    self._min_spanning_tree) = hdbscan(X, **kwargs)
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/hdbscan/hdbscan_.py", line 518, in hdbscan
    gen_min_span_tree, **kwargs)
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/sklearn/externals/joblib/memory.py", line 362, in __call__
    return self.func(*args, **kwargs)
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/hdbscan/hdbscan_.py", line 178, in _hdbscan_prims_kdtree
    if not X.flags['C_CONTIGUOUS']:
  File "/home/moinnadeem/.virtualenvs/quizlet/lib/python2.7/site-packages/scipy/sparse/base.py", line 686, in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: flags not found

lmcinnes commented 6 years ago

Unfortunately, sparse matrices that aren't distance matrices are only supported by the generic algorithm. Sorry. You could try dimension reduction (PCA/TruncatedSVD might be fine) down to a dense matrix of manageable dimension (50-100), or UMAP down to a small dimension (5-10). Otherwise, I'm afraid not much is available without a whole new codepath and sparse distance functions being written.
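
As a rough sketch of the dimension-reduction route: this assumes scikit-learn's TruncatedSVD, which accepts scipy sparse input, and uses made-up random data in place of the real matrix. The choice of 50 components is only the low end of the range suggested above, and the hdbscan call is left commented out so the snippet runs without hdbscan installed.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Stand-in data (assumption): 500 samples with 2000 sparse features.
X_sparse = sparse.random(500, 2000, density=0.01, format="csr", random_state=0)

# TruncatedSVD works directly on scipy sparse matrices and returns a dense array.
svd = TruncatedSVD(n_components=50, random_state=0)
X_dense = np.ascontiguousarray(svd.fit_transform(X_sparse))

# X_dense is an ordinary ndarray, so it has the .flags attribute that the
# prims_kdtree code path in the traceback tries to read.
print(X_dense.shape, X_dense.flags['C_CONTIGUOUS'])

# Clustering step on the reduced data (not run here):
# import hdbscan
# labels = hdbscan.HDBSCAN(algorithm='prims_kdtree').fit_predict(X_dense)
```

Whether 50 components preserve enough structure for clustering depends on the data, so the number is worth tuning.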

In the meantime I have added a check so you at least get an informative error message in this case. Thanks for the report! Sorry I couldn't be more help.
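
For anyone who wants a similar guard in their own code, a minimal sketch follows. It is not the check that was added to the library: `require_dense` is a hypothetical helper and the error wording is made up. It only assumes `scipy.sparse.issparse` to detect sparse input before a tree-based algorithm is chosen.

```python
import numpy as np
from scipy import sparse

def require_dense(X, algorithm):
    """Hypothetical helper: fail early if a non-generic algorithm gets sparse input."""
    if sparse.issparse(X) and algorithm != 'generic':
        raise ValueError(
            "Sparse data matrices only support algorithm='generic'; "
            "got algorithm=%r" % algorithm
        )
    return X

# Stand-in sparse input (assumption), small enough to run instantly.
X_sparse = sparse.random(10, 20, density=0.2, format="csr", random_state=0)

message = None
try:
    require_dense(X_sparse, 'prims_kdtree')
except ValueError as exc:
    # A readable error instead of "AttributeError: flags not found".
    message = str(exc)
    print(message)
```

Calling the helper before `fit_predict` surfaces the problem at the call site rather than deep inside the library.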