Using commit fbe0fe6b4be4caf49d7ae5eaf44f2ea47e8be7aa, hdbscan fails when cluster_selection_method is leaf and min_cluster_size or min_samples is set to some (ridiculous) values:
Traceback (most recent call last):
  File "HDBSCAN.py", line 79, in Train
    trained_model = model.fit(x)
  File "build/bdist.linux-x86_64/egg/hdbscan/hdbscan_.py", line 864, in fit
  File "build/bdist.linux-x86_64/egg/hdbscan/hdbscan_.py", line 613, in hdbscan
  File "build/bdist.linux-x86_64/egg/hdbscan/hdbscan_.py", line 110, in _tree_to_labels
  File "hdbscan/_hdbscan_tree.pyx", line 610, in hdbscan._hdbscan_tree.get_clusters (hdbscan/_hdbscan_tree.c:11757)
  File "hdbscan/_hdbscan_tree.pyx", line 691, in hdbscan._hdbscan_tree.get_clusters (hdbscan/_hdbscan_tree.c:11205)
  File "hdbscan/_hdbscan_tree.pyx", line 607, in hdbscan._hdbscan_tree.get_cluster_tree_leaves (hdbscan/_hdbscan_tree.c:10449)
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

Program exited with code 1
The error does not occur with cluster_selection_method eom.
This error is reproducible with the iris dataset and the following parameter settings of hdbscan:
min_cluster_size = 70, min_samples = 500, metric = cityblock, alpha = 0.1, p = 1, algorithm = best, leaf_size = 4, approx_min_span_tree = True, gen_min_span_tree = True, cluster_selection_method = leaf, allow_single_cluster = True, match_reference_implementation = False
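The settings above can be turned into a short reproduction script. This is only a sketch: it assumes the iris data is loaded through scikit-learn's load_iris (the report does not say how it was loaded), and the names PARAMS and reproduce are made up for illustration.

```python
# Parameter settings copied verbatim from the report above.
PARAMS = dict(
    min_cluster_size=70,
    min_samples=500,
    metric="cityblock",
    alpha=0.1,
    p=1,
    algorithm="best",
    leaf_size=4,
    approx_min_span_tree=True,
    gen_min_span_tree=True,
    cluster_selection_method="leaf",
    allow_single_cluster=True,
    match_reference_implementation=False,
)


def reproduce(params=PARAMS):
    """Fit HDBSCAN on iris with the reported settings (hypothetical helper)."""
    # Imported inside the function so the parameter dict can be inspected
    # even where hdbscan or scikit-learn is not installed.
    import hdbscan
    from sklearn.datasets import load_iris

    x = load_iris().data  # assumption: iris loaded via scikit-learn
    model = hdbscan.HDBSCAN(**params)
    return model.fit(x)
```

On the commit named above, calling reproduce() is expected to end in the ValueError shown in the traceback. Other versions may behave differently, for example by rejecting the parameters earlier.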
I am amazed that didn't fail earlier in the process to be honest -- were you fuzzing the implementation or something? I'll see if I can track down what the right way to handle this is.
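For context, the last line of the traceback is the error NumPy raises when a minimum is taken over an empty array. The sketch below shows that behaviour in isolation and one possible guard. It is not taken from the hdbscan source, and safe_min is a hypothetical helper, not a proposed fix.

```python
import numpy as np


def safe_min(values, default=None):
    """Return the minimum of values, or default when there are none."""
    arr = np.asarray(values)
    if arr.size == 0:
        # An empty array has no minimum; return the fallback instead of raising.
        return default
    return arr.min()


# Taking the minimum of an empty array raises the ValueError from the traceback.
try:
    np.array([], dtype=np.intp).min()
except ValueError as exc:
    message = str(exc)
```

Whether a default value, an early return, or an explicit parameter check is the right behaviour for the leaf selection code is a separate question that this sketch does not answer.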