mrkschan / py-dbscan

dbscan for clustering written in py
http://mrkschan.blogspot.com/2010/04/given-sequence-of-increasing-values.html

First instance in cluster output is always classified as being noise #1

Open anfractuosity opened 9 years ago

anfractuosity commented 9 years ago

Hiya,

I've found a bug in your DBSCAN algorithm, where the first instance in a cluster output is always classed as NOISE.

I generated a dataset like so:

for z in range(0, 5):
    r = [1, 2, 3, 4, 5, 6, 7, 8]
    dataset.append(dbscan.data(r, 10))
i = i + 1  # counter from the surrounding code, which is not shown here

I found that the first instance output for each cluster always had a .label of data.NOISE.

cheers

Chris

mrkschan commented 9 years ago

What values did you use for the radius (KDist) and minPt (K)?

You need to have "enough" neighbours to form a cluster. See https://github.com/mrkschan/py-dbscan/blob/master/dbscan.py#L84
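
To illustrate the neighbour requirement, here is a minimal sketch of the usual DBSCAN density check. It is not the code in dbscan.py: the names euclidean, neighbours, is_core_point, eps and min_pts are hypothetical. It also assumes Euclidean distance and that a point counts as its own neighbour, which may differ from this repository.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbours(points, index, eps):
    """Indices of all points within eps of points[index], itself included."""
    return [j for j, p in enumerate(points)
            if euclidean(points[index], p) <= eps]


def is_core_point(points, index, eps, min_pts):
    """True if points[index] has at least min_pts points within eps."""
    return len(neighbours(points, index, eps)) >= min_pts


# Five identical instances, mirroring the dataset in the report
# (eps and min_pts values are assumptions chosen for illustration).
points = [[1, 2, 3, 4, 5, 6, 7, 8] for _ in range(5)]

print(is_core_point(points, 0, eps=1.0, min_pts=5))  # True: all 5 points are at distance 0
print(is_core_point(points, 0, eps=1.0, min_pts=6))  # False: only 5 points exist
```

With identical instances every point has the same neighbourhood, so under this check they are either all dense enough or none are. Which case applies depends on the radius and minPt values asked about above.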