elki-project / elki

ELKI Data Mining Toolkit
https://elki-project.github.io/
GNU Affero General Public License v3.0

`ClusterOrder` vs `Clustering<OPTICSModel>` in OPTICS #106

Closed by DiTo97 1 year ago

DiTo97 commented 1 year ago

Why do the abstract OPTICS base class (`AbstractOPTICS`) and most OPTICS implementations in the library (e.g., `FastOPTICS`) return a `ClusterOrder` object, whereas the `OPTICSXi` implementation returns a `Clustering<OPTICSModel>` object?

IIUC, the former is just the reachability ordering that OPTICS generates, whereas the latter applies the Xi algorithm and predecessor correction to extract clusters (and noise) from said ordering. Then, how can I compare the two output modalities?

N.B.: When I say compare, I don't mean an algorithm to compare two clustering outputs (e.g., the Rand index), but how to programmatically compare the two output modalities given their different object types in the code.

kno10 commented 1 year ago

You can use OPTICSXi with FastOPTICS, too.

`-algorithm clustering.optics.OPTICSXi -opticsxi.algorithm FastOPTICS`

So to compare the results, apply the Xi extraction either to both or to neither.
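To illustrate why both results need the same type before they can be compared, here is a minimal, self-contained sketch that does not use the ELKI API. It models a cluster order as two parallel arrays and cuts it at a fixed reachability threshold, which is a much simpler stand-in for the Xi extraction, not the Xi algorithm itself. The names `ClusterOrderSketch` and `extractByThreshold`, and the sample data, are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the ELKI API: a cluster order is modeled as two
// parallel arrays (point ids in visiting order and their reachabilities).
class ClusterOrderSketch {
  // Cut the reachability plot at a fixed threshold (a simple stand-in for
  // the Xi extraction). Returns clusters as lists of point ids; groups of
  // a single point are treated as noise and dropped (a simplification).
  static List<List<Integer>> extractByThreshold(int[] ids, double[] reach, double eps) {
    List<List<Integer>> clusters = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    for (int i = 0; i < ids.length; i++) {
      // A reachability above eps means the point is not reachable from the
      // points visited before it, so it starts a new candidate cluster.
      if (reach[i] > eps) {
        if (current.size() > 1) {
          clusters.add(current);
        }
        current = new ArrayList<>();
      }
      current.add(ids[i]);
    }
    if (current.size() > 1) {
      clusters.add(current);
    }
    return clusters;
  }

  public static void main(String[] args) {
    double inf = Double.POSITIVE_INFINITY;
    // Two made-up cluster orders over the same six points.
    int[] idsA = {0, 1, 2, 3, 4, 5};
    double[] reachA = {inf, 0.1, 0.2, 0.9, 0.1, 0.1};
    int[] idsB = {0, 2, 1, 3, 5, 4};
    double[] reachB = {inf, 0.2, 0.1, 0.9, 0.1, 0.1};
    // After the same extraction step, both results are plain clusterings
    // and can be compared as sets of point ids.
    System.out.println(extractByThreshold(idsA, reachA, 0.5));
    System.out.println(extractByThreshold(idsB, reachB, 0.5));
  }
}
```

In this toy data the two orders visit the points in a different sequence, yet the extracted groups contain the same ids. The same idea applies in ELKI: run the same extraction on top of both OPTICS variants, then compare the resulting clusterings.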

`OPTICSHeap` and `OPTICSList` are two implementations of the standard OPTICS algorithm that use different data structures (heap vs. list). Their cluster orders may differ when several points have equal reachability distances.
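A small illustration of why ties matter, assuming nothing about how ELKI's heap or list code actually works: when two candidates have the same reachability, which one is visited first depends on how the tie is broken. The sketch below orders the same candidates with two hypothetical tie-breaking rules. Sorting a fixed candidate set is a simplification of the real seed-list processing, and `TieBreakSketch` and `order` are made-up names.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch, NOT ELKI's OPTICSHeap/OPTICSList code.
class TieBreakSketch {
  // Order candidate indices by reachability; equal reachabilities are
  // resolved by the supplied tie-breaking rule.
  static Integer[] order(final double[] reach, Comparator<Integer> tieBreak) {
    Integer[] ids = new Integer[reach.length];
    for (int i = 0; i < ids.length; i++) {
      ids[i] = i;
    }
    Comparator<Integer> byReach = (a, b) -> Double.compare(reach[a], reach[b]);
    Arrays.sort(ids, byReach.thenComparing(tieBreak));
    return ids;
  }

  public static void main(String[] args) {
    // Candidates 1 and 2 tie at reachability 0.2.
    double[] reach = {0.3, 0.2, 0.2, 0.5};
    // Rule 1: the lower index wins a tie.
    System.out.println(Arrays.toString(order(reach, Comparator.<Integer>naturalOrder())));
    // Rule 2: the higher index wins a tie.
    System.out.println(Arrays.toString(order(reach, Comparator.<Integer>reverseOrder())));
  }
}
```

Both orders are valid with respect to the reachability values; they differ only in the position of the tied candidates, which is the kind of difference to expect between two otherwise equivalent implementations.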

DiTo97 commented 1 year ago

Thank you @kno10,

This is exactly the kind of information that I was looking for. I am closing the issue!