Closed bdelespierre closed 2 years ago
Are you accepting the distance as a callback function to give the user more flexibility in injecting the distance function?
If it is modified as suggested, will setDistanceFunction be added to AlgorithmInterface?
Are you accepting the distance as a callback function to give the user more flexibility in injecting the distance function?
Absolutely. So the client code can change just that and keep the rest in place. Otherwise, subtyping Algorithm class would be necessary (which is "technically" better but not very user-friendly IMMHO).
If it is modified as suggested, will setDistanceFunction be added to AlgorithmInterface?
Yes, my bad I forgot it. Here it is:
interface AlgorithmInterface
{
public function clusterize(PointCollectionInterface $points, int $nbClusters): ClusterCollectionInterface;
public function setDistanceFunction(callable $function): void;
}
An immediate candidate for this would be hamming dist :+1:
As it turns out, specifying a distance function is not enough: one also needs to provide a centroid determination function as well.
Such an implementation can be seen in GPS\Algorithm implementation, also showing the evolution of the AlgorithmInterface that reflects that necessity to implement both methods (getDistanceBetween and findCentroid.)
Furthermore, I can't think of a valid use-case of clustering points outside both GPS and Euclidean space :thinking: But nonetheless it will be now possible through subclassing of Kmeans\Algorithm.
Hence, this issue is now irrelevant.
Proposed change