mariaderrico / DPA

The DPA package is the scikit-learn compatible implementation of the Density Peaks Advanced clustering algorithm. The algorithm provides robust and visual information about the clusters, their statistical reliability and their hierarchical organization.
BSD 3-Clause "New" or "Revised" License

Automatically set initial parameters for consecutive runs of the DPA estimator #3

mariaderrico closed this issue 3 years ago

mariaderrico commented 3 years ago

When running the DPA clustering for different values of the parameter Z, all attributes are recomputed on every run, such as the indices and distances of the nearest neighbors, which makes the analysis computationally inefficient. Performance would improve if consecutive runs could pass the already computed attributes as values for the corresponding __init__ parameters and change only Z, the parameter of interest.

The following example shows how to manually set each __init__ parameter to the corresponding attribute computed in the latest run of the algorithm, using the scikit-learn set_params method:

# Initialization and first run
est = DPA.DensityPeakAdvanced(Z=1.5)
est.fit(data)
# Set each __init__ parameter to the value computed by the previous fit,
# e.g. reuse the nearest-neighbor indices stored in nn_indices_
est.set_params(nn_indices=est.nn_indices_)
# Repeat set_params for all the other computed attributes
[...]
# Set the new value of the parameter of interest Z
est.set_params(Z=1)
# Run the DPA clustering again
est.fit(data)

It would be useful if the estimator set those __init__ parameters automatically from the previous fit, so that only the parameter Z has to be changed between runs.
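As a rough sketch of what such automation could look like from the user side, the helper below copies every fitted attribute named `<param>_` into the __init__ parameter `<param>`, relying only on the scikit-learn get_params/set_params convention. The names reuse_fitted_params and ToyEstimator are hypothetical and not part of the DPA package; ToyEstimator is a stand-in that mimics the parameter API so the example runs without DPA installed. It also assumes that an estimator skips a computation when the matching __init__ parameter is already provided.

```python
def reuse_fitted_params(est, skip=("Z",)):
    """Copy each fitted attribute `<name>_` into the __init__ parameter `<name>`.

    Hypothetical helper: it relies only on get_params/set_params and on the
    trailing-underscore naming convention for fitted attributes.
    """
    updates = {}
    for name in est.get_params(deep=False):
        if name in skip:
            continue  # leave the parameter of interest untouched
        fitted = getattr(est, name + "_", None)
        if fitted is not None:
            updates[name] = fitted
    est.set_params(**updates)
    return est


class ToyEstimator:
    """Hypothetical stand-in for an estimator; NOT the DPA implementation."""

    def __init__(self, Z=1.0, nn_indices=None):
        self.Z = Z
        self.nn_indices = nn_indices
        self.n_searches = 0  # counts how often the costly step runs

    def get_params(self, deep=True):
        return {"Z": self.Z, "nn_indices": self.nn_indices}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, data):
        if self.nn_indices is None:
            # Pretend this is the expensive neighbor search
            self.n_searches += 1
            self.nn_indices_ = [[i] for i in range(len(data))]
        else:
            # Reuse the indices passed in through __init__/set_params
            self.nn_indices_ = self.nn_indices
        return self


data = [[0.0], [1.0], [2.0]]
est = ToyEstimator(Z=1.5).fit(data)
for Z in (1.0, 0.5):
    # Carry over the computed attributes, then change only Z
    reuse_fitted_params(est).set_params(Z=Z)
    est.fit(data)
print(est.n_searches)  # the costly step ran only during the first fit
```

With a real estimator the same helper would be called between fits in place of the manual set_params calls, provided its fitted attributes follow the `<param>_` naming used by nn_indices and nn_indices_.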