The ball k-means algorithm is described in detail in https://ieeexplore.ieee.org/document/9139397.
The C++ implementation of the ball k-means algorithm can be found in the "C++Version" file.
The Python implementation of the ball k-means algorithm can be found in the "PythonVersion" file.
All data used in the paper is in the compressed file "data+centers(1).zip".
The implementations of the ball k-means algorithm are "ball_k_means_Xf.cpp"/"ball_k_means_Xf.py" and "ball_k_means_Xd.cpp"/"ball_k_means_Xd.py", which are the "float" and "double" versions respectively.
The parameter "isRing" switches between the ring version and the no ring version of the algorithm.
In our experience, the "Xd" version gives more accurate results but runs slightly slower than "Xf"; the "Xf" version has the fastest running time, but its lower precision may reduce accuracy on data with many decimal places.
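The precision trade-off between the "float" and "double" versions can be illustrated with a minimal sketch. It uses only the Python standard library and is not taken from the repository; the helper name `to_float32` and the sample value are hypothetical, chosen only to show how a value with many decimal places is rounded when stored in 32 bits.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    # Packing as "f" stores the value in 4 bytes (single precision);
    # unpacking returns it as a Python float again, now carrying the rounding.
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical coordinate with many decimal places (an assumption for illustration).
value = 123.456789012345
as_single = to_float32(value)
error = abs(value - as_single)

print(f"double: {value!r}")
print(f"float : {as_single!r}")
print(f"absolute rounding error: {error:.3e}")
```

Rounding errors of this kind affect every coordinate, so they can change distance comparisons between points that lie almost equally far from two centers. This is one plausible reason the "Xf" results may differ from the "Xd" results on such data.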
C++ compiler supporting C++11
Linux operating system or Windows operating system
Eigen 3 template library
BLAS implementation, we recommend this one: http://www.openblas.net/
Intel MKL implementation, we recommend this one: https://software.intel.com/en-us/mkl
Eigen 3: To use Eigen, you only need to download and extract Eigen's source code: http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
ball_k_means_noRingVersion.cpp and ball_k_means_RingVersion.cpp can both be executed directly; they only need the Eigen library to be included.
dataset: the data to be clustered, as an Eigen matrix.
centroids: the initial center points, as an Eigen matrix.
isRing: bool, optional. Switches between the ring version and the no ring version of the algorithm. "true" selects the ring version and "false" selects the no ring version. The default is false.
detail: bool, optional. "true" outputs detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" outputs no detailed information. The default is false.
isRing: bool, optional. Switches between the ring version and the no ring version of the algorithm. "true" selects the ring version and "false" selects the no ring version. The default is false.
detail: bool, optional. "true" outputs detailed information (including the k value, the number of distance calculations, the running time, etc.); "false" outputs no detailed information. The default is false.
dataset: absolute path of the csv file containing the data to be clustered.
centroids: absolute path of the csv file containing the initial center points.
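Before passing the two csv paths to the algorithm, it can help to check that the files are readable and consistent. The following is a minimal sketch, not part of the repository: the helper names `load_points` and `check_inputs` are hypothetical, and the file layout (one point per row, comma-separated numeric columns, no header row) is an assumption that should be checked against the files in "data+centers(1).zip".

```python
import csv
import os


def load_points(path):
    """Read a headerless numeric csv file into a list of rows of floats."""
    if not os.path.isabs(path):
        raise ValueError(f"expected an absolute path, got: {path}")
    with open(path, newline="") as handle:
        # Skip completely empty lines; every other cell must parse as a number.
        rows = [[float(cell) for cell in row] for row in csv.reader(handle) if row]
    if not rows:
        raise ValueError(f"no data rows in {path}")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError(f"rows in {path} have inconsistent lengths")
    return rows


def check_inputs(dataset_path, centroids_path):
    """Load both files and verify that points and centers share a dimension."""
    points = load_points(dataset_path)
    centers = load_points(centroids_path)
    if len(points[0]) != len(centers[0]):
        raise ValueError("dataset and centroids must have the same number of columns")
    return points, centers
```

A call such as `check_inputs("/abs/path/data.csv", "/abs/path/centers.csv")` (both paths are placeholders) returns the parsed points and centers, or raises `ValueError` when a path is relative, a file is empty, or the dimensions do not match.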