GAA-UAM / scikit-fda

Functional Data Analysis Python package
https://fda.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Why does calling `FuzzyCMeans.predict_proba()` alter the centroids of the fitted model? #626

Open vnmabus opened 3 months ago

vnmabus commented 3 months ago

Discussed in https://github.com/GAA-UAM/scikit-fda/discussions/625

Originally posted by **ensley-nexant** August 5, 2024

I'm trying to understand why I can call `.predict_proba()` multiple times on the same fitted model, passing the same function, and get different predictions. Is this the intended behavior? It looks like the centroids are being [updated](https://github.com/GAA-UAM/scikit-fda/blob/64235be4ef41cc15c58f34fb6d6a16c9d7aafd75/skfda/ml/clustering/_kmeans.py#L848) on every prediction, and I'm not sure why that would be the case.

Here is an example:

```python
from skfda import datasets
from skfda.ml.clustering import FuzzyCMeans

X, y = datasets.fetch_weather(return_X_y=True, as_frame=True)
fd = X.iloc[:, 0].values.coordinates[0]

kmeans = FuzzyCMeans(n_clusters=4, random_state=2)
kmeans.fit(fd)

print("Initial centroids")
print(kmeans.cluster_centers_.data_matrix[0][:4])

for i in range(1, 5):
    print("\nPrediction", i)
    print(kmeans.predict_proba(fd[1])[0])
    print("Centroids (head):")
    print(kmeans.cluster_centers_.data_matrix[0][:4])
```

Output:

```
Initial centroids
[[-7.48331886]
 [-7.69572714]
 [-8.33347371]
 [-8.2457692 ]]

Prediction 1
[0.81649987 0.04951254 0.12522585 0.00876174]
Centroids (head):
[[-4.4]
 [-4.2]
 [-5.3]
 [-5.4]]

Prediction 2
[0.2 0.2 0.2 0.4]
Centroids (head):
[[-4.4]
 [-4.2]
 [-5.3]
 [-5.4]]

Prediction 3
[0.28571429 0.28571429 0.28571429 0.14285714]
Centroids (head):
[[-4.4]
 [-4.2]
 [-5.3]
 [-5.4]]

Prediction 4
[0.25 0.25 0.25 0.25]
Centroids (head):
[[-4.4]
 [-4.2]
 [-5.3]
 [-5.4]]
```
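
For comparison, here is a minimal sketch of the behavior the report seems to expect: prediction reads the fitted centroids and leaves them unchanged, so repeated calls return identical memberships. It is a toy written in plain Python and is not the scikit-fda implementation. `ToyFuzzyCMeans`, `memberships`, and `check_predict_is_pure` are hypothetical names introduced only for illustration, and the membership formula is the standard fuzzy c-means one with Euclidean distance, which is an assumption about what the real estimator is meant to compute.

```python
import copy
import math


def memberships(point, centroids, fuzzifier=2.0):
    """Fuzzy c-means membership of one point, for fixed centroids."""
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs only to those.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    # u_j = d_j^(-2/(m-1)) / sum_k d_k^(-2/(m-1))
    exponent = -2.0 / (fuzzifier - 1.0)
    weights = [d ** exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


class ToyFuzzyCMeans:
    """Hypothetical stand-in for a fitted model (not the scikit-fda class)."""

    def __init__(self, centroids, fuzzifier=2.0):
        self.cluster_centers_ = [list(c) for c in centroids]
        self.fuzzifier = fuzzifier

    def predict_proba(self, points):
        # Only reads the fitted centroids; nothing is reassigned here.
        return [
            memberships(p, self.cluster_centers_, self.fuzzifier)
            for p in points
        ]


def check_predict_is_pure(model, points, calls=4):
    """Return (centroids unchanged?, all predictions identical?)."""
    before = copy.deepcopy(model.cluster_centers_)
    results = [model.predict_proba(points) for _ in range(calls)]
    same_centroids = model.cluster_centers_ == before
    same_results = all(r == results[0] for r in results)
    return same_centroids, same_results


model = ToyFuzzyCMeans([[0.0, 0.0], [4.0, 0.0]])
print(model.predict_proba([[1.0, 0.0]]))
print(check_predict_is_pure(model, [[1.0, 0.0]]))
```

The same before/after comparison could in principle be applied to the real estimator by snapshotting `cluster_centers_.data_matrix` before calling `predict_proba()`. In the output above, that comparison would already fail after the first prediction.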