soda-inria / sklearn-numba-dpex

Experimental plugin for scikit-learn to be able to run (some estimators) on Intel GPUs via numba-dpex.
BSD 3-Clause "New" or "Revised" License

`test_predict_kmeans` sklearn test can sometimes fail because of non-deterministic cluster relocation #97

Open fcharras opened 1 year ago

fcharras commented 1 year ago

Our cluster relocation function relies on a parallel argpartition function that doesn't use the same tie-breaking strategy as np.argpartition and, on top of that, breaks ties in a non-deterministic way.

This means that two consecutive KMeans.fit calls run with the sklearn_numba_dpex engine and the same seed are not guaranteed to converge to the same list of centroids, only to the same list of centroids up to a permutation. This is not user-friendly.
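To make "the same up to a permutation" concrete, here is a minimal NumPy sketch of how two fitted `cluster_centers_` arrays could be compared. The helper name `same_centers_up_to_permutation` and the `atol` default are hypothetical, not part of sklearn or sklearn_numba_dpex, and the lexicographic sort assumes the centers are distinct enough that floating-point noise does not change their order.

```python
import numpy as np


def same_centers_up_to_permutation(a, b, atol=1e-8):
    """Return True if the rows of `a` and `b` match after reordering.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        return False
    # np.lexsort uses the LAST key as the primary key, so the feature
    # columns are reversed to sort rows by feature 0 first, then 1, ...
    a_sorted = a[np.lexsort(a.T[::-1])]
    b_sorted = b[np.lexsort(b.T[::-1])]
    return bool(np.allclose(a_sorted, b_sorted, atol=atol))


# Two runs that found the same centroids but in a different order.
run_1 = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
run_2 = np.array([[4.0, 5.0], [0.0, 1.0], [2.0, 3.0]])

print(np.array_equal(run_1, run_2))                  # strict comparison
print(same_centers_up_to_permutation(run_1, run_2))  # order-insensitive
```

A test that compares centers or labels strictly across two fits would see the first kind of comparison fail even though the clustering itself is identical.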

This can (rarely) cause sklearn test_predict_kmeans to fail.

This seems like a solid argument for paying the cost of some extra synchronization in our argpartition kernels, to at least ensure a deterministic tie-breaking strategy?

Or maybe sort the cluster centers in a deterministic way after the fit?
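A rough sketch of what the second option could look like, in plain NumPy rather than in the actual engine code. The function name `canonicalize_centers` is hypothetical, and sorting lexicographically by feature values is only one possible choice of deterministic order.

```python
import numpy as np


def canonicalize_centers(centers, labels):
    """Reorder centers lexicographically and remap labels to match.

    Hypothetical post-fit step for illustration only.
    """
    centers = np.asarray(centers)
    labels = np.asarray(labels)
    # np.lexsort uses the LAST key as the primary key, so the feature
    # columns are reversed to sort rows by feature 0 first, then 1, ...
    order = np.lexsort(centers.T[::-1])
    # inverse[old_index] gives the new index of that center.
    inverse = np.empty_like(order)
    inverse[order] = np.arange(order.size)
    return centers[order], inverse[labels]


centers = np.array([[2.0, 0.0], [1.0, 5.0], [1.0, 3.0]])
labels = np.array([0, 1, 2, 0])

sorted_centers, new_labels = canonicalize_centers(centers, labels)
print(sorted_centers)
print(new_labels)
```

Each sample still points to the same centroid after the remapping, so inertia and predictions are unchanged; only the ordering becomes reproducible. Note that this fixes the order of the final centers only, it does not make the relocation step itself deterministic.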

WDYT?

jjerphan commented 1 year ago

I think your analysis and solution are good, but is it a priority? Can we mark the test as xfail for now?