astrolabsoftware/spark3D
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteorology, …
https://astrolabsoftware.github.io/spark3D/
Apache License 2.0
Make a partitioning based on the clustering of the data
#94 (Open)
JulienPeloton opened 6 years ago
JulienPeloton commented 6 years ago
Idea 1 (fixed clustering):
Load the raw data.
Run k-means with k equal to the number of partitions.
Repartition the data so that each cluster becomes one partition.
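A minimal sketch of Idea 1 in plain Python, to make the three steps concrete. It is not spark3D code: the function names (`kmeans_labels`, `repartition`), the farthest-first seeding, and the fixed iteration count are all assumptions made for illustration. In Spark, the resulting label could serve as the key handed to something like `RDD.partitionBy`.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not from the issue): each new
    center is the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))

def kmeans_labels(points, k, n_iter=20):
    """Lloyd's k-means on 3D points; returns one cluster label per point."""
    centers = farthest_first(points, k)
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment against the last set of centers.
    return [nearest(p, centers) for p in points]

def repartition(points, labels, k):
    """Group points so that cluster i becomes partition i."""
    partitions = [[] for _ in range(k)]
    for p, l in zip(points, labels):
        partitions[l].append(p)
    return partitions

# Hypothetical data: two well-separated groups of 3D points, two partitions.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (10.0, 10.0, 10.0), (10.1, 10.0, 10.0), (10.0, 10.1, 10.0)]
n_partitions = 2
labels = kmeans_labels(points, n_partitions)
partitions = repartition(points, labels, n_partitions)
```

One caveat with this approach: k-means does not balance cluster sizes, so partitions built this way can be uneven when the data are unevenly clustered.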
Idea 2 (dynamic clustering):
Load the raw data.
Search for clusters in the data dynamically: start from an initial guess for the number of clusters, then increase or decrease it if the data suggest a different number.
Repartition the data so that each cluster becomes one partition.
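One possible reading of Idea 2, sketched in plain Python. The issue does not say how the data should "suggest" a different number of clusters, so the criterion here is an assumption: keep adding a cluster while doing so removes at least a fraction `min_gain` of the within-cluster sum of squares (inertia), and drop clusters that do not meet that bar. The names `choose_k`, `min_gain`, and `k_max`, and the farthest-first seeding, are hypothetical and not part of spark3D.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: each new center is the point farthest from the chosen ones."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))

def kmeans(points, k, n_iter=20):
    """Lloyd's k-means; returns (labels, inertia)."""
    centers = farthest_first(points, k)
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    labels = [nearest(p, centers) for p in points]
    inertia = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
    return labels, inertia

def choose_k(points, k_guess, k_max, min_gain=0.5):
    """Start from k_guess and move k up or down based on the relative inertia drop."""
    k_max = min(k_max, len(points))
    fits = {}  # cache of k -> (labels, inertia)

    def fit(k):
        if k not in fits:
            fits[k] = kmeans(points, k)
        return fits[k]

    def gain(k):
        """Relative inertia drop when going from k - 1 to k clusters."""
        prev, cur = fit(k - 1)[1], fit(k)[1]
        return 0.0 if prev == 0 else (prev - cur) / prev

    k = max(1, min(k_guess, k_max))
    # Increase k while one more cluster still pays off.
    while k < k_max and gain(k + 1) >= min_gain:
        k += 1
    # Decrease k while the last cluster added was not worth it.
    while k > 1 and gain(k) < min_gain:
        k -= 1
    return k, fit(k)[0]

# Hypothetical data: three tight, well-separated groups of 3D points.
points = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.0, 0.2, 0.0),
          (10.0, 0.0, 0.0), (10.2, 0.0, 0.0), (10.0, 0.2, 0.0),
          (0.0, 10.0, 0.0), (0.2, 10.0, 0.0), (0.0, 10.2, 0.0)]
k_low, labels_low = choose_k(points, k_guess=2, k_max=6)    # guess too small
k_high, labels_high = choose_k(points, k_guess=4, k_max=6)  # guess too large
```

The threshold `min_gain` is a tuning knob with no principled default. Other criteria, such as a silhouette score, would fit the same loop.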