ArcticSnow / TopoPyScale

TopoPyScale: a Python library to perform simplistic climate downscaling at the hillslope scale
https://topopyscale.readthedocs.io
MIT License

Introduce tools to determine number of clusters #47

Closed: ArcticSnow closed this issue 1 year ago

ArcticSnow commented 1 year ago

We could add tools that help decide on an appropriate number of clusters for a job. These could be simple metrics, such as the average cluster size, but also more advanced metrics such as those presented here: https://towardsdatascience.com/how-many-clusters-6b3f220f0ef5

joelfiddes commented 1 year ago

Absolutely. I think the elbow method would be quite straightforward, as the k-means algorithm is pretty quick and repeatable. I did something similar, just as an evaluation, in Figs. 7 and 9 of the TopoSUB paper: https://gmd.copernicus.org/articles/5/1245/2012/gmd-5-1245-2012.pdf

Now that the domain size varies, it's a harder number to estimate a priori; before, I had standard 0.25 or 0.75 deg grids. It's more important to have this now.

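For illustration, the elbow method suggested above could be sketched roughly as follows with scikit-learn's `KMeans`. This is only a sketch under assumptions: the helper names (`make_demo_points`, `wcss_curve`, `find_elbow`) are hypothetical and not part of TopoPyScale, the data are synthetic blobs standing in for real terrain features, and the "furthest point from the chord" rule is just one common way of locating the elbow.

```python
import numpy as np
from sklearn.cluster import KMeans


def make_demo_points(points_per_blob=100, seed=0):
    """Synthetic stand-in for real features: four well-separated 2-D blobs."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-10.0, -10.0], [-10.0, 10.0], [10.0, -10.0], [10.0, 10.0]])
    return np.vstack(
        [c + rng.normal(scale=0.5, size=(points_per_blob, 2)) for c in centers]
    )


def wcss_curve(features, cluster_counts, seed=0):
    """Fit k-means once per candidate k and collect the WCSS (inertia)."""
    wcss = []
    for k in cluster_counts:
        model = KMeans(n_clusters=int(k), n_init=10, random_state=seed).fit(features)
        wcss.append(model.inertia_)
    return np.array(wcss)


def find_elbow(cluster_counts, wcss):
    """Return the k whose point lies furthest from the line joining the curve's ends."""
    x = np.asarray(cluster_counts, dtype=float)
    y = np.asarray(wcss, dtype=float)
    # Rescale both axes to [0, 1] so that neither dominates the distance.
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    # Perpendicular distance of every point to the first-to-last chord.
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    distance = np.abs(dx * (y[0] - y) - (x[0] - x) * dy) / np.hypot(dx, dy)
    return int(np.asarray(cluster_counts)[np.argmax(distance)])


if __name__ == "__main__":
    points = make_demo_points()
    counts = list(range(1, 11))
    curve = wcss_curve(points, counts)
    print("WCSS per k:", dict(zip(counts, curve.round(1))))
    print("Suggested number of clusters:", find_elbow(counts, curve))
```

On real terrain features the curve is usually much smoother than on these blobs, so the suggested value is better read as a starting point than as a definitive answer.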

krisaalstad commented 1 year ago

Just came across this preprint and maybe it would be helpful for this discussion: https://arxiv.org/abs/2212.12189

ArcticSnow commented 1 year ago

I just added a function that computes three types of scores: the WCSS (within-cluster sum of squares) and the two others mentioned in the publication. They were readily available in scikit-learn.

See: https://github.com/ArcticSnow/TopoPyScale/commit/f3b331eae8b02445e4d1d3f46b405bc2fa302a20#diff-a1f8f04d8bdf436ee1f0c4dbcfeebca3963a12cc97bbb8240693e62fe399ae00R126
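As a rough illustration of what such a scoring function might look like (not the code from the commit above), here is a minimal sketch. The thread does not name the two other scores, so this assumes they are the Davies-Bouldin and Calinski-Harabasz indices, both of which exist in `sklearn.metrics`. The function name `cluster_scores`, the synthetic data, and the extra mean cluster size column are hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, davies_bouldin_score


def make_demo_points(points_per_blob=100, seed=0):
    """Synthetic stand-in for real features: four well-separated 2-D blobs."""
    rng = np.random.default_rng(seed)
    centers = np.array([[-10.0, -10.0], [-10.0, 10.0], [10.0, -10.0], [10.0, 10.0]])
    return np.vstack(
        [c + rng.normal(scale=0.5, size=(points_per_blob, 2)) for c in centers]
    )


def cluster_scores(features, cluster_counts, seed=0):
    """Return {k: scores} for each candidate k (k must be at least 2)."""
    results = {}
    for k in cluster_counts:
        model = KMeans(n_clusters=int(k), n_init=10, random_state=seed).fit(features)
        labels = model.labels_
        results[int(k)] = {
            # Within-cluster sum of squares: lower is tighter, always drops with k.
            "wcss": float(model.inertia_),
            # Assumed score 1: Davies-Bouldin index, lower is better.
            "davies_bouldin": float(davies_bouldin_score(features, labels)),
            # Assumed score 2: Calinski-Harabasz index, higher is better.
            "calinski_harabasz": float(calinski_harabasz_score(features, labels)),
            # Simple metric from the first comment: average cluster size.
            "mean_cluster_size": float(np.bincount(labels, minlength=int(k)).mean()),
        }
    return results


if __name__ == "__main__":
    scores = cluster_scores(make_demo_points(), range(2, 9))
    for k, row in scores.items():
        print(k, {name: round(value, 2) for name, value in row.items()})
```

Both assumed indices need at least two clusters, which is why the candidate range starts at 2. Unlike the WCSS, they do not decrease monotonically with the number of clusters, so their extrema can be read off directly.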

ArcticSnow commented 1 year ago

This feature is now implemented in topo_sub and added as functionality to topoclass.