ml-lib / CodeLib

Code library for common machine learning algorithms
BSD 3-Clause "New" or "Revised" License

[Feature]: Clustering: Optimal k #1

Closed by bdiptesh 2 years ago

bdiptesh commented 2 years ago

Is your feature request related to a problem? Please describe.

A clustering module that clusters any given data (categorical/continuous/ordinal) and returns the optimal clustering solution.

Describe the solution you'd like

Compute the optimal clustering solution (the optimal number of clusters, k) using the gap statistic.
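For reference, here is a sketch of the gap statistic as defined by Tibshirani, Walther and Hastie (2001). The notation (W_k, B, s_k) comes from that paper, not from this issue, and the squared-Euclidean form of W_k assumes numeric features:

```latex
% Pooled within-cluster dispersion for k clusters C_1, ..., C_k
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} \sum_{i, i' \in C_r} d_{ii'}

% Gap statistic, estimated from B reference datasets drawn from a
% null (no-cluster) distribution, with dispersions W^{*}_{kb}
\mathrm{Gap}(k) = \frac{1}{B} \sum_{b=1}^{B} \log W^{*}_{kb} - \log W_k

% Standard error of the reference term
\bar{\ell}_k = \frac{1}{B} \sum_{b=1}^{B} \log W^{*}_{kb}, \qquad
s_k = \sqrt{1 + \frac{1}{B}} \,
      \sqrt{\frac{1}{B} \sum_{b=1}^{B} \bigl( \log W^{*}_{kb} - \bar{\ell}_k \bigr)^2}
```

A larger Gap(k) means the observed clustering is tighter than would be expected under the null reference.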

Methods:

  1. First SE
  2. Maximum Gap
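
The two methods could be sketched as below, given gap values and standard errors already computed for k = 1..max_cluster. The function name `select_k`, the method strings `"first_se"` and `"max_gap"`, and the fallback to the largest k are assumptions made for illustration. "First SE" is read here as the Tibshirani et al. rule: the smallest k with Gap(k) >= Gap(k+1) - s(k+1).

```python
from typing import Sequence


def select_k(gap: Sequence[float], se: Sequence[float],
             method: str = "first_se") -> int:
    """Pick the number of clusters from gap-statistic values.

    gap[i] and se[i] are assumed to correspond to k = i + 1.
    """
    if not gap or len(gap) != len(se):
        raise ValueError("gap and se must be non-empty and of equal length")

    if method == "max_gap":
        # k with the largest gap value (first one wins on ties)
        return max(range(len(gap)), key=lambda i: gap[i]) + 1

    if method == "first_se":
        # smallest k such that Gap(k) >= Gap(k+1) - s(k+1)
        for i in range(len(gap) - 1):
            if gap[i] >= gap[i + 1] - se[i + 1]:
                return i + 1
        # assumption: fall back to the largest k tried if no k qualifies
        return len(gap)

    raise ValueError(f"unknown method: {method!r}")


# Made-up values for k = 1..4, for illustration only
gap_values = [0.10, 0.45, 0.50, 0.48]
se_values = [0.02, 0.03, 0.08, 0.03]
print(select_k(gap_values, se_values, "first_se"))  # 2
print(select_k(gap_values, se_values, "max_gap"))   # 3
```

With these values the two methods disagree: the gap peaks at k = 3, but the gain over k = 2 is within one standard error, so the first-SE rule picks the smaller solution.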

Expected input(s)

df: pandas.DataFrame
x_var: List[str]
max_cluster: int
method: str

Expected output(s)

opt_k: int

Additional context

No response

Acceptance criteria

Integration tests:

Version

v0.4.0