Reason (Why?)
This integration adds unsupervised learning tasks such as anomaly detection and clustering.
Solution (What?)
[x] #476
autocluster: No longer maintained; we would need our own fork.
pycaret: Project is well maintained and documented. Adapter already exists.
H2O: Project is well maintained and documented. Adapter already exists.
[x] Check how the solutions could be implemented. How can the unsupervised learning ontology be implemented?
pycaret: An adapter already exists for supervised learning, so unsupervised learning could be implemented directly in it. The ontology is already prepared for unsupervised learning (as seen in ML_area, ML_Task, ML_Approach, AutoMLSolution).
The exception is the task-specific parameters: the pycaret Config Parameter will probably need an update.
[x] Identify which hyperparameters are relevant and global for the model library.
num_clusters = x
ML_model = [
'kmeans' - K-Means Clustering
'ap' - Affinity Propagation
'meanshift' - Mean Shift Clustering
'sc' - Spectral Clustering
'hclust' - Agglomerative Clustering
'dbscan' - Density-Based Spatial Clustering
'optics' - OPTICS Clustering
'birch' - Birch Clustering
'kmodes' - K-Modes Clustering
] models
Problem: the model is not selected automatically; the user has to choose one. Default: kmeans
Parameters: pycaret Parameters
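The model choice and its default could be resolved in the adapter roughly as sketched below. This is a minimal sketch, not the adapter's actual code: `resolve_cluster_config`, `CLUSTER_MODELS`, and the shape of the `params` dict are hypothetical names introduced for illustration. Only the model IDs, the `num_clusters` parameter, and the `kmeans` default come from the list above.

```python
from typing import Optional, Tuple

# Model IDs from the list above; the display names are used in error messages only.
CLUSTER_MODELS = {
    "kmeans": "K-Means Clustering",
    "ap": "Affinity Propagation",
    "meanshift": "Mean Shift Clustering",
    "sc": "Spectral Clustering",
    "hclust": "Agglomerative Clustering",
    "dbscan": "Density-Based Spatial Clustering",
    "optics": "OPTICS Clustering",
    "birch": "Birch Clustering",
    "kmodes": "K-Modes Clustering",
}
DEFAULT_MODEL = "kmeans"  # nothing is auto-selected, so fall back to k-means


def resolve_cluster_config(params: dict) -> Tuple[str, Optional[int]]:
    """Return (model_id, num_clusters) from a hypothetical adapter config dict."""
    model_id = params.get("ML_model", DEFAULT_MODEL)
    if model_id not in CLUSTER_MODELS:
        known = ", ".join(sorted(CLUSTER_MODELS))
        raise ValueError(f"Unknown clustering model {model_id!r}; expected one of: {known}")

    # num_clusters stays None when the user did not set it, so the
    # library's own default applies instead of a value invented here.
    num_clusters = params.get("num_clusters")
    if num_clusters is not None:
        num_clusters = int(num_clusters)
        if num_clusters < 1:
            raise ValueError("num_clusters must be a positive integer")
    return model_id, num_clusters


# The resolved values would then be handed to pycaret, for example
# pycaret.clustering.create_model(model_id, num_clusters=num_clusters)
# (not executed in this sketch).
model_id, num_clusters = resolve_cluster_config({"ML_model": "hclust", "num_clusters": 5})
```

Leaving `num_clusters` unset rather than hard-coding a fallback keeps the sketch from guessing the value of `x` above.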
[ ] #506
[ ] Check results of Anomaly Detection for pycaret
[ ] Remove the target from the frontend selection (first step on the table view page)
[ ] Scoring for cluster results (pycaret analyze model)
The metrics for clustering are not well suited to comparing the results of different algorithms, because they favor certain algorithms or do not fit certain data sets. We therefore decided to save the model of every approach and to let one adapter save multiple models. Previously, each adapter could create only one model per training.
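Saving one model per approach could look roughly like the sketch below. It is only an illustration under stated assumptions: `train_and_save_all`, the `train_fn` callback, and the one-pickle-file-per-model layout are hypothetical and not taken from the adapter code. A real adapter would call pycaret inside `train_fn`.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, Iterable


def train_and_save_all(
    model_ids: Iterable[str],
    train_fn: Callable[[str], Any],
    out_dir: str,
) -> Dict[str, Path]:
    """Train every requested approach and save each result to its own file."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    saved: Dict[str, Path] = {}
    for model_id in model_ids:
        model = train_fn(model_id)  # hypothetical hook: train one approach
        path = target / f"{model_id}.pkl"
        with path.open("wb") as fh:
            pickle.dump(model, fh)
        saved[model_id] = path  # one saved model per approach, none discarded
    return saved


# Usage with a stand-in trainer that returns a plain dict instead of a fitted model.
with tempfile.TemporaryDirectory() as tmp:
    saved = train_and_save_all(["kmeans", "dbscan"], lambda mid: {"algorithm": mid}, tmp)
    names = sorted(p.name for p in saved.values())
    loaded = pickle.loads(saved["kmeans"].read_bytes())
```

Because no model is ranked against another, the choice between results is left to the user, which matches the reasoning above about clustering metrics.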