uncharted-distil / distil-auto-ml

Distil Automated Machine Learning Server
Apache License 2.0

SEMI_1053_jm1 pipeline failing #99

Closed cdbethune closed 5 years ago

cdbethune commented 5 years ago

The other semi-supervised pipelines have numeric labels, which the HDBSCAN primitive can work with. jm1 has labels that are True or False, which the HDBSCAN step can't handle. This can be resolved by encoding the labels prior to the clustering step, although it is a bit more complicated because the clusterer is a dataset -> dataframe primitive, so anything running before the clustering step has to be wrapped in the dataset_map primitive.
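As a rough illustration of the encoding idea only, the sketch below maps arbitrary label values to integer codes in plain Python. It is not the D3M primitive or the actual pipeline change: `encode_labels` and `decode_labels` are hypothetical helpers, and treating empty strings or `None` as unlabeled rows (encoded as -1) is an assumption about how the semi-supervised data marks missing labels.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def encode_labels(
    labels: Sequence[Optional[Hashable]], unlabeled: int = -1
) -> Tuple[List[int], Dict[Hashable, int]]:
    """Map each distinct label to an integer code, in order of first appearance.

    Assumption: None or "" marks an unlabeled row, which gets the `unlabeled` code.
    """
    mapping: Dict[Hashable, int] = {}
    codes: List[int] = []
    for label in labels:
        if label is None or label == "":
            codes.append(unlabeled)
            continue
        if label not in mapping:
            mapping[label] = len(mapping)
        codes.append(mapping[label])
    return codes, mapping


def decode_labels(
    codes: Sequence[int], mapping: Dict[Hashable, int], unlabeled: int = -1
) -> List[Optional[Hashable]]:
    """Invert encode_labels so predictions can be reported as the original labels."""
    inverse = {code: label for label, code in mapping.items()}
    return [None if code == unlabeled else inverse[code] for code in codes]


# Example with jm1-style text labels and two unlabeled rows.
raw = ["True", "False", "", "True", None]
codes, mapping = encode_labels(raw)
restored = decode_labels(codes, mapping)
```

Keeping the mapping around matters because the predictions coming out of the pipeline would need to be decoded back to the original True/False values before scoring.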

cdbethune commented 5 years ago

Pipeline was modified so that it supports text labels.