IBMPredictiveAnalytics / K_Means_with_MLlib

SPSS Modeler Extension to execute PySpark MLlib implementation of K-Means Clustering
Apache License 2.0

Model building fails for data containing characters outside the ASCII range #3

ghost closed this issue 7 years ago

ghost commented 7 years ago

Should be fixed. The issues were:

(1) The Python scripts need to be set up to work with UTF-8 (encoding=utf-8).

(2) Python 2 blurs the line between unicode and str (UTF-8 encoded bytes), so the scripts need to be careful when comparing str and unicode objects.
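For anyone hitting the same thing, here is a minimal sketch of the kind of change both points describe. It is not the actual patch: the helper names to_text and same_name and the sample value are made up for illustration. It assumes point (1) refers to the source encoding declaration, and it handles point (2) by decoding byte strings to text before any comparison.

```python
# -*- coding: utf-8 -*-
# The line above declares the source file encoding (one reading of point 1).
from __future__ import unicode_literals


def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string, decoding bytes if needed.

    Hypothetical helper, not taken from the extension's scripts.
    On Python 2, bytes is an alias for str, so this also covers plain str.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def same_name(a, b):
    """Compare two names after normalizing both sides to text."""
    return to_text(a) == to_text(b)


# Illustrative value only: the same field name as UTF-8 bytes and as text.
encoded = "Größe".encode("utf-8")
decoded = "Größe"

# A direct encoded == decoded comparison does not match, because one side
# is bytes and the other is text. Normalizing first makes them compare equal.
print(same_name(encoded, decoded))
```

The idea is to decode once at the boundary where data enters the script and to work with text everywhere after that, so no comparison ever mixes the two types.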