I am getting a TypeError in the current version of this module. Whether it appears depends on the number of clusters I request: on the same dataset, with 2 clusters requested I never see the error, with 4 clusters I see it sometimes, and with 10 I always see it.
```
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 430, in fit
    self.n_clusters,self.max_dist_iter)
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 271, in k_modes_partitioned
    clusters = check_for_empty_cluster(clusters, rdd)
File "/usr/local/lib/python3.5/dist-packages/pyspark_kmodes/pyspark_kmodes.py", line 315, in check_for_empty_cluster
    partition_sizes = cluster_sizes[n_clusters(partition_index):n_clusters(partition_index+1)]
TypeError: slice indices must be integers or None or have an __index__ method
```
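For what it's worth, the error itself is easy to reproduce outside the library: in Python 3, `/` always returns a float, and a float is not a valid slice index. A minimal sketch of the same failure (hypothetical variable names, not the library's actual code):

```python
# Simulated per-partition cluster sizes (hypothetical data).
cluster_sizes = [5, 3, 4, 2]

# In Python 3, "/" yields a float even when the division is exact,
# so a computed index like this cannot be used in a slice.
n = 10 / 5  # 2.0, a float

try:
    cluster_sizes[0:n]  # raises TypeError: slice indices must be integers ...
except TypeError as exc:
    print(exc)

# Casting to int (or computing with integer division "//") avoids the error.
print(cluster_sizes[0:int(n)])
```

If something similar is happening inside `check_for_empty_cluster`, a possible cause is a slice bound being computed with `/` instead of `//`, though that is only a guess without seeing the library's line 315 in context.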
This is on Spark 2.2. Any ideas would be appreciated.