zqhZY / short_text_cnn_cluster

Implementation of the paper Self-Taught Convolutional Neural Networks for Short Text Clustering using Keras.

I am getting errors #5

Closed ans92 closed 3 years ago

ans92 commented 3 years ago

Hi. First of all, thank you for this very helpful code. I am a beginner, and we are trying to reproduce the results of this experiment. I am trying to run the code on Google Colab. Because of resource limits, I shortened the data set to the first 200 rows. I ran the following commands in Google Colab:

!python3 /content/drive/MyDrive/IRTM_Term_Project_Sem_1/short_text_cnn_cluster-master/utils.py
!python3 /content/drive/MyDrive/IRTM_Term_Project_Sem_1/short_text_cnn_cluster-master/train_cnn.py

When I run the second command, I get the errors below. Please help me resolve them:

/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py:88: RuntimeWarning: divide by zero encountered in log
  n_local_trials = 2 + int(np.log(n_clusters))
/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py:88: RuntimeWarning: divide by zero encountered in log
  n_local_trials = 2 + int(np.log(n_clusters))
/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py:88: RuntimeWarning: divide by zero encountered in log
  n_local_trials = 2 + int(np.log(n_clusters))
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/joblib/externals/loky/process_executor.py", line 431, in _process_worker
    r = call_item()
  File "/usr/local/lib/python3.6/dist-packages/joblib/externals/loky/process_executor.py", line 285, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/dist-packages/joblib/_parallel_backends.py", line 595, in __call__
    return self.func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/usr/local/lib/python3.6/dist-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py", line 314, in _kmeans_single_elkan
    x_squared_norms=x_squared_norms)
  File "/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py", line 626, in _init_centroids
    x_squared_norms=x_squared_norms)
  File "/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py", line 88, in _k_init
    n_local_trials = 2 + int(np.log(n_clusters))
OverflowError: cannot convert float infinity to integer
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/content/drive/MyDrive/IRTM_Term_Project_Sem_1/short_text_cnn_cluster-master/train_cnn.py", line 157, in <module>
    km.fit(V)
  File "/usr/local/lib/python3.6/dist-packages/sklearn/cluster/_kmeans.py", line 956, in fit
    for seed in seeds)
  File "/usr/local/lib/python3.6/dist-packages/joblib/parallel.py", line 1054, in __call__
    self.retrieve()
  File "/usr/local/lib/python3.6/dist-packages/joblib/parallel.py", line 933, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/usr/local/lib/python3.6/dist-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
OverflowError: cannot convert float infinity to integer

ans92 commented 3 years ago

I used a data set of 20,000 rows and the issue was resolved. I had first tried a smaller data set. I do not know the reason behind this.
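
A note on the likely cause: the RuntimeWarning and the final OverflowError both come from the line `n_local_trials = 2 + int(np.log(n_clusters))`. A "divide by zero encountered in log" warning means `np.log` received 0, so `n_clusters` appears to have been 0 when `km.fit(V)` was called on the 200-row subset. How `train_cnn.py` arrives at that value is not shown in this thread. The sketch below only reproduces the failing expression in isolation. `local_trials` and `check_n_clusters` are hypothetical helper names used for illustration and are not part of the repository or of scikit-learn.

```python
import warnings

import numpy as np


def local_trials(n_clusters):
    """Evaluate the same expression as the line shown in the traceback."""
    return 2 + int(np.log(n_clusters))


def check_n_clusters(n_clusters, n_samples):
    """Hypothetical guard: fail early with a readable message before clustering."""
    if n_clusters < 1:
        raise ValueError(f"n_clusters must be at least 1, got {n_clusters}")
    if n_clusters > n_samples:
        raise ValueError(
            f"n_clusters={n_clusters} exceeds the number of samples ({n_samples})"
        )
    return n_clusters


# Reproduce the failure: np.log(0) is -inf, and int(-inf) raises OverflowError.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)  # silence the divide-by-zero warning
    try:
        local_trials(0)
        reproduced = None
    except OverflowError as exc:
        reproduced = str(exc)

print(reproduced)
```

Calling a guard such as `check_n_clusters` on the cluster count before `km.fit(V)` would turn the OverflowError raised inside the joblib worker into an immediate, readable error. Whether that matches what happened with the 200-row subset is an assumption that should be confirmed by printing the cluster count in `train_cnn.py`.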