bioinfocao / pysapc

python package for Sparse Affinity Propagation (SAP) Clustering method
BSD 3-Clause "New" or "Revised" License

expected 'DTYPEint_t' but got 'long long' #4

Open Kiord opened 1 year ago

Kiord commented 1 year ago

Hello,

I installed the library from source using python setup.py install and ran the tests, but both fail with the same error:

>>> from pysapc import tests
>>> tests.testDense()
2023-04-05 10:48:52.494901, start SKlearn Affinity Propagation
<path_to_my_conda_env>\lib\site-packages\sklearn\utils\validation.py:723: FutureWarning: np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html
  warnings.warn(
Converged after 112 iterations.
2023-04-05 10:48:55.330916, start Fast Sparse Affinity Propagation Cluster
2023-04-05 10:48:55.386707, Starting Sparse Affinity Propagation
2023-04-05 10:48:55.538004, Starting sparseMatrixPrepare.rmSingleSamples
2023-04-05 10:48:55.578824, Starting sparseMatrixPrepare.preCompute
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\tests\test_sap.py", line 106, in testDense
    exemplars_similarity=clusterSimilarityWithSklearnAPC(data_file=dense_similarity_matrix_file,damping=0.9,max_iter=200,convergence_iter=15,preference='min')
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\tests\test_sap.py", line 66, in clusterSimilarityWithSklearnAPC
    sap_exemplars=sap.fit_predict(simi_mat_dense)
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\SparseAPCluster.py", line 447, in fit_predict
    self.exemplars_=sparseAffinityPropagation(row_array,col_array,data_array,\
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\SparseAPCluster.py", line 198, in sparseAffinityPropagation
    sparseMatrixPrepare.preCompute(rowBased_row_array,rowBased_col_array,S_rowBased_data_array)
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\sparseMatrixPrepare.py", line 111, in preCompute
    colBased_row_array=sparseAP_cy.npArrRearrange_int_para(rowBased_row_array,row_to_col_ind_arr)
  File "sparseAP_cy.pyx", line 222, in sparseAP_cy.npArrRearrange_int_para
    cpdef npArrRearrange_int_para(DTYPEint_t[::1] arr,DTYPEint_t[::1] ind):
ValueError: Buffer dtype mismatch, expected 'DTYPEint_t' but got 'long long'
>>> tests.testSparse()
2023-04-05 10:48:58.443679, start Sparse Affinity Propagation with dense matrix
2023-04-05 10:48:58.487899, Starting Sparse Affinity Propagation
2023-04-05 10:48:58.747755, Starting sparseMatrixPrepare.rmSingleSamples
2023-04-05 10:48:58.795544, Starting sparseMatrixPrepare.preCompute
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\tests\test_sap.py", line 117, in testSparse
    exemplars_similarity=clusterSimilarityWithDenseMatrix(data_file=dense_similarity_matrix_file,cutoff=cutoff,damping=0.9,max_iter=500,convergence_iter=15,preference='min')
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\tests\test_sap.py", line 90, in clusterSimilarityWithDenseMatrix
    sap_dense_exemplars=sap_dense.fit_predict(simi_mat_dense)
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\SparseAPCluster.py", line 447, in fit_predict
    self.exemplars_=sparseAffinityPropagation(row_array,col_array,data_array,\
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\SparseAPCluster.py", line 198, in sparseAffinityPropagation
    sparseMatrixPrepare.preCompute(rowBased_row_array,rowBased_col_array,S_rowBased_data_array)
  File "<path_to_my_conda_env>\Lib\site-packages\pysapc-1.2.0-py3.8-win-amd64.egg\pysapc\sparseMatrixPrepare.py", line 111, in preCompute
    colBased_row_array=sparseAP_cy.npArrRearrange_int_para(rowBased_row_array,row_to_col_ind_arr)
  File "sparseAP_cy.pyx", line 222, in sparseAP_cy.npArrRearrange_int_para
    cpdef npArrRearrange_int_para(DTYPEint_t[::1] arr,DTYPEint_t[::1] ind):
ValueError: Buffer dtype mismatch, expected 'DTYPEint_t' but got 'long long'

Relevant package versions:

python==3.8.1
numpy==1.21.5
scipy==1.8.0
pandas==1.4.2
Cython==0.29.28
Kiord commented 1 year ago

It appears that np.lexsort returns an int64 array even when both input arrays are int32.

Quick fixes:

Change line 110 in sparseMatrixPrepare.py from

row_to_col_ind_arr=np.lexsort((rowBased_row_array,rowBased_col_array))

to

row_to_col_ind_arr=np.lexsort((rowBased_row_array,rowBased_col_array)).astype(np.int32)
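The dtype behaviour behind this fix can be checked in isolation. The snippet below is a minimal sketch with made-up index arrays (the names rows and cols are illustrative, not taken from pysapc): np.lexsort returns index arrays of dtype np.intp, which is 64-bit on a 64-bit build, whatever the dtype of its keys, so an explicit cast is needed before passing the result to code that expects 32-bit integers.

```python
import numpy as np

# Illustrative int32 keys standing in for the row/column index arrays.
rows = np.array([0, 1, 0, 2], dtype=np.int32)
cols = np.array([1, 0, 0, 2], dtype=np.int32)

# lexsort uses the LAST key (cols) as the primary sort key, then rows.
order = np.lexsort((rows, cols))

# The result is an index array of dtype np.intp, not int32.
print(order.dtype, order.dtype == np.intp)

# Explicit cast, as in the proposed fix.
order32 = order.astype(np.int32)
print(order32.dtype)
```

Note that astype(np.int32) does not check for overflow, so this assumes the number of stored entries fits in a 32-bit integer.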

Also, sklearn's AffinityPropagation.cluster_centers_indices_ is int64, so:

Change line 63 in test_sap.py from

sk_exemplars=np.asarray([cluster_centers_indices[i] for i in labels])

to

sk_exemplars=np.asarray([cluster_centers_indices[i] for i in labels], dtype=np.int32)
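If silent wrap-around on the int32 cast is a concern, the conversion can be guarded. The following is only a sketch: to_int32_checked is a hypothetical helper, not part of pysapc or sklearn, and the arrays are made-up stand-ins for sklearn's int64 outputs.

```python
import numpy as np

def to_int32_checked(values):
    """Cast integer data to int32, raising instead of silently wrapping.

    Hypothetical helper for illustration; not part of pysapc.
    """
    arr = np.asarray(values)
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError(f"expected integer data, got {arr.dtype}")
    info = np.iinfo(np.int32)
    # Check the value range before narrowing the dtype.
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise OverflowError("values do not fit in int32")
    return arr.astype(np.int32)

# Made-up stand-ins for sklearn's int64 outputs.
cluster_centers_indices = np.array([3, 7, 9], dtype=np.int64)
labels = np.array([0, 0, 1, 2, 1], dtype=np.int64)

# Fancy indexing maps each label to its exemplar index in one step.
sk_exemplars = to_int32_checked(cluster_centers_indices[labels])
print(sk_exemplars.dtype, sk_exemplars.tolist())
```

Indexing with the labels array does the same job as the list comprehension in test_sap.py, and the range check turns an out-of-range index into an explicit error.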