Closed AlanGanem closed 5 years ago
https://stackoverflow.com/questions/39452792/cannot-cast-array-data-from-dtypeo-to-dtypefloat64 Some of the comments on the answer might be helpful.
Thank you for your answer! I cast the matrix to np.intp, but now I get the following error at self._data = np.float32(data): ValueError: setting an array element with a sequence.
https://stackoverflow.com/questions/4674473/valueerror-setting-an-array-element-with-a-sequence Have you googled this first?
I'm trying to train a SOM on a sparse matrix (csr_matrix format) in Python 3.6. It contains around 500,000 rows (ad titles) and around 60,000 columns (vocabulary), and each row has an average of 7 nonzero values. Here's the code I've written so far:
`import os
import pickle
import somoclu
# Load the pickled csr_matrix produced by another script.
data_dir = r'C:\Users\ganem\Desktop\2Vintens\dados\Historical data\janeiro_2019\dados_tratados'
g = os.path.join(data_dir, 'cv_matrix_janeiro')
with open(g, 'rb') as f:
    cv_matrix = pickle.load(f)
# Cast to float32 and keep only the first 500 rows for a quick test.
cv_matrix_test = cv_matrix.astype('float32', copy=False)[:500]
n_rows, n_columns = 20, 20
som = somoclu.Somoclu(n_columns, n_rows, compactsupport=False)
som.train(cv_matrix_test)`
The cv_matrix was saved and loaded with pickle from another piece of code, and I've sliced the matrix just to test the module. It turns out I keep getting the same error over and over:
`Traceback (most recent call last):
File "", line 1, in
som.train(cv_matrix_test)
File "C:\Users\ganem\AppData\Local\Programs\Python\Python36\Lib\site-packages\somoclu\train.py", line 228, in train
self.umatrix)
TypeError: Cannot cast array data from dtype('O') to dtype('float32') according to the rule 'safe'`
I've already cast the csr_matrix to float32 beforehand, but I keep getting the same error.
Does anyone know what might be going on?
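One possible explanation for the dtype('O') in the message, shown as a minimal sketch rather than a confirmed diagnosis of somoclu's internals: NumPy does not unpack a SciPy sparse matrix into its values, so converting one with np.asarray wraps the whole matrix object in a 0-d array of dtype object, regardless of the dtype stored inside the matrix. The 3x3 matrix below is a hypothetical stand-in for cv_matrix_test.

```python
import numpy as np
from scipy import sparse

# Small stand-in for the real matrix (hypothetical data).
m = sparse.csr_matrix(np.eye(3, dtype=np.float32))

# The sparse matrix itself reports a numeric dtype...
print(m.dtype)

# ...but NumPy wraps the whole object instead of reading its values,
# producing a 0-d array whose dtype is object ('O').
wrapped = np.asarray(m)
print(wrapped.dtype, wrapped.shape)
```

If this is what happens inside train, casting the csr_matrix to another numeric dtype beforehand would not be expected to change the outcome.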
Update 1: following xgdgsc's comment, I cast the csr_matrix to np.intp, which solved the 'O' dtype problem, but now I get the following error:
' File "C:\Users\ganem\AppData\Local\Programs\Python\Python36\Lib\site-packages\somoclu\train.py", line 218, in train
self.update_data(data)
File "C:\Users\ganem\AppData\Local\Programs\Python\Python36\Lib\site-packages\somoclu\train.py", line 243, in update_data
self._data = np.float32(data)
ValueError: setting an array element with a sequence.'
It seems that the function does not recognize the csr_matrix as a valid input type and tries to cast it with np.float32, without success.
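For a small test slice, one way around this is to densify explicitly before handing the data over. The sketch below assumes that train expects a dense 2-D float32 NumPy array, which the np.float32(data) line in the traceback suggests but which is not confirmed here. The helper name as_dense_float32 and the 2x3 sample are hypothetical.

```python
import numpy as np
from scipy import sparse

def as_dense_float32(data):
    """Return a dense float32 array, densifying SciPy sparse input first."""
    if sparse.issparse(data):
        # toarray() allocates rows * columns values, so use it on small slices only.
        data = data.toarray()
    return np.asarray(data, dtype=np.float32)

# Hypothetical 2x3 sparse sample with two nonzero entries.
sample = sparse.csr_matrix(
    (np.array([1.0, 2.0], dtype=np.float32), ([0, 1], [2, 0])),
    shape=(2, 3),
)
dense = as_dense_float32(sample)
print(type(dense).__name__, dense.dtype, dense.shape)
```

This only helps when the dense result fits in memory, so it suits a sliced sample and not the full matrix.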
Update 2: I cannot convert the sparse matrix into an np.array, since it won't fit in memory.
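A rough back-of-the-envelope estimate of the sizes involved, using the figures given above (about 500,000 rows, 60,000 columns, 7 nonzeros per row, float32 values). The 4-byte index width for the CSR arrays is an assumption, and the function names are illustrative only.

```python
BYTES_FLOAT32 = 4
BYTES_INT32 = 4  # assumed width of the CSR index arrays

def dense_bytes(rows, cols):
    """Memory for a dense float32 array of the given shape."""
    return rows * cols * BYTES_FLOAT32

def csr_bytes(rows, nnz):
    """Memory for CSR storage: data + column indices + row pointers."""
    return nnz * (BYTES_FLOAT32 + BYTES_INT32) + (rows + 1) * BYTES_INT32

rows, cols, nnz_per_row = 500_000, 60_000, 7

full_dense = dense_bytes(rows, cols)
full_csr = csr_bytes(rows, rows * nnz_per_row)
slice_dense = dense_bytes(500, cols)  # the 500-row test slice

print(f"full matrix, dense: {full_dense / 1e9:.0f} GB")   # 120 GB
print(f"full matrix, CSR:   {full_csr / 1e6:.0f} MB")     # 30 MB
print(f"500-row slice, dense: {slice_dense / 1e6:.0f} MB")  # 120 MB
```

Under these assumptions the full dense matrix needs on the order of 120 GB, while the 500-row test slice is only about 120 MB when dense.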