I was applying implicit's AlternatingLeastSquares with CUDA on a matrix of shape (4091537, 5256984). The number of factors was set to 1011.
In doing so, AlternatingLeastSquares was failing with the following traceback:
Traceback (most recent call last):
File "/path/to/my/project/demo.py", line 18, in <module>
model.fit(customer_product_matrix)
File "/path/to/my/venv/lib/python3.9/site-packages/implicit/gpu/als.py", line 162, in fit
self.solver.least_squares(Ciu, Y, _XtX, X, self.cg_steps)
File "_cuda.pyx", line 258, in implicit.gpu._cuda.LeastSquaresSolver.least_squares
RuntimeError: Cuda Error: an illegal memory access was encountered (/project/implicit/gpu/als.cu:196)
terminate called after throwing an instance of 'std::runtime_error'
what(): Cuda Error: an illegal memory access was encountered (/project/implicit/gpu/matrix.cu:246)
Aborted (core dumped)
The last line in the stack trace is a follow-up error referring to the destructor of CSRMatrix.
I tracked the original error down to the CUDA kernel. To be precise, to the lines 38 and 39 in als.cu (referring to implicit version 0.7.2):
for (int u = blockIdx.x; u < user_count; u += gridDim.x) {
T *x = &X[u * factors];
Both u and factors are of type int and u may go up to user_count-1. Considering the aforementioned matrix dimensions, there are values for u where this product exceeds the maximum value for integers (2147483647) in C++ leading to a negative array index.
By using size_t as type for u, this can be prevented.
Side note: it is remarkable that the second call to least_squares (line 162 in als.py) is failing even though the first call (line 159 in als.py) already produces negative array indexes.
I was applying implicit's AlternatingLeastSquares with CUDA on a matrix of shape (4091537, 5256984). The number of factors was set to 1011.
In doing so, AlternatingLeastSquares was failing with the following traceback:
The last line in the stack trace is a follow-up error referring to the destructor of CSRMatrix.
I tracked the original error down to the CUDA kernel. To be precise, to the lines 38 and 39 in als.cu (referring to implicit version 0.7.2):
Both
u
andfactors
are of typeint
andu
may go up touser_count-1
. Considering the aforementioned matrix dimensions, there are values foru
where this product exceeds the maximum value for integers (2147483647) in C++ leading to a negative array index.By using
size_t
as type foru
, this can be prevented.Side note: it is remarkable that the second call to
least_squares
(line 162 in als.py) is failing even though the first call (line 159 in als.py) already produces negative array indexes.