benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License
3.57k stars 612 forks source link

bug: TypeError when converting a gpu model to cpu then to gpu #597

Closed lfaucon closed 2 years ago

lfaucon commented 2 years ago

Steps to replicate

  1. Train a model implicit.gpu.als.AlternateLeastSquares
  2. call to_cpu then to_gpu on it
  3. calling recalculate_user now causes the error TypeError: __cinit__() takes exactly 1 positional argument (2 given)

Full example:

> als_model
<implicit.gpu.als.AlternatingLeastSquares object at 0x7fdcbc153460>
> als_model.recalculate_user(list(range(50)), u_i_matrix)
Matrix([[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]])
> als_model = als_model.to_cpu().to_gpu()
> als_model.recalculate_user(list(range(50)), u_i_matrix)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/storage/lofauco/recommender-systems/venv-recsys/lib64/python3.8/site-packages/implicit/gpu/als.py", line 176, in recalculate_user
    Cui, user_factors, self.YtY, self.item_factors, cg_steps=self.factors
  File "/storage/lofauco/recommender-systems/venv-recsys/lib64/python3.8/site-packages/implicit/gpu/als.py", line 261, in YtY
    self._YtY = implicit.gpu.Matrix(self.factors, self.factors)
  File "_cuda.pyx", line 87, in implicit.gpu._cuda.Matrix.__cinit__
TypeError: __cinit__() takes exactly 1 positional argument (2 given)

Note: The double conversion is not a common usecase. I mention it here only as the simplest way to replicate the error. The error occurs also when saving and loading the model.

Proposed fix

The class implicit.gpu.Matrix defined here https://github.com/benfred/implicit/blob/main/implicit/gpu/_cuda.pyx#L87 is used here incorrectly https://github.com/benfred/implicit/blob/main/implicit/gpu/als.py#L271

XtX and YtY should be initialized as self._XtX = implicit.gpu.Matrix.zeros(self.factors, self.factors)

benfred commented 2 years ago

Thanks for the bug report! It looks like the existing test suite misses this - we have tests to_cpu/to_gpu conversion, as well as the save/load code, but there aren't any tests that test out the recalculate_user/item code after model.save or to_cpu . It seems like this bug has existed since I originally added the recalculate_user/item support to the GPU ALS model https://github.com/benfred/implicit/pull/515 =(

I have a fix in https://github.com/benfred/implicit/pull/598