ALS missing _knn after unpickling on GPU

mdekstrand commented 1 year ago

I'm putting the finishing touches on the new LensKit-Implicit bridge, and mostly have it working, except for pickling on the GPU.

The _knn field is not initialized, even to None, when a GPU AlternatingLeastSquares object is unpickled.

I suspect that moving the default None initializers out of __init__ and into the class body of MatrixFactorizationBase would fix this. They would then be defined on the class, so they are available even when __init__ has not run for a specific object, and they are still properly overridden when assigned in, e.g., the knn accessor.
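To illustrate the idea, here is a minimal sketch using hypothetical stand-in classes (InitOnlyModel and ClassDefaultModel are made up for this example and are not implicit's actual code). It assumes the cached field is dropped from the pickled state, as one might do for an object that cannot be pickled. Since unpickling does not call __init__, a default set only there is missing afterwards, while a class-level default is still visible.

```python
import pickle


class InitOnlyModel:
    """Hypothetical model whose cache default is set only in __init__."""

    def __init__(self):
        self.factors = [1.0, 2.0]
        self._knn = None  # exists only on instances that went through __init__

    def __getstate__(self):
        # Assumption for this sketch: the cache is dropped before pickling.
        state = self.__dict__.copy()
        state.pop("_knn", None)
        return state


class ClassDefaultModel:
    """Same hypothetical model, with the default moved to the class body."""

    _knn = None  # class attribute: visible on every instance, even unpickled ones

    def __init__(self):
        self.factors = [1.0, 2.0]

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("_knn", None)
        return state

    @property
    def knn(self):
        # Assigning on the instance shadows the class-level default.
        if self._knn is None:
            self._knn = "built-index"  # placeholder for a real index object
        return self._knn


# pickle restores the instance __dict__ without calling __init__.
broken = pickle.loads(pickle.dumps(InitOnlyModel()))
fixed = pickle.loads(pickle.dumps(ClassDefaultModel()))

print(hasattr(broken, "_knn"))  # False
print(fixed._knn)               # None
```

With the class-level default, the accessor can rebuild the cached value lazily after unpickling instead of raising AttributeError.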

LensKit requires pickling for its parallel evaluation support (although parallel evaluation with a GPU model may or may not work; that still needs to be tested).

mdekstrand commented 1 year ago

This can be reproduced with the LensKit Implicit test suite on a CUDA-enabled system: https://github.com/lenskit/lenskit-implicit

See https://github.com/lenskit/lkpy/wiki/DevWorkflow for environment setup info (or just flit install the dependencies).

benfred commented 1 year ago

Thanks for the bug report! I've reproduced it and have a proposed fix in #632.

mdekstrand commented 1 year ago

Great, thanks! I can confirm that works. Unpickling in a subprocess doesn't work yet, but I will open a bug on my side for that.

benfred commented 1 year ago

I have some more fixes in https://github.com/benfred/implicit/pull/636. I was trying to diagnose https://github.com/lenskit/lenskit-implicit/issues/6 and noticed that there were still some issues with the pickle code for GPU models. Unfortunately, this didn't fix the issue you were seeing with subprocesses =(

benfred commented 1 year ago

Fixes are in v0.6.2

benfred / implicit, issue #631