benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

Having issues with saving/loading model #601

Closed bavaria95 closed 2 years ago

bavaria95 commented 2 years ago

Hey!

Trying to run the following code:

model = implicit.als.AlternatingLeastSquares(factors=20, iterations=2)
model.fit(matrix)

with open('test.mdl', 'w') as f:
    model.save(f)

It fails with the following error:

File /usr/lib/python3.8/zipfile.py:1614, in ZipFile._open_to_write(self, zinfo, force_zip64)
   1611 self._writecheck(zinfo)
   1612 self._didModify = True
-> 1614 self.fp.write(zinfo.FileHeader(zip64))
   1616 self._writing = True
   1617 return _ZipWriteFile(self, zinfo, zip64)

TypeError: write() argument must be str, not bytes

Then, I'm able to save it to a byte stream:

obj = io.BytesIO()
model.save(obj)

But then it's not clear how to load it. In #577 the loading functionality was moved into RecommenderBase, so I tried loading through that class, which fails:

obj.seek(0)
model = recommender_base.RecommenderBase.load(obj)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [48], in <cell line: 1>()
----> 1 recommender_base.RecommenderBase.load(obj)

File ~/.local/share/virtualenvs/.spark-kernel-7xI1WBLh/lib/python3.8/site-packages/implicit/recommender_base.py:192, in RecommenderBase.load(cls, fileobj_or_path)
    190     fileobj_or_path = fileobj_or_path + ".npz"
    191 with np.load(fileobj_or_path, allow_pickle=False) as data:
--> 192     ret = cls()
    193     for k, v in data.items():
    194         if k == "dtype":

TypeError: Can't instantiate abstract class RecommenderBase with abstract methods fit, recommend, save, similar_items, similar_users

benfred commented 2 years ago

For saving, we need a file opened in binary mode rather than text mode, so you need to pass the b mode to the open call:

# write out the model to a binary file object
with open('test.mdl', 'wb') as f:
    model.save(f)

# alternatively, can just specify the path
model.save("filename")
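The reason text mode fails can be seen in the traceback above: `zipfile` writes a bytes header to the file object, and a text-mode handle only accepts str. The following is a minimal stdlib-only sketch of the same failure. It uses `zipfile` directly as a stand-in for the model's save, and the helper name `write_archive` and the member name `factors.txt` are made up for illustration:

```python
import io
import os
import tempfile
import zipfile


def write_archive(fileobj):
    """Write a tiny zip archive to an already-open file object (hypothetical helper)."""
    with zipfile.ZipFile(fileobj, mode="w") as zf:
        zf.writestr("factors.txt", "hypothetical payload")


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "test.mdl")

# Text mode: the zip writer hands bytes to a str-only handle and fails.
try:
    with open(path, "w") as f:
        write_archive(f)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

# Binary mode: the same call succeeds.
with open(path, "wb") as f:
    write_archive(f)

# An in-memory binary buffer works too; rewind it before reading back.
buf = io.BytesIO()
write_archive(buf)
buf.seek(0)
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
```

The same rule should apply to any file object handed to a writer that emits bytes: open it with `"wb"`, or use `io.BytesIO` for an in-memory target.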

To load a model, you need to call load on the concrete subclass rather than on RecommenderBase:

model = implicit.cpu.als.AlternatingLeastSquares.load("filename")
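The earlier traceback shows why the base class cannot be used: `load` is a classmethod that runs `ret = cls()`, so the class it is called on is the one that gets instantiated, and an abstract class cannot be instantiated. Below is a minimal sketch of that pattern using only the standard library. `BaseModel`, `ConcreteModel`, and the `state` dict are hypothetical stand-ins, not the library's actual classes or file format:

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Hypothetical stand-in for an abstract recommender base class."""

    @abstractmethod
    def fit(self, data):
        ...

    @classmethod
    def load(cls, state):
        # Mirrors the `ret = cls()` step in the traceback: whichever class
        # the method is called on is the class that gets instantiated.
        ret = cls()
        for key, value in state.items():
            setattr(ret, key, value)
        return ret


class ConcreteModel(BaseModel):
    """Hypothetical subclass that implements every abstract method."""

    def fit(self, data):
        return self


state = {"factors": 20}

# Calling load on the abstract base fails when it tries to instantiate it.
try:
    BaseModel.load(state)
    base_error = None
except TypeError as exc:
    base_error = exc

# Calling load on the concrete subclass works and returns that subclass.
model = ConcreteModel.load(state)
```

In this sketch the object returned by `ConcreteModel.load` is a `ConcreteModel`, which is the behavior the subclass call above relies on.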

kgneng2 commented 1 year ago

@benfred

Hi,

model = AlternatingLeastSquares(
    factors=10,
    regularization=0.1,
    iterations=1,
    # use_gpu=False,
    num_threads=0,
)

# Fit the model to the ratings data
model.fit(csr)

model.save('model.mdl')

The implicit version I am using is:

implicit                  0.5.2           py310h80e0b47_1    conda-forge

But I get the following error:

'AlternatingLeastSquares' object has no attribute 'save'

Why?

I also want to update implicit to the latest version (0.6.2), but this version is not found:

conda install implict=0.6.2 

Reference:

https://benfred.github.io/implicit/api/models/cpu/als.html