hugocool closed this issue 2 years ago
I'm pretty sure the tutorial works, but you will need the latest version of implicit installed.
There were many breaking API changes in v0.5.0, and the tutorial is built against the newer API. Based on the error messages you're reporting, it looks like you have an older version of implicit installed.
Can you verify that you have the latest version of implicit installed? What does this print out?
import implicit
print(implicit.__version__)
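A slightly more defensive variant of the same check, as a minimal sketch: it reads the installed version from package metadata and compares it against 0.5.0, the release named above as introducing the breaking API changes. The helper names `parse_version`, `needs_upgrade` and `installed_implicit_version` are made up for illustration, and the parser assumes simple dotted numeric versions such as `0.4.4`.

```python
import re
from importlib import metadata

MIN_VERSION = "0.5.0"  # release mentioned above as the start of the newer API


def parse_version(text):
    """Turn '0.4.4' into (0, 4, 4), padding short versions with zeros."""
    parts = []
    for part in text.split(".")[:3]:
        match = re.match(r"\d+", part)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def needs_upgrade(installed, minimum=MIN_VERSION):
    """True if the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_implicit_version():
    """Return the installed implicit version, or None if it is not installed."""
    try:
        return metadata.version("implicit")
    except metadata.PackageNotFoundError:
        return None


version = installed_implicit_version()
if version is None:
    print("implicit is not installed")
elif needs_upgrade(version):
    print(f"implicit {version} predates {MIN_VERSION}; the tutorial needs the newer API")
else:
    print(f"implicit {version} should match the tutorial")
```

Reading the version from metadata avoids importing the package, which helps when the import itself is what fails.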
Ahh, okay, I just checked and the version I was on was 0.4.4.
But the install hangs for some reason, so when I run
!pip install implicit --upgrade
in a cloud notebook (whether it is Colab, Kaggle or SageMaker), the install cell just hung for half an hour.
Any ideas why installing the new version doesn't work?
The old version installed just fine.
Oh, that leaves the 2*user_plays though. Why is that necessary?
> !pip install implicit --upgrade in a cloud notebook (whether it is Colab, Kaggle or SageMaker), the install cell just hung for half an hour.
Using pip will compile from source right now, which can take a long time. We're tracking uploading prebuilt binaries to PyPI here: https://github.com/benfred/implicit/issues/539.
One thing you can do to speed up compilation is to build only for the current GPU architecture. There are some tips here: https://github.com/benfred/implicit/issues/537
> That leaves the 2*user_plays though, why is that necessary?
The '2' corresponds to the alpha parameter in the original paper. It gives more weight to positive examples.
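To make the role of alpha concrete, here is a minimal sketch. It assumes the paper in question is Hu, Koren and Volinsky's "Collaborative Filtering for Implicit Feedback Datasets", where the confidence in an observation r is c = 1 + alpha * r. The function name `confidence` and the play counts are hypothetical, and this illustrates the paper's formula only, not how the library applies the scaling internally.

```python
def confidence(plays, alpha=2.0):
    """Confidence for one user/item pair, c = 1 + alpha * r (paper's formula)."""
    return 1.0 + alpha * plays


# Hypothetical play counts for a single user.
user_plays = {"artist_a": 0, "artist_b": 3, "artist_c": 10}

weights = {artist: confidence(count) for artist, count in user_plays.items()}
print(weights)
# {'artist_a': 1.0, 'artist_b': 7.0, 'artist_c': 21.0}
```

Unobserved items keep the baseline confidence of 1, while each observed play raises it by alpha, so a larger alpha makes the model trust positive examples more.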
Thanks for the clarification!
Would it be a good idea to add a note to the documentation (for example the top of the tutorial) to point out one should install the latest version with a specified set of flags? This could save you the trouble of replying to these issues (for which I am obviously grateful, thanks for this package!)
I've added binary wheels to PyPI. You should now be able to install implicit on Colab, Kaggle, etc. in a couple of seconds, with the GPU extension built.
> Would it be a good idea to add a note to the documentation (for example the top of the tutorial) to point out one should install the latest version with a specified set of flags?
The API hopefully won't change again soon - let's wait and see how many times this occurs =).
Glad you're finding this useful!
There are several issues in the tutorial for this package. According to the tutorial code:
The first is
Why 2 * user_plays? If there is a reason the confidence weights should be doubled for this implementation of the algorithm, it should be documented.
Secondly
results in
I don't know exactly what is going on here. It could be that a newer version of numpy/scipy is yielding a different shape of sparse arrays than you expected?
In addition, the recommend similar items code:
results in
Because model.similar_items(252512) yields
so these arrays should be unpacked and rezipped as follows:
ids, scores=zip(*model.similar_items(252512))
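The unpack-and-rezip step can be tried in isolation. This is a minimal sketch that assumes the call returned a list of (id, score) pairs, which is what the `zip(*...)` workaround implies. The helper name `split_pairs` and the sample pairs are made up for illustration; only the id 252512 comes from the report.

```python
def split_pairs(pairs):
    """Transpose [(id, score), ...] into (ids, scores) tuples."""
    if not pairs:
        return (), ()  # zip(*[]) yields nothing to unpack, so handle it here
    ids, scores = zip(*pairs)
    return ids, scores


# Hypothetical result in the list-of-pairs shape.
pairs = [(252512, 1.0), (10431, 0.93), (7780, 0.91)]

ids, scores = split_pairs(pairs)
print(ids)     # (252512, 10431, 7780)
print(scores)  # (1.0, 0.93, 0.91)
```

If the installed version already returns separate id and score arrays, this step is unnecessary.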
Showing the similar artists should then be:
# display the results using pandas for nicer formatting
pd.DataFrame({"artist": artists[list(ids)], "score": scores})
However, this results in out-of-bounds indices:
IndexError: index 322476 is out of bounds for axis 0 with size 292385
because the recommender is recommending items that don't exist. Also, the in-bounds recommendations don't make any sense, nor do they match the expected output.
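A quick diagnostic for this symptom, as a minimal sketch: split the returned ids into those that fit the artists array and those that do not. The helper name `split_by_bounds` is hypothetical. The size 292385 and the id 322476 come from the error above, 252512 is the queried item, and the remaining id is made up. Out-of-range ids may mean the model's item axis is not the artist axis, for example if the matrix was passed in a different orientation than the installed version expects, but that is a guess and is not confirmed by the error alone.

```python
def split_by_bounds(ids, n_items):
    """Separate ids that index into an array of length n_items from those that do not."""
    valid = [i for i in ids if 0 <= i < n_items]
    invalid = [i for i in ids if not 0 <= i < n_items]
    return valid, invalid


n_artists = 292385  # size reported in the IndexError
returned_ids = [252512, 322476, 1045]  # 1045 is a made-up example id

valid, invalid = split_by_bounds(returned_ids, n_artists)
print(valid)    # [252512, 1045]
print(invalid)  # [322476]
```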