benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

Idea for making the preference matrix as a separate input #130

Closed franktma closed 6 years ago

franktma commented 6 years ago

Hi @benfred, all,

Thanks for building this really nice package. It is helping me tremendously on the work I do with clickstream data (clicks, orders).

I have one improvement to the model I'd like to try out: making the preference matrix a separate input.

Here's the rationale: in my data (and I'd imagine in other cases too), some of the low r_ui values actually carry high confidence: we are really sure that the user does not like the item. However, in the current implementation, a low r_ui automatically gets a low confidence. Making the preference matrix an arbitrary input, instead of deriving it from C_ui, would solve the problem.

From the paper, I do not see any problem with the math in making C_ui different from p_ui, so I plan to try it, e.g. by changing the code here (if I understood it correctly): https://github.com/benfred/implicit/blob/master/implicit/als.py#L336
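To make the idea concrete, here is a minimal NumPy sketch of a single user update in weighted ALS where the preference vector p_u and the confidence vector c_u are passed in independently. It is not the code in implicit/als.py; the function name `solve_user_factors`, the toy data, and the regularization value are all assumptions made for illustration. It solves the usual normal equations (Y^T C_u Y + lambda I) x_u = Y^T C_u p_u with dense arrays.

```python
import numpy as np


def solve_user_factors(Y, p_u, c_u, regularization=0.1):
    """One weighted-ALS update for a single user (hypothetical helper).

    Y   : (n_items, n_factors) item factors
    p_u : (n_items,) preferences, supplied directly rather than derived from c_u
    c_u : (n_items,) confidence weights, independent of p_u
    """
    n_factors = Y.shape[1]
    # Y.T * c_u scales column i of Y.T by c_u[i], i.e. it computes Y^T C_u
    YtC = Y.T * c_u
    A = YtC @ Y + regularization * np.eye(n_factors)
    b = YtC @ p_u
    return np.linalg.solve(A, b)


# Toy data: 5 items, 2 latent factors
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2))

p_u = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
# Item 2 is a confident dislike: preference 0, but a large weight
c_u = np.array([5.0, 2.0, 20.0, 1.0, 1.0])

x_u = solve_user_factors(Y, p_u, c_u)
scores = Y @ x_u  # predicted preference for each item
```

Because item 2 has preference 0 and a large weight, the solver is pushed to keep its predicted score close to 0, which is the behaviour a low r_ui alone cannot express when confidence is derived from it.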

Do you think I'm understanding the issue correctly? Does the idea make sense to you?

Frank

franktma commented 6 years ago

I just realized that this was already addressed in https://github.com/benfred/implicit/issues/114. Negative scores are what I was looking for, and I see that they are in the latest repo! Thanks for that update!!
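For anyone landing here with separate preference and confidence arrays, the following sketch shows one way to fold them into a single signed matrix. It assumes that a negative entry is read as "preference 0 with confidence equal to its magnitude", which is my understanding of the change discussed in issue #114; please verify that against the version you have installed. The helper names `encode_signed` and `decode_signed` are hypothetical and are not part of the library.

```python
import numpy as np


def encode_signed(preference, confidence):
    """Fold preference and confidence into one signed array (hypothetical helper).

    Positive entry -> liked, with that confidence.
    Negative entry -> not liked, with confidence equal to the magnitude.
    Zero entry     -> unobserved.
    """
    preference = np.asarray(preference, dtype=float)
    confidence = np.asarray(confidence, dtype=float)
    signed = np.where(preference > 0, confidence, -confidence)
    return signed + 0.0  # turns any -0.0 into a plain 0.0


def decode_signed(signed):
    """Recover (preference, confidence) from the signed encoding."""
    signed = np.asarray(signed, dtype=float)
    return (signed > 0).astype(float), np.abs(signed)


# Rows are users, columns are items
preference = np.array([[1, 0, 0],
                       [0, 1, 0]])
confidence = np.array([[3, 5, 0],   # user 0: confident dislike of item 1
                       [0, 2, 4]])  # user 1: confident dislike of item 2

signed = encode_signed(preference, confidence)
decoded_preference, decoded_confidence = decode_signed(signed)
```

The signed array can then be converted to whatever sparse format the model expects; zero entries drop out as unobserved.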