OryxProject / oryx

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
http://oryx.io
Apache License 2.0
1.79k stars 405 forks source link

question about "fold-in" implementation #360

Closed tianpunchh closed 3 years ago

tianpunchh commented 5 years ago

Hi Sean,

Recently I looked at your blogs about "fold-in" algo for recommender, I am very interested. I started to look at your source code trying to understand how to implement this exactly. In the code, I have a doubt right here:

computeUpdatedXu(Solver solver, double value, float[] Xu, float[] Yi, boolean implicit)

I think Xu is the vector for user u and Yi is the vector for item i, therefore for a new interaction, the updated Xu can be solved in the following equation dXuYi'Yi = dQuiYi

the code calls solver.solveDToD(dQuiYi) for decomposition form like Ax=B to get dXu, but the confusion comes from here: it looks to me the solver is not using Yi'Yi as A, but instead it is instantiated by Y'Y in which Y is the item feature matrix. So it appears to me, you are trying to solve this dXuY'Y = dQuiYi

Could you please comment on this? I am not sure if I miss something important here.

Regards, Tian

srowen commented 5 years ago

See http://yifanhu.net/PUB/cf.pdf and equation (4), which is what this is trying to implement. The solver is indeed solving for a variation on this equation, something like Qu*Y = Xu * (Yt * Y). The gramian term should be of the whole item matrix.

I admit it's been a long time since I looked at this code and it'd take a while to recall how the whole implementation works, but I think that much is right.

tianpunchh commented 5 years ago

I see your point after looking at the original paper and your equation again. I think right now I understand your logic now, I hope I am right. The equation starts with

Qu*Y = Xu * (Yt * Y)

originally Qu should be the user interaction strength to all items, i.e., Qu is a vector of n rows representing n items. Now the assumption is that an interaction for item i has been updated while others not, that's why the "value" argument in function computeUpdatedXu() is a scalar double. What the code calculates is actually the change of Qu

dQu*Y = dXu * (Yt * Y)

The key is dQu has only one non-zero element dQui corresponds to item i updated by "value" input. And dQu*Y = dQui * Yi, so this yields

dQui*Yi = dXu * (Yt * Y) where in the beginning I felt like the left and right hand side seems not consistent in Y or Yi

Regards, Tian