Open mohelm opened 5 years ago
Hi Ben, to make things a bit more concrete, I use:
```python
import numpy as np
from scipy import sparse

def compute_mse(ciu, item_factors, user_factors, regularization):
    '''
    ciu: confidence matrix with zeros (this is Cui - 1 in the paper)
    item_factors: item factors
    user_factors: user factors
    regularization: lambda parameter

    Returns:
    loss: MSE
    objective: the value of the objective function, i.e. loss + regularization * (...)
    '''
    ciu_dense = ciu.toarray()
    p = ciu_dense > 0                                  # binary preference matrix Pui
    scores = item_factors @ user_factors.transpose()   # predicted scores
    user_factor_norm = np.linalg.norm(user_factors, axis=1).sum()
    item_factor_norm = np.linalg.norm(item_factors, axis=1).sum()
    # observed entries weighted by ciu, unobserved entries weighted by 1
    normalizer = ciu_dense.sum() + ciu_dense.shape[0] * ciu_dense.shape[1] - p.sum()
    loss = (1 / normalizer) * np.multiply(ciu_dense + 1, (p - scores) ** 2).sum()
    reg = (1 / normalizer) * regularization * (item_factor_norm + user_factor_norm)
    objective = loss + reg
    return loss, objective
```
and I try to use it on
```python
spcsr = sparse.csr_matrix([[1, 1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 0, 0],
                           [1, 4, 1, 0, 7, 0],
                           [1, 1, 0, 0, 0, 0],
                           [9, 0, 4, 1, 0, 1],
                           [0, 1, 0, 0, 0, 1],
                           [0, 0, 2, 0, 1, 1]],
                          dtype=np.float64)

n_users = spcsr.shape[1]
n_items = spcsr.shape[0]
```
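To make the normalizer concrete, here is a small sketch (reusing the `spcsr` above) showing that it weights each observed entry by its `ciu` value and each unobserved entry by 1:

```python
import numpy as np
from scipy import sparse

spcsr = sparse.csr_matrix([[1, 1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 0, 0],
                           [1, 4, 1, 0, 7, 0],
                           [1, 1, 0, 0, 0, 0],
                           [9, 0, 4, 1, 0, 1],
                           [0, 1, 0, 0, 0, 1],
                           [0, 0, 2, 0, 1, 1]],
                          dtype=np.float64)

dense = spcsr.toarray()
p = dense > 0

# the normalizer from compute_mse above
normalizer = dense.sum() + dense.shape[0] * dense.shape[1] - p.sum()

# equivalent entry-by-entry formulation: ciu where observed, 1 where not
assert normalizer == np.where(p, dense, 1.0).sum()
print(normalizer)  # 63.0
```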
many thanks
Isn't Cui equal to one for non-occurrences? Because we multiply by alpha and then add 1. At least that's how it is in the original publication. Not sure how this library implements it, though.
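For what it's worth, a minimal numpy sketch of that construction (alpha = 40 is the value suggested in the paper; `rui` is a made-up count matrix):

```python
import numpy as np

alpha = 40.0
rui = np.array([[0.0, 2.0, 0.0],
                [1.0, 0.0, 3.0]])  # hypothetical raw counts; zeros are non-occurrences

# confidence as in the original paper: Cui = 1 + alpha * rui
cui = 1.0 + alpha * rui

print(cui[0, 0])  # non-occurrence -> confidence of exactly 1.0
print(cui[1, 0])  # a single occurrence -> 1 + 40 = 41.0
```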
Hi Ben,
first of all, many thanks for the library. It is an amazing piece of work! :-)
I have a quick question about the calculation of the training loss (for a case where every element in Ciu >= 0, i.e. no negative feedback):

I believe the numerator is given by

    (Cui * (Pui - scores)^2).sum().sum() + lambda * (frobenius_norm(item_factors) + frobenius_norm(user_factors))

and then you divide by `(Ciu + 1).sum().sum()`.

Is this correct? I cannot recover the training loss "manually" for a simple case. For instance, are you actually computing the Frobenius norm? I cannot see where you take the square root.
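On the Frobenius-norm point: `np.linalg.norm(X, axis=1).sum()` (as in the `compute_mse` snippet above) sums the per-row L2 norms, which is neither the Frobenius norm `np.linalg.norm(X)` nor the squared Frobenius norm that appears in the paper's regularizer. A small sketch of the three quantities:

```python
import numpy as np

X = np.array([[3.0, 4.0],
              [0.0, 5.0]])

row_norm_sum = np.linalg.norm(X, axis=1).sum()   # 5 + 5 = 10
frobenius = np.linalg.norm(X)                    # sqrt(9 + 16 + 25) = sqrt(50)
frobenius_sq = (X ** 2).sum()                    # 50, the paper's penalty term

print(row_norm_sum, frobenius, frobenius_sq)
```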