benfred / implicit

Fast Python Collaborative Filtering for Implicit Feedback Datasets
https://benfred.github.io/implicit/
MIT License

Formula used for training loss #190

Open mohelm opened 5 years ago

mohelm commented 5 years ago

Hi Ben,

first many thanks for the library. It is an amazing piece of work! :-)

I have a quick question on the calculation of the training loss (for a case where every element in Ciu>=0, i.e. no negative feedback):

I believe the numerator is given by

(Ciu * (Piu - scores)^2).sum().sum() + lambda * (frobenius-norm(item_factors) + frobenius-norm(user_factors))

and then you divide by (Ciu + 1).sum().sum().

Is this right? I cannot recover the training loss "manually" for a simple case. For instance, are you actually computing the Frobenius norm? I cannot see where you take the root.
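For reference, the objective in the original ALS paper (Hu, Koren & Volinsky, "Collaborative Filtering for Implicit Feedback Datasets") regularizes with squared l2 norms, i.e. no square root is taken:

```latex
\min_{x_\ast, y_\ast} \sum_{u,i} c_{ui} \left( p_{ui} - x_u^{\top} y_i \right)^2
  + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right)
```

If the library follows this form, summing squared factor entries (rather than taking a Frobenius norm with a root) would be consistent with the paper.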

mohelm commented 5 years ago

Hi Ben, to make things a bit more concrete, I use:


import numpy as np
from scipy import sparse


def compute_mse(ciu, item_factors, user_factors, regularization):
    '''
    ciu: confidence matrix with zeros (this is Cui - 1 in the paper)
    item_factors: item factors
    user_factors: user factors
    regularization: lambda parameter

    Returns:
    loss: MSE
    objective: the value of the objective function, i.e. loss + regularization * (...)
    '''
    ciu_dense = ciu.toarray()
    p = (ciu_dense > 0).astype(np.float64)  # binary preference matrix Piu
    scores = item_factors @ user_factors.T

    # per-row l2 norms, summed (note: not squared, hence the Frobenius question)
    user_factor_norm = np.linalg.norm(user_factors, axis=1).sum()
    item_factor_norm = np.linalg.norm(item_factors, axis=1).sum()

    normalizer = ciu_dense.sum() + ciu_dense.size - p.sum()

    loss = np.multiply(ciu_dense + 1, (p - scores) ** 2).sum() / normalizer
    reg = regularization * (item_factor_norm + user_factor_norm) / normalizer

    objective = loss + reg

    return loss, objective

and I try to use it on

spcsr = sparse.csr_matrix([[1, 1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 0, 0],
                           [1, 4, 1, 0, 7, 0],
                           [1, 1, 0, 0, 0, 0],
                           [9, 0, 4, 1, 0, 1],
                           [0, 1, 0, 0, 0, 1],
                           [0, 0, 2, 0, 1, 1]],
                          dtype=np.float64)

n_users = spcsr.shape[1]
n_items = spcsr.shape[0]
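To make one possible reading of the question concrete, here is a self-contained sketch that computes the loss with (Ciu + 1).sum() as the normalizer and squared Frobenius norms for the regularizer (the form used in the Hu et al. paper). The factor matrices are random placeholders, not output of the library, so the numeric loss value is illustrative only.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# same example matrix as above: rows are items, columns are users
spcsr = sparse.csr_matrix([[1, 1, 0, 1, 0, 0],
                           [0, 1, 1, 1, 0, 0],
                           [1, 4, 1, 0, 7, 0],
                           [1, 1, 0, 0, 0, 0],
                           [9, 0, 4, 1, 0, 1],
                           [0, 1, 0, 0, 0, 1],
                           [0, 0, 2, 0, 1, 1]],
                          dtype=np.float64)

n_items, n_users = spcsr.shape
n_factors = 3
item_factors = rng.standard_normal((n_items, n_factors)) * 0.1
user_factors = rng.standard_normal((n_users, n_factors)) * 0.1
regularization = 0.01

ciu = spcsr.toarray()              # Ciu (the stored matrix is Cui - 1)
p = (ciu > 0).astype(np.float64)   # binary preference indicator Piu
scores = item_factors @ user_factors.T

# squared Frobenius norms, i.e. sum of squared entries (no square root)
reg = regularization * ((item_factors ** 2).sum() + (user_factors ** 2).sum())

normalizer = (ciu + 1).sum()
loss = (((ciu + 1) * (p - scores) ** 2).sum() + reg) / normalizer
```

Whether this matches the library's reported training loss exactly is precisely the open question; it only pins down one candidate formula to compare against.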

many thanks

Jesse-Kerr commented 5 years ago

Isn't the ciu equal to one for non-occurrences? Because we multiply by alpha and then add 1. At least that's how it is in the original publication. Not sure how this library implements it, though.
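The confidence definition from the paper, c_ui = 1 + alpha * r_ui, can be checked in one line; unobserved entries (r_ui = 0) get confidence exactly 1:

```python
import numpy as np

alpha = 40.0                     # alpha = 40 is the value suggested in the paper
r = np.array([0.0, 1.0, 4.0])    # raw interaction counts; first entry unobserved
c = 1.0 + alpha * r              # confidence: unobserved entries get exactly 1
```

This is why storing Cui - 1 as a sparse matrix (as the question above assumes) is convenient: the non-occurrences stay at zero instead of one.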