How to exclude some ratings from implicit feedback, and at the same time, to leave it for user latent representation?

yustiks commented 3 years ago

I would like to change the minimisation function for implicit feedback. Specifically, I would like to exclude ratings equal to 4 from the implicit feedback while still keeping them in the training data (I just do not want them treated as implicit information).

How to implement it?

Praful932 commented 3 years ago

Interesting. So you mean that in the forward pass you don't want ratings with a value of 4 to count as implicit feedback, but you would still like to have them in the training data? Since you don't want them considered in the implicit feedback, you can control that in the implicit_feedback function. Is that what you mean?

yustiks commented 3 years ago

@Praful932 yes, that's right. I tried to pass different data to the implicit_feedback and fit methods, but I got an error, which probably means I cannot do that. In my case, a rating of 4 is considered non-implicit information, so we don't want to add the corresponding latent representations to the minimization function.
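
For reference, the request can be written against the standard SVD++ formulation. That Tf-Rec's SVDpp matches it term by term is an assumption here, not something stated in this thread. Note that $y_j$ below is the implicit factor vector of item $j$, not the ratings array `y` used in the code.

```latex
% Standard SVD++ prediction: N(u) is the set of items used as implicit feedback for user u
\hat{r}_{ui} = \mu + b_u + b_i
  + q_i^{\top} \Big( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \Big)

% Squared-error objective over ALL observed training ratings K (regularisation omitted)
\min \sum_{(u,i) \in K} \big( r_{ui} - \hat{r}_{ui} \big)^2

% Requested change: K is left untouched, only N(u) is restricted
N(u) = \{\, j : (u,j) \in K,\ r_{uj} \neq 4 \,\}
```

Under this reading, pairs rated 4 still contribute a squared-error term, but the items involved no longer add their $y_j$ to the user representation. The factor $|N(u)|^{-1/2}$ is undefined for a user whose $N(u)$ becomes empty.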

Praful932 commented 3 years ago

Instead of creating different data, why don't you try modifying the function so that it adds only the ratings that are not 4 to self.user_rated_items? I believe that would work.

yustiks commented 3 years ago

@Praful932 I modified the code, and now I get NaN values:

991/4619 [=====>........................] - ETA: 4:14 - loss: nan - mse: nan

yustiks commented 3 years ago

@Praful932 and this is what I added to the function (y is the list of ratings for the training data):

def implicit_feedback(self, X, y):
        """Maps a user to rated items for implicit feedback.

        Needs to be called before fitting the Model.

        Parameters
        ----------
        X : numpy.ndarray
            User Item table.
        y : numpy.ndarray
            Ratings aligned with the rows of X.

        Raises
        ------
        AttributeError
            If this is not called before calling `fit()`.
        """

        self.user_rated_items = [[] for _ in range(self.n_users)]
        for u, i, rating in zip(X[:, 0], X[:, 1], y):
            if rating != 4:
                self.user_rated_items[u].append(i)

Praful932 commented 3 years ago

Hi @yustiks, I believe that has to do with the dataset you are using. Here's an example with the MovieLens dataset:

import numpy as np
import tensorflow as tf

from tfrec.datasets import fetch_ml_100k
from tfrec.models import SVDpp
from tfrec.utils import preprocess_and_split

data = fetch_ml_100k()
dataset, user_item_encodings = preprocess_and_split(data)

(x_train, y_train), (x_test, y_test) = dataset
(user_to_encoded, encoded_to_user, item_to_encoded, encoded_to_item) = user_item_encodings

num_users = len(np.unique(data['userId']))
num_movies = len(np.unique(data['movieId']))
global_mean = np.mean(data['rating'])

class CustomSVDpp(SVDpp):
    def implicit_feedback(self, X, y):
        self.user_rated_items = [[] for _ in range(self.n_users)]
        for u, i, rating in zip(X[:, 0], X[:, 1], y):
            if rating != 4:
                self.user_rated_items[u].append(i)

        # Converts to ragged tensor to be used during forward pass
        self.user_rated_items = tf.ragged.constant(
            self.user_rated_items, dtype=tf.int32)

model = CustomSVDpp(num_users, num_movies, global_mean)
# Needs to be called before fitting
model.implicit_feedback(x_train, y_train.flatten())
model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(x_train, y_train)

2521/2521 [==============================] - 34s 13ms/step - loss: 1.2576
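
A quick way to test whether the data is responsible is to look for users whose implicit list ends up empty once the 4s are dropped. In the usual SVD++ formulation the implicit term is scaled by the inverse square root of the number of rated items, so an empty list can lead to a division by zero. Whether that is what produces the NaN in Tf-Rec is an assumption and is not confirmed in this thread. The helper names and the toy data below are made up for illustration; only NumPy is needed.

```python
import numpy as np


def build_user_rated_items(X, y, n_users, excluded_rating=4):
    """Map each user to the items they rated, skipping one rating value.

    Hypothetical helper mirroring the loop in the custom implicit_feedback.
    """
    user_rated_items = [[] for _ in range(n_users)]
    for u, i, rating in zip(X[:, 0], X[:, 1], y):
        if rating != excluded_rating:
            user_rated_items[int(u)].append(int(i))
    return user_rated_items


def users_without_items(user_rated_items):
    """Return the indices of users whose implicit item list is empty."""
    return [u for u, items in enumerate(user_rated_items) if not items]


# Toy data (hypothetical): columns are (encoded user, encoded item).
X = np.array([[0, 0], [0, 1], [1, 0], [2, 2]])
y = np.array([5, 4, 4, 3])

rated = build_user_rated_items(X, y, n_users=3)
empty_users = users_without_items(rated)
print(rated)        # [[0], [], [2]]
print(empty_users)  # [1]
```

Running the same check on the real training split, with the real number of users, shows whether any user rated only 4s. If such users exist, they are the first place to look when the loss turns into NaN.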

Praful932 commented 3 years ago

@yustiks Does this solve your issue?

Praful932 commented 2 years ago

Closing this

Praful932 / Tf-Rec

How to exclude some ratings from implicit feedback, and at the same time, to leave it for user latent representation? #1