tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

Usage of FactorizedTopK metric #266

Open HansWurst90 opened 3 years ago

HansWurst90 commented 3 years ago

Hello everyone,

I looked at all the quickstart tutorials and adapted the basic_retrieval example to my dataset. views_df contains pairs of user_ids and content_ids; each row represents one user viewing one content item.

Dataset and Result

The dataset is fairly small (1026 views from 63 users on 187 contents) but the code seems to work and my results are as follows:

Train:

factorized_top_k/top_1_categorical_accuracy: 0.0012
factorized_top_k/top_5_categorical_accuracy: 0.0816
factorized_top_k/top_10_categorical_accuracy: 0.2046
factorized_top_k/top_50_categorical_accuracy: 0.7430
factorized_top_k/top_100_categorical_accuracy: 0.8965
loss: 494.7287

Test:

factorized_top_k/top_1_categorical_accuracy: 0.0
factorized_top_k/top_5_categorical_accuracy: 0.0243
factorized_top_k/top_10_categorical_accuracy: 0.0585
factorized_top_k/top_50_categorical_accuracy: 0.3804
factorized_top_k/top_100_categorical_accuracy: 0.6146
loss: 31.29269790649414

Question

I am unsure whether I created the query embeddings and candidate embeddings correctly from my dataset for the FactorizedTopK metric. I am also having trouble understanding how the FactorizedTopK metric is computed in general. I looked at the source code, but I don't understand its explanation of how the metric is calculated:

The main argument are pairs of query and candidate embeddings: the first row of query_embeddings denotes a query for which the candidate from the first row of candidate embeddings was selected by the user. The task will try to maximize the affinity of these query, candidate pairs while minimizing the affinity between the query and candidates belonging to other queries in the batch.
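
If I understand this correctly, for a batch of three of my views the inputs would be paired like this (a sketch with made-up ids):

# Three view events (made-up ids): user 7 viewed content 12,
# user 3 viewed content 45, user 9 viewed content 12.
batch = {"user_id": tf.constant([7, 3, 9]),
         "content_id": tf.constant([12, 45, 12])}

query_embeddings = user_model(batch["user_id"])            # shape (3, 32)
candidate_embeddings = content_model(batch["content_id"])  # shape (3, 32)
# Row i of query_embeddings is paired with row i of candidate_embeddings.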

Where does it take the ground truth from? Aren't the query and candidate embeddings just lists of all users and contents? Does the order of the lists matter? Can someone explain the computation of the FactorizedTopK metric in simpler terms?

Thanks in advance

Code

import os
import pprint
import tempfile
from typing import Dict, Text
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
import pandas as pd
import matplotlib.pyplot as plt

# Variables
seed = 42
test_percentage = 20
train_percentage = 100-test_percentage
embedding_dimension = 32 # 64 ?
metrics_batchsize = 16
train_batchsize = 128
test_batchsize = 64
learning_rate = 0.1 # 0.5 ?
epochs = 3
index_batchsize = 100

views_df = pd.read_csv('filepath')

views_df = views_df[['user_id','content_id']]
# Unique user and content ids, used as lookup vocabularies below.
users_df = views_df['user_id'].unique()
contents_df = views_df['content_id'].unique()

views_ds = tf.data.Dataset.from_tensor_slices(dict(views_df))
contents_ds = tf.data.Dataset.from_tensor_slices(contents_df)

view_size = len(views_df)
train_size = round(view_size/100*train_percentage)
test_size = view_size-train_size

tf.random.set_seed(seed)

views_ds_shuffled = views_ds.shuffle(len(views_df), seed=seed, reshuffle_each_iteration=False)

train = views_ds_shuffled.take(train_size)
test = views_ds_shuffled.skip(train_size).take(test_size)

user_model = tf.keras.Sequential([
  tf.keras.layers.experimental.preprocessing.IntegerLookup(vocabulary=users_df, mask_value=None),
  tf.keras.layers.Embedding(input_dim=len(users_df) + 1, output_dim=embedding_dimension)
])

content_model = tf.keras.Sequential([
  tf.keras.layers.experimental.preprocessing.IntegerLookup(vocabulary=contents_df, mask_value=None),
  tf.keras.layers.Embedding(input_dim=len(contents_df) + 1, output_dim=embedding_dimension)
])

# Pre-compute the embeddings of all candidates so the metric can rank
# every content item for each query.
candidates = contents_ds.batch(metrics_batchsize).map(content_model)

metrics = tfrs.metrics.FactorizedTopK(
  candidates=candidates
)

task = tfrs.tasks.Retrieval(
  metrics=metrics
)

class ContentModel(tfrs.Model):

  def __init__(self, user_model, content_model):
    super().__init__()
    self.content_model: tf.keras.Model = content_model
    self.user_model: tf.keras.Model = user_model
    self.task: tf.keras.layers.Layer = task

  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
    content_embeddings = self.content_model(features["content_id"])
    user_embeddings = self.user_model(features["user_id"])

    # The Retrieval task computes the loss and the FactorizedTopK metrics.
    return self.task(user_embeddings, content_embeddings)

model = ContentModel(user_model, content_model)
model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=learning_rate))

cached_train = train.shuffle(view_size).batch(train_batchsize).cache()
cached_test = test.batch(test_batchsize).cache()

model.fit(cached_train, epochs=epochs)

model.evaluate(cached_test, return_dict=True)

# Build a brute-force retrieval index over the embeddings of all contents.
index = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
index.index(contents_ds.batch(index_batchsize).map(model.content_model), contents_ds)

user_id = 2
_, contents = index(tf.constant([user_id]))
print(f"Recommendations for user {user_id}: {contents}")
maciejkula commented 3 years ago

Are you asking about the retrieval loss, or the retrieval metric?

Assuming you are asking about the loss, the general idea of retrieval models is that user clicks (etc.) are the ground truth. If a user clicked on an item, the model should predict a high score for that item; anything the user did not click on should receive a lower score.
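
Concretely (a simplified sketch of the idea, not the library's actual code): for a batch of (user, item) pairs, the matrix of scores between the user and item embeddings has the true pairs on its diagonal, so the label for row i is simply i, and every other item in the batch acts as a negative.

import tensorflow as tf

# Toy batch of 3 (user, item) pairs with 2-dim embeddings.
query_embeddings = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
candidate_embeddings = tf.constant([[0.9, 0.1], [0.1, 1.0], [0.8, 0.9]])

# scores[i, j] = affinity of query i with candidate j.
scores = tf.matmul(query_embeddings, candidate_embeddings, transpose_b=True)

# The candidate in row i is the one user i actually interacted with,
# so the labels are just the diagonal positions 0, 1, 2.
labels = tf.range(3)
loss = tf.keras.losses.sparse_categorical_crossentropy(
    labels, scores, from_logits=True)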

HansWurst90 commented 3 years ago

My question is about how the factorized_top_k accuracy metric is calculated internally. I'm having a hard time understanding it from the source code. Can you explain it in simpler terms than the comments there?

krxat commented 3 years ago

@HansWurst90 https://github.com/tensorflow/recommenders/issues/279#issue-863812545
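
In short, the computation amounts to the following (a simplified NumPy sketch, not the actual tfrs implementation): for every query in the batch, score all candidates handed to the metric, then check whether the true candidate's score would place it in the top k.

import numpy as np

def top_k_accuracy(query_embeddings, true_candidate_embeddings,
                   all_candidate_embeddings, k):
    # Scores of every query against the whole candidate corpus.
    all_scores = query_embeddings @ all_candidate_embeddings.T      # (batch, n_candidates)
    # Score of the candidate the user actually viewed (row-aligned pairs).
    true_scores = np.sum(query_embeddings * true_candidate_embeddings,
                         axis=1, keepdims=True)                     # (batch, 1)
    # The true candidate ranks in the top k if at most k corpus candidates
    # (including itself) score at least as high.
    ranks = np.sum(all_scores >= true_scores, axis=1)
    return np.mean(ranks <= k)

So the ground truth is implicit in the row alignment: row i of the query embeddings and row i of the candidate embeddings must come from the same (user, content) view, which is why the order of the two lists matters.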