tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

[Question] Troubleshooting a Sequential Ranking Model to Predict Probability of Purchase #620

Open datasciyj opened 1 year ago

datasciyj commented 1 year ago

Hi @patrickorlando,

I asked another question yesterday in https://github.com/tensorflow/recommenders/issues/618. As I mentioned in that issue, I'm trying to create a sequential ranking model with retail data. Unlike the ranking model tutorial, I want my ranking model to predict the probability of purchase for each product, since I don't have rating information. So my data looks like

{'purchase history': [[b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'990365809', b'631'], [b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'203', b'11', b'245'], ... ], 'latest purchase': [b'800', b'23', ...], 'label': [1, 1, ...]}

I used 'purchase history' as the query data, 'latest purchase' as the candidate data, and 'label' as the labels when calculating the loss. I put 1s in 'label' for all rows because I thought the probability of purchase is 1 for every latest purchase, but it seems my assumption is incorrect. When I trained the ranking model on this data, the loss decreased over the epochs, but on test data the accuracy was 1 and the AUC was 0. I used binary cross-entropy for the loss and a sigmoid activation in the last dense layer. What should I change so that the model predicts the probability of purchase for each product? Any comments would be appreciated!

The following is the model I used.

import tensorflow as tf
import tensorflow_recommenders as tfrs
from typing import Dict, Text

# `unique_product_ids` and `train_dataset` come from my data pipeline described above.
embedding_dimension = 32

# Encodes the purchase-history sequence into a single query embedding.
query_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=unique_product_ids, mask_token=None),
    tf.keras.layers.Embedding(len(unique_product_ids) + 1, embedding_dimension),
    tf.keras.layers.GRU(embedding_dimension)
])

# Embeds the candidate ("latest purchase") product id.
candidate_model = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=unique_product_ids, mask_token=None),
    tf.keras.layers.Embedding(len(unique_product_ids) + 1, embedding_dimension)
])

class RankingModel(tf.keras.Model):

  def __init__(self):
    super().__init__()
    self._query_model = query_model
    self._candidate_model = candidate_model

    # Compute predictions from the concatenated query and candidate embeddings.
    self.prob = tf.keras.Sequential([
        # Learn multiple dense layers.
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        # Make probability predictions in the final layer.
        tf.keras.layers.Dense(1, activation="sigmoid")
    ])

  def call(self, inputs):
    purchase_history, candidates_retrieval = inputs

    query_embedding = self._query_model(purchase_history)
    candidate_embedding = self._candidate_model(candidates_retrieval)

    return self.prob(tf.concat([query_embedding, candidate_embedding], axis=1))

class NextitemModel(tfrs.models.Model):

  def __init__(self):
    super().__init__()
    self.ranking_model: tf.keras.Model = RankingModel()
    self.task: tf.keras.layers.Layer = tfrs.tasks.Ranking(
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=[
            tf.keras.metrics.AUC(name="auc"),
            tf.keras.metrics.BinaryAccuracy(name="accuracy"),
        ]
    )

  def call(self, features: Dict[str, tf.Tensor]) -> tf.Tensor:
    return self.ranking_model(
        (features["purchase history"], features["latest purchase"]))

  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
    labels = features.pop("label")
    prob_predictions = self(features)

    # The task computes the loss and the metrics.
    return self.task(labels=labels, predictions=prob_predictions)

rankmodel = NextitemModel()
rankmodel.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.1))

rankmodel.fit(train_dataset, epochs=3, verbose=2)
patrickorlando commented 1 year ago

Hey @datasciyj,

Your ranking model dataset only contains positive examples. A classifier that predicts 1 for any input would achieve a perfect score.

You need to have negative examples in your dataset as well. In the retrieval stage these are sampled from the other items in the batch, but this is not what happens in the ranking stage.

You should create a dataset containing explicit negatives: items the user clicked or added to cart but didn't purchase.
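For illustration, a minimal sketch of what such a dataset could look like, assuming the feature names used by the model above (the ids and the truncated histories are made up):

import tensorflow as tf

# Each purchased item is a positive (label 1); an item the user interacted
# with but did not buy becomes an explicit negative (label 0).
train_dataset = tf.data.Dataset.from_tensor_slices({
    "purchase history": [
        [b"0", b"990365809", b"631"],  # truncated history, for illustration
        [b"0", b"990365809", b"631"],  # same history, different candidate
    ],
    "latest purchase": [b"800", b"23"],  # b"800" bought; b"23" not bought
    "label": [1.0, 0.0],
}).batch(2)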

datasciyj commented 1 year ago

Thank you @patrickorlando for your quick response!

I have almost a million unique products and don't have any information about clicks or add-to-carts. In this case, I might have to add negative examples for each user, something like


| user id | purchase history | latest purchase | label |
| -- | -- | -- | -- |
| 1 | [b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'990365809', b'631'] | b'800' | 1 |
|   | [b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'0', b'990365809', b'631'] | b'1' | 0 |
|   |   | all the remaining products |   |

I think adding negative examples like this might not be practical in my case, because each user could end up with nearly a million rows, right? I'd like to get the top 5 products with the highest predicted probability of purchase for each customer. In my case, can I achieve this goal with only a retrieval model and skip the ranking model?

patrickorlando commented 1 year ago

It's perfectly fine to use only a retrieval model; it also simplifies getting your model into production and reduces response latency.
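For the top-5 use case, here's a sketch of serving from a trained two-tower model, assuming query_model and candidate_model were trained with a tfrs.tasks.Retrieval task and products is a tf.data.Dataset of product id strings:

import tensorflow as tf
import tensorflow_recommenders as tfrs

# Build a brute-force index over all candidate embeddings; k=5 returns the top 5.
index = tfrs.layers.factorized_top_k.BruteForce(query_model, k=5)
index.index_from_dataset(
    products.batch(128).map(lambda ids: (ids, candidate_model(ids))))

# Top-5 product ids for one user's purchase-history sequence.
scores, top_product_ids = index(tf.constant([[b"0", b"990365809", b"631"]]))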

There are advantages to ranking models, particularly if you have rich features about your items. You could try randomly sampling k negatives per positive example (one possible implementation is sketched below), but whether it results in better performance than the retrieval model alone is something you can only discover empirically.
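A minimal sketch with NumPy; K is an assumption to tune, and uniform sampling can occasionally draw an item the user actually bought, which you may want to filter out:

import numpy as np

K = 4  # negatives per positive: an assumption, tune empirically

def with_random_negatives(history, positive_id, all_product_ids, k=K):
    # For one positive example, return (k + 1) rows: the actual purchase
    # labelled 1, plus k randomly sampled products labelled 0.
    negatives = np.random.choice(all_product_ids, size=k, replace=False)
    histories = [history] * (k + 1)
    candidates = [positive_id] + list(negatives)
    labels = [1.0] + [0.0] * k
    return histories, candidates, labels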

You also might decide to start collecting click events to build a ranking model in the future.

Ullar-Kask commented 1 year ago

Hi @patrickorlando,

> You could try randomly sampling k negatives per positive example

What is a reasonable value of k?

Thanks!

patrickorlando commented 1 year ago

@Ullar-Kask, I don't know of any rule of thumb here. I think it's something you'd need to experiment with. It probably depends on the number of items in your catalogue and their frequency distribution.