maciejkula / spotlight

Deep recommender models using PyTorch.
MIT License
2.99k stars, 427 forks

item-item similarity #108

Open mszep opened 6 years ago

mszep commented 6 years ago

I'm trying to implement both recommendations and item-item similarities based on an ImplicitFactorizationModel trained on clickstream data. The former is trivial using the predict method, but the latter I'm not quite sure about. Looking at the code, I could obtain the item embeddings in a similar way to how it's done in _components.py and representations.py, and then apply a similarity measure to the resulting torch tensors.

Is this a sensible approach / Is there a better way to do this / Is it even possible?
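
A minimal sketch of the approach described above, assuming the item embeddings have already been pulled out of the trained model as a 2-D array. The function name `top_k_similar_items` is hypothetical, and the attribute path in the comment is an assumption about Spotlight's internals rather than a documented API. Random vectors stand in for real weights so the block runs on its own.

```python
import numpy as np


def top_k_similar_items(item_embeddings, item_id, k=5):
    """Return the k items most cosine-similar to item_id, excluding itself."""
    # Normalise rows to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    unit = item_embeddings / np.clip(norms, 1e-12, None)

    scores = unit @ unit[item_id]
    scores[item_id] = -np.inf  # never return the query item itself

    top = np.argsort(-scores)[:k]
    return top, scores[top]


# Stand-in for trained weights. With a fitted factorization model the matrix
# might come from model._net.item_embeddings.weight.detach().cpu().numpy()
# (assumed attribute path, not confirmed by this thread).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 32))
# Make item 7 a near-copy of item 3 so there is a known nearest neighbour.
embeddings[7] = embeddings[3] + 0.01 * rng.normal(size=32)

neighbours, similarities = top_k_similar_items(embeddings, item_id=3, k=5)
print(neighbours, similarities)
```

Cosine similarity is only one possible choice here; a plain dot product would also take embedding norms into account, which may or may not be desirable.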

maciejkula commented 6 years ago

Yes, this sounds fine! May I ask why you use the factorization model, when the sequence model would probably give much better results and be faster to train?

mszep commented 6 years ago

To be honest, it's mostly limited knowledge on my part -- I'm more familiar with factorization models from previous work and thought I'd start there. We have had challenges with training on our full set, but have managed so far. 2-3 hours of training seems to yield a model with sensible results.

Would you say that sequence models are a better way to go in general, and that the factorization model should only be used if the order of the implicit feedback samples is not given?

maciejkula commented 6 years ago

I strongly recommend using a GPU for sequence models if you aren't already. The PyTorch CPU implementation is quite slow.

I think that sequence models are definitely a better way to go. They are more flexible and allow you to easily incorporate new interactions as they come in, without retraining your model.

mjwestcott commented 6 years ago

Hi @maciejkula, thanks for this great project and for LightFM too. I have a couple of questions connected to this discussion.

First, as I understand it one of the benefits of matrix factorisation is that it can also be applied to the implicit interaction matrix's transpose, i.e. to the problem of recommending users to items (rather than items to users). For example, we might want to find the top N users who would be interested in an item who haven't interacted with it before. Is that correct and are you aware of a way to use sequence models to solve this problem?

Second, you mention sequence models allow you to incorporate new interactions as they come in. Just to clarify, do you mean simply that the model can make predictions for new sequences without needing to be retrained?

maciejkula commented 6 years ago

Hi @mjwestcott

  1. I don't see why this would be impossible: the problems are symmetric. Depending on your needs, you could try a sequence model that predicts the next user given an item's history.
  2. Yes, that's right. This makes it very easy to incorporate new users or new interactions for existing users.
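
For the factorization case raised in the question, finding the top N users for an item could look roughly like the sketch below. It assumes user and item embeddings are available as plain arrays, leaves out bias terms for brevity, and uses the hypothetical name `top_n_users_for_item`. It is not a Spotlight API.

```python
import numpy as np


def top_n_users_for_item(user_emb, item_emb, item_id, seen_users, n=3):
    """Rank users by dot-product score for one item, skipping known interactions."""
    scores = user_emb @ item_emb[item_id]  # one score per user
    seen = np.array(sorted(seen_users), dtype=np.int64)
    scores[seen] = -np.inf  # mask users who already interacted with the item
    return np.argsort(-scores)[:n]


# Stand-in embeddings; real ones would come from a trained model.
rng = np.random.default_rng(1)
user_emb = rng.normal(size=(50, 16))
item_emb = rng.normal(size=(20, 16))

already_interacted = {0, 4, 9}
candidates = top_n_users_for_item(user_emb, item_emb, item_id=5,
                                  seen_users=already_interacted, n=3)
print(candidates)
```

The roles of users and items are simply swapped relative to the usual item ranking, which is the symmetry mentioned in the reply.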

denyslazarenko commented 6 years ago

Hi @maciejkula,

I am using CNNNet and my aim is to obtain the item embeddings. After that, I would like to measure the similarity between items.
I am a little confused. I have the following model:

CNNNet(
  (item_embeddings): ScaledEmbedding(58129, 32, padding_idx=0)
  (item_biases): ZeroEmbedding(58129, 1, padding_idx=0)
  (cnn_0): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1))
  (cnn_1): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(2, 1))
  (cnn_2): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(4, 1))
  (cnn_3): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(8, 1))
  (cnn_4): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(16, 1))
  (cnn_5): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(32, 1))
  (cnn_6): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(64, 1))
  (cnn_7): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(128, 1))
  (cnn_8): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1))
  (cnn_9): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(2, 1))
)

Now I would like to run several iterations:

for _ in range(10):
    net.forward(train)

forward requires the following: "Compute predictions for target items given user representations". Therefore I would like to call the user_representation method, which requires item_sequences. I loaded a dataset <Sequence interactions dataset (209 sequences x 200 sequence length)> and would like to pass this object to user_representation, but I got an error: AttributeError: 'SequenceInteractions' object has no attribute 'contiguous'.
What am I doing wrong? Maybe you have a code example of what I would like to implement, because it seems to be a common need.

maciejkula commented 6 years ago

The forward function accepts tensors rather than the Interactions objects. Have a look at how the Spotlight models do it.
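
A small sketch of what "tensors rather than Interactions objects" means in practice. The integer array below stands in for the sequence matrix held by the dataset object (assumed to be exposed as `train.sequences`), and the `nn.Embedding` layer is a hypothetical stand-in for the network's item embedding layer. With Spotlight itself the tensor would presumably be passed to `net.user_representation` instead.

```python
import numpy as np
import torch
import torch.nn as nn

# Stand-in for the dataset's sequence matrix: shape
# (num_sequences, sequence_length), with 0 used as the padding index.
sequences = np.array([[0, 0, 3, 5, 2],
                      [0, 1, 4, 4, 6]], dtype=np.int64)

# Convert the NumPy array to a LongTensor; this is what a network expects,
# and it is the object that provides methods such as .contiguous().
sequence_tensor = torch.from_numpy(sequences)

# Hypothetical stand-in for the item embedding layer of a sequence network.
item_embeddings = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=0)

with torch.no_grad():
    embedded = item_embeddings(sequence_tensor)

print(embedded.shape)  # torch.Size([2, 5, 4])
```

Note that calling forward in a loop does not train anything by itself, since no loss or optimizer is involved. For item similarity, the learned vectors can be read from the `item_embeddings` layer shown in the printed model once the model has been fitted.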

denyslazarenko commented 6 years ago

I trained the model proposed in the examples: CNNNet + ImplicitSequenceModel. Then I extracted the embeddings from CNNNet and tried to measure similarity using the cosine metric. However, the results seem to be total nonsense. Has anybody tried to do the same?

maciejkula commented 6 years ago

Is the actual model accurate? If the model is broken, the embeddings are probably meaningless.
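
One rough way to judge whether a model is "accurate" is to compare its mean reciprocal rank with what random ranking would give. The sketch below computes that chance level under the assumption that the target's rank is uniform over all items. The helper name `random_baseline_mrr` is hypothetical, and how closely this matches a particular evaluation routine is an assumption.

```python
def random_baseline_mrr(num_items):
    """Expected reciprocal rank when the target's rank is uniform over 1..num_items."""
    harmonic = sum(1.0 / rank for rank in range(1, num_items + 1))
    return harmonic / num_items


# 58129 is the size of the item embedding table printed earlier in the thread.
baseline = random_baseline_mrr(58129)
print(f"chance-level MRR: {baseline:.6f}")
```

A trained model whose mean test MRR is not clearly above this value is unlikely to have learned useful embeddings.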

denyslazarenko commented 6 years ago

@maciejkula

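import torch

from spotlight.evaluation import sequence_mrr_score
from spotlight.sequence.implicit import ImplicitSequenceModel
from spotlight.sequence.representations import CNNNet

# Assumption: the original snippet does not show how CUDA is defined.
CUDA = torch.cuda.is_available()


def evaluate_cnn_model(hyperparameters, train, test, validation, random_state):

    h = hyperparameters

    # Build the CNN sequence representation.
    net = CNNNet(train.num_items,
                 embedding_dim=h['embedding_dim'],
                 kernel_width=h['kernel_width'],
                 dilation=h['dilation'],
                 num_layers=h['num_layers'],
                 nonlinearity=h['nonlinearity'],
                 residual_connections=h['residual'])

    # Embedding weights before training (randomly initialised).
    print(net.item_embeddings.weight)

    model = ImplicitSequenceModel(loss=h['loss'],
                                  representation=net,
                                  batch_size=h['batch_size'],
                                  learning_rate=h['learning_rate'],
                                  l2=h['l2'],
                                  n_iter=h['n_iter'],
                                  use_cuda=CUDA,
                                  random_state=random_state)

    model.fit(train, verbose=True)

    # Per-sequence MRR scores on the held-out sets.
    test_mrr = sequence_mrr_score(model, test)
    val_mrr = sequence_mrr_score(model, validation)

    return net, model, test_mrr, val_mrr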

I used the one you proposed in the tutorial. After fit, I get the embeddings from the net object.

rragundez commented 6 years ago

@maciejkula it seems this discussion has drifted away from the original issue. Coming back to retrieving item similarities: this could be done by adding a method to BilinearNet, since you are already holding the item_representations there. Let's say this method is _compute_item_similarities(item_ids, ...). Then the actual model would have another method, something like def item_similarities(item_ids), which calls self._net._compute_item_similarities(item_ids, ...).

The same could be done with the user representations for example to identify clusters of users, etc.

Thanks to your good code this shouldn't be too difficult to implement. I could take care of this if you find it useful.