mszep opened this issue 6 years ago
Yes, this sounds fine! May I ask why you use the factorization model, where the sequence model would probably give much better results and be faster to train?
To be honest, it's mostly limited knowledge on my part -- I'm more familiar with factorization models from previous work and thought I'd start there. We have had challenges with training on our full set, but have managed so far. 2-3 hours of training seems to yield a model with sensible results.
Would you say that sequence models are a better way to go in general, and that the factorization model should only be used if the order of the implicit feedback samples is not given?
I strongly recommend using a GPU for sequence models if you aren't already. The PyTorch CPU implementation is quite slow.
I think that sequence models are definitely a better way to go. They are more flexible and allow you to easily incorporate new interactions as they come in, without retraining your model.
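For example (a rough sketch, assuming model is an already-fitted ImplicitSequenceModel):

import numpy as np

# A user's latest interactions, including brand-new ones
# (hypothetical item ids).
new_sequence = np.array([10, 42, 7], dtype=np.int64)

# One score per item in the catalogue, with no retraining needed.
scores = model.predict(new_sequence)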
Hi @maciejkula, thanks for this great project and for LightFM too. I have a couple of questions connected to this discussion.
First, as I understand it one of the benefits of matrix factorisation is that it can also be applied to the implicit interaction matrix's transpose, i.e. to the problem of recommending users to items (rather than items to users). For example, we might want to find the top N users who would be interested in an item who haven't interacted with it before. Is that correct and are you aware of a way to use sequence models to solve this problem?
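For concreteness, this is the kind of thing I mean with a factorization model (a rough sketch only; top_users_for_item is an illustrative name, and I'm assuming the user and item embeddings are available as numpy arrays):

import numpy as np

def top_users_for_item(user_embeddings, item_embedding, n=10):
    # The same dot-product scoring used for item recommendation, with
    # the roles swapped: score every user against a single item.
    scores = user_embeddings @ item_embedding
    return np.argsort(-scores)[:n]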
Second, you mention that sequence models allow you to incorporate new interactions as they come in. Just to clarify, do you mean simply that the model can make predictions for new sequences without needing to be retrained?
Hi @mjwestcott
Hi @maciejkula,
I am using CNNNet, and my aim is to obtain the item embeddings so that I can measure the similarity between items.
Now I am a little bit confused. I have the following model:
CNNNet(
(item_embeddings): ScaledEmbedding(58129, 32, padding_idx=0)
(item_biases): ZeroEmbedding(58129, 1, padding_idx=0)
(cnn_0): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1))
(cnn_1): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(2, 1))
(cnn_2): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(4, 1))
(cnn_3): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(8, 1))
(cnn_4): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(16, 1))
(cnn_5): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(32, 1))
(cnn_6): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(64, 1))
(cnn_7): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(128, 1))
(cnn_8): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1))
(cnn_9): Conv2d(32, 32, kernel_size=(7, 1), stride=(1, 1), dilation=(2, 1))
)
Now I would like to make several iterations:

for _ in range(10):
    net.forward(train)
forward's docstring says "Compute predictions for target items given user representations", so I would like to call the user_representation method, which requires item_sequences. I loaded a dataset, <Sequence interactions dataset (209 sequences x 200 sequence length)>, and tried to pass this object into user_representation, but I got an error: AttributeError: 'SequenceInteractions' object has no attribute 'contiguous'. What am I doing wrong? Do you have a code example of what I am trying to implement? It seems to be a common need.
The forward function accepts tensors rather than the Interactions objects. Have a look at how the Spotlight models do it.
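Roughly like this (a sketch; I'm assuming the SequenceInteractions object exposes its padded sequence matrix as a sequences numpy array, as the Spotlight models consume it):

import torch

# Hand the network a LongTensor of shape
# (num_sequences, max_sequence_length) instead of the
# SequenceInteractions wrapper itself.
sequences = torch.from_numpy(train.sequences.astype('int64'))

# user_representation returns the per-timestep representations together
# with the representation after the final element of each sequence.
user_repr, final_repr = net.user_representation(sequences)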
I trained the model proposed in the examples: CNNNet + ImplicitSequenceModel. Then I extracted the embeddings from CNNNet and tried to measure similarity using the cosine metric. However, the results seem to be total nonsense. Has anybody tried to do the same?
Is the actual model accurate? If the model is broken, the embeddings are probably meaningless.
@maciejkula
from spotlight.evaluation import sequence_mrr_score
from spotlight.sequence.implicit import ImplicitSequenceModel
from spotlight.sequence.representations import CNNNet

def evaluate_cnn_model(hyperparameters, train, test, validation, random_state):
    h = hyperparameters

    net = CNNNet(train.num_items,
                 embedding_dim=h['embedding_dim'],
                 kernel_width=h['kernel_width'],
                 dilation=h['dilation'],
                 num_layers=h['num_layers'],
                 nonlinearity=h['nonlinearity'],
                 residual_connections=h['residual'])

    print(net.item_embeddings.weight)

    model = ImplicitSequenceModel(loss=h['loss'],
                                  representation=net,
                                  batch_size=h['batch_size'],
                                  learning_rate=h['learning_rate'],
                                  l2=h['l2'],
                                  n_iter=h['n_iter'],
                                  use_cuda=CUDA,
                                  random_state=random_state)

    model.fit(train, verbose=True)

    test_mrr = sequence_mrr_score(model, test)
    val_mrr = sequence_mrr_score(model, validation)

    return net, model, test_mrr, val_mrr
I used the one you proposed in the tutorial. After fitting, I get the embeddings from the net object.
@maciejkula it seems this discussion went quite off-track from the original issue. Coming back to the question of retrieving item similarities: this could be done by adding a method to BilinearNet, since the item representations are already held there. Let's say this method is _compute_item_similarities(item_ids, ...). Then the actual model would expose another method, say def item_similarities(item_ids), which calls self._net._compute_item_similarities(item_ids, ...), something like that; see the sketch below.
The same could be done with the user representations, for example to identify clusters of users.
Thanks to your clean code this shouldn't be too difficult to implement. I could take care of this if you find it useful.
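A rough sketch of what I have in mind (the method and its placement are just my proposal, not existing Spotlight API; I'm assuming BilinearNet's item_embeddings table and that item_ids is a LongTensor):

import torch

def _compute_item_similarities(self, item_ids):
    # Cosine similarity between the query items and every item in the
    # catalogue, computed from the trained embedding table.
    embeddings = self.item_embeddings.weight          # (num_items, dim)
    normed = embeddings / embeddings.norm(dim=1, keepdim=True).clamp(min=1e-8)
    query = normed[item_ids]                          # (num_queries, dim)
    return query.mm(normed.t())                       # (num_queries, num_items)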
I'm trying to implement both recommendations and item-item similarities based on an ImplicitFactorizationModel trained on clickstream data. The former is trivial using the predict method, but the latter I'm not quite sure about. Looking at the code, I could obtain the item embeddings in a similar way to how it's done in _components.py and representations.py, and then apply a similarity measure to the resulting torch tensors. Is this a sensible approach? Is there a better way to do this? Is it even possible?
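For what it's worth, this is roughly what I have in mind (a sketch only; it reaches into the model's _net attribute, which I understand holds the BilinearNet and its item_embeddings, and similar_items is just an illustrative name):

import numpy as np

def similar_items(model, item_id, k=10):
    # Pull the trained item embeddings out of the underlying network.
    embeddings = model._net.item_embeddings.weight.detach().cpu().numpy()

    # Cosine similarity of one item against the whole catalogue.
    norms = np.linalg.norm(embeddings, axis=1)
    norms[norms == 0] = 1.0  # guard against the all-zero padding row
    normed = embeddings / norms[:, np.newaxis]
    scores = normed @ normed[item_id]

    # Best matches first, skipping the query item itself.
    ranked = np.argsort(-scores)
    return [i for i in ranked if i != item_id][:k]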