NVIDIA-Merlin / Transformers4Rec

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation and works with PyTorch.
https://nvidia-merlin.github.io/Transformers4Rec/main
Apache License 2.0

[QST] How to extract hidden state and viz attention layers #733

Closed vivpra89 closed 11 months ago

vivpra89 commented 1 year ago

❓ Questions & Help

Details

1. I want to extract the last dense-layer embeddings for every user (one row per user) and pass them as pre-trained embeddings to downstream tasks. Should I pass a user_id as input and extract those embeddings, or should I get the hidden state? Which of the two represents the user better?

2. I also want to plot the attention layers to visualize how attention changes with the position of an item in the sequence, as in the BERT4Rec paper: https://arxiv.org/pdf/1904.06690.pdf

(attention heatmap figure from the BERT4Rec paper)
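As noted in the reply below, TF4Rec does not currently expose attention weights, but for reference, a heatmap like the one in the BERT4Rec paper could be drawn with matplotlib. This is a hypothetical sketch using random stand-in weights; the `attn` matrix would need to come from a model that actually returns attention scores:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Stand-in attention matrix: (sequence_length, sequence_length).
# Row-normalize so each query position's weights sum to 1, like softmax.
seq_len = 10
attn = np.random.rand(seq_len, seq_len)
attn = attn / attn.sum(axis=-1, keepdims=True)

fig, ax = plt.subplots()
im = ax.imshow(attn, cmap="viridis")
ax.set_xlabel("Key position")
ax.set_ylabel("Query position")
ax.set_title("Attention weights (stand-in data)")
fig.colorbar(im)
fig.savefig("attention_heatmap.png")
```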

I tried extracting the hidden state as shown in the screenshot attached to the original issue (not reproduced here).

@gabrielspmoreira @rnyak any thoughts appreciated!

NamartaVij commented 1 year ago

@vivpra89 could you please share your email ID so that we can discuss this model in detail? Then we can post our doubts here to get quick replies.

rnyak commented 1 year ago

@NamartaVij We are not able to return `output_attention_weights` from the TF4Rec model; this is not implemented. What we can do is extract the hidden-state embeddings out of the Trainer module via `model.heads[0].body(batch[0])`, which returns a 3D tensor of shape (batch_size, sequence_length, d_model).
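To get one fixed-size embedding per user from that 3D output, a common approach is to pool over the sequence dimension. A minimal sketch, where `hidden` stands in for the tensor returned by `model.heads[0].body(batch[0])` (here filled with random values for illustration):

```python
import torch

# Stand-in for the output of model.heads[0].body(batch[0]):
# shape (batch_size, sequence_length, d_model).
batch_size, seq_len, d_model = 4, 20, 64
hidden = torch.randn(batch_size, seq_len, d_model)

# Mean-pool over the sequence dimension to get one embedding per
# user/session -> shape (batch_size, d_model). These vectors can then
# be exported as pre-trained features for downstream tasks.
user_emb = hidden.mean(dim=1)
```

Taking the hidden state at the last non-padded position is another common choice; which pooling works better depends on the downstream task.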

Hope that helps.

rnyak commented 11 months ago

@NamartaVij I am closing this ticket since there is no recent activity.