zuoxingdong commented 2 years ago

❓ Questions & Help

Existing examples for session-based/sequential recommendation use only item-level, sequence-based features. However, in many real-world scenarios we also have access to user- or session-level features (e.g. user demographics or session context).

For item-level sequence features, one uses tr.TabularSequenceFeatures.

I am curious about how to combine tr.TabularSequenceFeatures with user-/session-level features (tr.TabularFeatures) and feed the result to a PredictionTask via an aggregator.

Details

To be more specific, take this notebook as an example:

https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/main/examples/tutorial/03-Session-based-recsys.ipynb

In cell 5, we define an input module for the item-level sequence:

inputs = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=sequence_length,
    masking='causal',
)

and then define a GRU model to process the sequences:

body = tr.SequentialBlock(
    inputs,
    tr.MLPBlock([d_model]),
    tr.Block(
        torch.nn.GRU(input_size=d_model, hidden_size=d_model, num_layers=1),
        [None, 20, d_model],
    ),
)

However, suppose we also have various user-/session-level features, say processed by NVTabular (NVT) with column names of the form *-first, e.g. age-first, gender-first, ...

Q: How should I integrate these features into the model above?

What I expected to have:

seq_inputs = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=sequence_length,
    masking='causal',
)
seq_body = tr.SequentialBlock(
    seq_inputs,
    tr.MLPBlock([d_model]),
    tr.Block(
        torch.nn.GRU(input_size=d_model, hidden_size=d_model, num_layers=1),
        [None, 20, d_model],
    ),
)

context_inputs = tr.TabularFeatures.from_schema(
    schema_context
)
context_body = tr.SequentialBlock(
    context_inputs,
    tr.MLPBlock([d_model]),
)
body = tr.BlockAggregator([seq_body, context_body])

sararb commented 1 year ago

Thank you @zuoxingdong for your question!

I can think of two different methods to support context features:

Case 1: Repeat the context feature for each position in the input sequence, then feed the aggregated 3-D vector to the sequential module (GRU in your example).
Case 2 [your example]: Define two separate towers for sequence and context features (seq_body and context_body), then aggregate their outputs and feed the resulting tensor to the prediction layer. In that case, the seq_body should return a 2-D vector summarizing the information of the whole sequence.

Case 1 is already supported by the TabularSequenceFeatures module. The module automatically detects the context features and expands them to 3-D tensors by repeating the same value across all sequence positions. Case 2 will require additional custom code to convert the sequence vectors to a 2-D representation and to aggregate the two towers' outputs.
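
The broadcasting described for Case 1 can be illustrated in plain PyTorch. This is only a sketch of the idea, not the TabularSequenceFeatures implementation: the helper `broadcast_context`, the tensor names, and all dimensions are hypothetical, and random tensors stand in for the embedded features.

```python
import torch


def broadcast_context(seq_emb: torch.Tensor, ctx_emb: torch.Tensor) -> torch.Tensor:
    """Repeat a per-session context vector at every position, then concatenate.

    seq_emb: (batch, seq_len, d_seq) item-level sequence embeddings
    ctx_emb: (batch, d_ctx) session-/user-level embeddings
    returns: (batch, seq_len, d_seq + d_ctx)
    """
    seq_len = seq_emb.size(1)
    # (batch, d_ctx) -> (batch, 1, d_ctx) -> (batch, seq_len, d_ctx)
    ctx_3d = ctx_emb.unsqueeze(1).expand(-1, seq_len, -1)
    return torch.cat([seq_emb, ctx_3d], dim=-1)


# Hypothetical sizes, chosen only for illustration.
batch, seq_len, d_seq, d_ctx, d_model = 4, 20, 32, 8, 64

seq_emb = torch.randn(batch, seq_len, d_seq)  # stand-in for embedded item features
ctx_emb = torch.randn(batch, d_ctx)           # stand-in for embedded age-first, gender-first, ...

combined = broadcast_context(seq_emb, ctx_emb)

# Project to d_model and run the sequential module, as in the GRU example.
project = torch.nn.Linear(d_seq + d_ctx, d_model)
gru = torch.nn.GRU(input_size=d_model, hidden_size=d_model, num_layers=1, batch_first=True)
output, _ = gru(project(combined))  # output: (batch, seq_len, d_model)
```

Because the context is repeated at every position, the sequential module sees it at each step and no separate tower is needed.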

ralgond commented 1 year ago

@sararb Could you please show some code for this question? Thank you.

karunaahuja commented 8 months ago

"Case 2 will require additional custom code to convert the sequence vectors to a 2-D representation and to aggregate the two towers' outputs." @sararb Do you have an example of how to implement this?
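
One possible shape for Case 2, as a minimal sketch in plain PyTorch rather than the Transformers4Rec API. The class `TwoTowerBody`, the linear item-scoring head, and all sizes are hypothetical. The sequence is summarized by the GRU's final hidden state, which assumes the sequences are not right-padded; with right padding, the last valid position would have to be selected instead.

```python
import torch
from torch import nn


class TwoTowerBody(nn.Module):
    """Hypothetical two-tower body: a GRU tower for the sequence, an MLP tower for context."""

    def __init__(self, d_seq: int, d_ctx: int, d_model: int):
        super().__init__()
        self.seq_proj = nn.Linear(d_seq, d_model)
        self.gru = nn.GRU(d_model, d_model, num_layers=1, batch_first=True)
        self.ctx_mlp = nn.Sequential(nn.Linear(d_ctx, d_model), nn.ReLU())

    def forward(self, seq_emb: torch.Tensor, ctx_emb: torch.Tensor) -> torch.Tensor:
        # seq_emb: (batch, seq_len, d_seq); ctx_emb: (batch, d_ctx)
        _, h_n = self.gru(self.seq_proj(seq_emb))  # h_n: (num_layers, batch, d_model)
        seq_summary = h_n[-1]                      # 2-D summary: (batch, d_model)
        ctx_repr = self.ctx_mlp(ctx_emb)           # (batch, d_model)
        # Aggregate the two towers by concatenation: (batch, 2 * d_model)
        return torch.cat([seq_summary, ctx_repr], dim=-1)


# Hypothetical sizes, chosen only for illustration.
batch, seq_len, d_seq, d_ctx, d_model, num_items = 4, 20, 32, 8, 64, 1000

body = TwoTowerBody(d_seq, d_ctx, d_model)
head = nn.Linear(2 * d_model, num_items)  # stand-in for a next-item prediction layer

seq_emb = torch.randn(batch, seq_len, d_seq)
ctx_emb = torch.randn(batch, d_ctx)

session_repr = body(seq_emb, ctx_emb)  # (batch, 2 * d_model)
logits = head(session_repr)            # (batch, num_items)
```

Concatenation is only one way to aggregate the towers; summing or gating the two vectors would fit the same structure.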

NVIDIA-Merlin / Transformers4Rec

[QST] How to use session-level (single) and item-level (sequence) features together in next item prediction task? #556
