Closed arseny239 closed 4 months ago
I did some additional research and I noticed that this error appears only if I have some float-type features in the 'selected_features' list.
If I put only 'token' and/or 'token_seq' features there, the model starts to train ok.
If I put only 'float' feature(s) in the 'selected_features' list, I get another error:
RuntimeError: torch.cat(): expected a non-empty list of Tensors
The errors you're encountering with the SASRecF model in RecBole seem to stem from the way item features are being processed and integrated into the model. SASRecF is designed to concatenate item representations with item attribute representations as inputs to the model. This process involves several hyperparameters and configurations that must align with the structure and content of your dataset.
The error RuntimeError: mat1 and mat2 shapes cannot be multiplied suggests a mismatch in dimensions between the data provided to the model and the model's expected input sizes. When you change the selected_features list, the dimensions of the inputs change, hence the variation in error messages.
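To make the mismatch concrete, here is a small arithmetic sketch based only on the shapes quoted in the two error messages. It does not call RecBole. It assumes hidden_size is 64 (the documented default, which matches the trailing "x64" in both messages), and the helper names are invented for illustration.

```python
# Illustrative arithmetic only; this does not call RecBole.
# Assumption: hidden_size is 64, matching the "x64" in both error messages.
HIDDEN_SIZE = 64


def count_blocks(width, hidden_size=HIDDEN_SIZE):
    """Number of hidden_size-wide embeddings that fit exactly into `width`."""
    if width % hidden_size != 0:
        raise ValueError(f"{width} is not a multiple of {hidden_size}")
    return width // hidden_size


def explain(input_width, layer_width):
    """Compare the width of the concatenated input with the width the layer expects."""
    produced = count_blocks(input_width)
    expected = count_blocks(layer_width)
    return {"produced": produced, "expected": expected, "missing": expected - produced}


# Shapes copied from the two error messages in the report.
first = explain(640, 768)   # (10x640 and 768x64)
second = explain(576, 704)  # (10x576 and 704x64)
print(first)   # {'produced': 10, 'expected': 12, 'missing': 2}
print(second)  # {'produced': 9, 'expected': 11, 'missing': 2}
```

Under these assumptions, both runs fall short by exactly two embedding-sized blocks. That would be consistent with two of the selected features not contributing an embedding. Whether those two are the float-type features has to be checked against the dataset itself.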
Key Points from the Documentation:
- SASRecF is configured through hyperparameters such as hidden_size, n_layers, n_heads, inner_size, and selected_features. The hidden_size hyperparameter, for example, defines the number of features in the hidden state and also serves as the initial embedding size of items; its default is 64.
- The selected_features parameter controls which item context information is used. It must contain names that match fields in your dataset, and those fields must be present in the dataset and properly loaded by RecBole's data module.
- The forward function in the model's source code outlines how item embeddings and feature embeddings (both sparse and dense) are concatenated and processed. This includes applying a linear transformation (concat_layer) to match the expected dimensionality for further processing within the model.
Addressing Your Issue:
- The hidden size (hidden_size) and the number of selected features (selected_features) directly influence the input dimensionality to the model. Adjusting these parameters affects the model's ability to process your data correctly.
- The error raised when only float features are selected (RuntimeError: torch.cat(): expected a non-empty list of Tensors) indicates that there might be an issue with how these features are being processed or concatenated. It's crucial to ensure that float features are correctly represented and included in the feature embeddings.
- Given that the error changes with the selected_features list, and considering that it doesn't occur with only token-type features, it's possible that the handling or representation of float-type features in the feature embeddings is the root cause of the issue.
Suggested Steps:
- Check that every selected feature is declared with the correct type in your dataset (token, float).
- Adjust the hidden_size and inner_size parameters based on the dimensions of your input data and the number of features you're including.
For more detailed insights into configuring and running SASRecF, including hyperparameter settings and model usage, please refer to the official RecBole documentation: https://www.recbole.io/docs/user_guide/model/sequential/sasrecf.html
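As a rough aid for the type check suggested above, the sketch below groups the selected feature names by the type declared in the header line of a .item atomic file. It assumes the usual RecBole header layout of tab-separated name:type columns. The helper split_by_type and every field name except 'name' are made up for illustration and are not part of RecBole.

```python
from typing import Dict, List


# Hypothetical helper, not part of RecBole.
def split_by_type(header_line: str, selected: List[str]) -> Dict[str, List[str]]:
    """Group the selected feature names by the type declared in the header."""
    declared = {}
    for column in header_line.rstrip("\n").split("\t"):
        name, _, ftype = column.partition(":")
        declared[name] = ftype
    groups: Dict[str, List[str]] = {}
    for feature in selected:
        # Features absent from the header are reported under "missing".
        ftype = declared.get(feature, "missing")
        groups.setdefault(ftype, []).append(feature)
    return groups


# Made-up header line; only 'name' appears in the original report.
header = "item_id:token\tname:token\tcategory:token_seq\tprice:float\n"
groups = split_by_type(header, ["name", "category", "price", "brand"])
print(groups)

# Candidate list for the workaround reported in this thread:
# keep only token and token_seq features.
token_only = groups.get("token", []) + groups.get("token_seq", [])
print(token_only)
```

Running this against the real header makes it easy to see which selected features are float-typed or missing from the file. The token_only list can then be tried as selected_features to confirm whether the float fields are what triggers the error.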
Hi,
First of all let me thank you for such a great library. I really like it! But I still have some problem(s):
Describe the bug
I tried to train and use the SASRecF model. Before that, I tried SASRec on the same data and it worked well, but I also want to take the items' features into account.
But when I try to train it, I get an error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x640 and 768x64)
If I remove one feature from the 'selected_features' list, the error changes slightly: RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x576 and 704x64)
But if I leave only 1 feature in the 'selected_features' list, it starts to train ok:
'selected_features': ['name'],
My item's features are:
and their types are:
I use run_recbole to train the model:
run_recbole(model='SASRecF', dataset=DATASET_NAME, config_dict=parameter_dict)
At the same time, other models, such as SASRec, work well. Am I doing something wrong, or is it a bug in the model? What could cause this behavior?
I work with my own dataset. I created the .item and .inter atomic files (there is no .user file because I have no information about users, only their IDs).
I use RecBole version 1.2.0 on Linux (Debian), without a GPU (I train on a CPU with 16 cores).
The full text of the error:
and also my parameter_dict:
Thank you for the answer(s). Sincerely yours, Arseny