Questions and doubts about the dataset

Hi!

First and foremost, thanks for your contribution.

I'm using this dataset in my research; however, I'm having trouble using it after reading the SIGIR paper "RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System". I'm hoping you can answer the following questions:

1. Could you please explain the meaning of the a_ and b_ prefixes in the data files? E.g., rl4rs_dataset_a_rl vs. rl4rs_dataset_b_rl.
2. Could you please explain the meaning of the _rl and _sl suffixes in the data files? E.g., rl4rs_dataset_a_rl vs. rl4rs_dataset_a_sl.
3. Do users have unique numerical identifiers? I tried a .unique() operation on the user_protrait column, but I got far more unique strings than the number of users reported in Table 2.
4. Inside the item_feature column, how can I identify the item's numerical identifier? The paper says that the ID is inside this column but does not specify its position in the array.
5. If I want to perform an offline evaluation using a traditional user-rating matrix, can I join the datasets into a single matrix? Or should I keep four separate matrices (one for each data file)?
6. Could you please provide or point to the code that computes the statistics of the dataset?
7. I'm currently trying to replicate Table 2, but I do not know how to map Slate-SL, Slate-RL, SeqSlate-SL, and SeqSlate-RL to the data files.
8. Similar to 7, how can I create the Slate and SeqSlate datasets shown in the same table?

Thanks in advance!

Thanks for your kind attention!

1. Please refer to Figure 3 in the referenced paper. The prefixes a and b indicate two different datasets, RL4RS-Slate and RL4RS-SeqSlate.
2. Please refer to Table 2 in the referenced paper. The suffix _rl means the separated data before RL deployment. The suffix _sl means the separated data after RL deployment.
3. No, users' unique numeric identifiers are not provided. Please use session_id instead.
4. Please refer to the exposed_items column.
5. I don't think so, as it is not designed for the user-rating matrix.
6. Please see the reproductions section of the README file.

7&8. Detailed instructions on how to create Slate and SeqSlate datasets can be found in the project's tutorial, accessible here: https://github.com/fuxiAIlab/RL4RS/blob/main/tutorial.ipynb
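
To illustrate answers 3 and 4, here is a minimal sketch of counting distinct sessions and distinct exposed item IDs using only the Python standard library. The "@" field delimiter, the reduced set of columns, the comma-separated layout of exposed_items, and the sample values are all assumptions made for illustration and are not confirmed in this thread; please check them against the actual data files before relying on the result.

```python
import csv
import io

# Tiny in-memory stand-in for one data file. The "@" delimiter, the column
# subset, and the comma-separated exposed_items format are assumptions made
# for illustration only; the values are made up.
SAMPLE = """session_id@exposed_items@user_protrait
s1@1,5,9@0.1,0.2
s1@2,6,9@0.1,0.2
s2@1,7,8@0.3,0.4
"""


def summarize(text, delimiter="@"):
    """Return (number of distinct sessions, number of distinct item IDs)."""
    sessions = set()
    items = set()
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # session_id is used in place of a user ID, as suggested in answer 3.
        sessions.add(row["session_id"])
        # Item IDs are read from exposed_items, as suggested in answer 4.
        items.update(int(i) for i in row["exposed_items"].split(","))
    return len(sessions), len(items)


n_sessions, n_items = summarize(SAMPLE)
print(n_sessions, n_items)
```

For a real file, the same function could be given the file's contents instead of SAMPLE; counting user_protrait strings instead would count distinct feature vectors rather than sessions, which may explain the mismatch with Table 2.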
