simonebbruun / RS_multi_modal_user_interactions

RS_multi_modal_user_interactions

This repository contains the data and source code for "Dataset and Models for Item Recommendation Using Multi-Modal User Interactions".

Requirements

Dataset

We publish a real-world dataset from the insurance domain with multi-modal user interactions that can be used in recommendation models. The dataset is anonymized.
Download the files data_users.csv, data_conversations_keyword.csv, data_sessions.csv, data_purchase_events.csv, and data_post_filter.csv, and the folder data_conversations_embedding.

Dataset Format

There are 6 different datasets.

data_users.csv

This data contains the users. Each user has had one or more purchase events with conversations and/or web sessions prior to that purchase. The data contains 5 columns:

data_conversations_keyword.csv

This data contains the conversations that the user had prior to the user's purchase event. Each conversation consists of multiple sentences represented with keywords. The data contains 4 columns:

data_conversations_embedding_(1-107).csv

This data is split into multiple files due to file size limitations. The data contains the conversations that the user had prior to the user's purchase event. Each conversation consists of multiple sentences represented with text embeddings. The data contains 771 columns:
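
Because the embedding data is split across many files, the parts need to be concatenated before use. The following is a minimal sketch of one way to do that with the standard library. It assumes that every part carries the same header row and that the part number is the last number in the file name; neither is confirmed by this README. The function name `load_split_csv` and the glob pattern are illustrative and not part of this repository.

```python
import csv
import re
from pathlib import Path


def load_split_csv(folder, pattern="*.csv"):
    """Read every CSV part in `folder` and return (header, rows).

    Assumption: all parts share one header row. Parts are read in the
    numeric order of the last number found in each file name.
    """
    def part_number(path):
        numbers = re.findall(r"\d+", path.stem)
        return int(numbers[-1]) if numbers else 0

    header, rows = None, []
    for path in sorted(Path(folder).glob(pattern), key=part_number):
        with open(path, newline="") as handle:
            reader = csv.reader(handle)
            part_header = next(reader, None)
            if part_header is None:
                continue  # skip empty files
            if header is None:
                header = part_header
            elif part_header != header:
                raise ValueError(f"unexpected header in {path.name}")
            rows.extend(reader)
    return header, rows
```

Calling `load_split_csv("data_conversations_embedding")` would then return the shared header and all rows as lists of strings. With 771 columns per row, a dataframe or array library is probably a better fit for real use; the sketch only shows the concatenation logic.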

data_sessions.csv

This data contains the web sessions that the user made prior to the user's purchase event. Each web session consists of multiple actions. The data contains 3 columns:

data_purchase_events.csv

This data contains the purchase events. Each event consists of one or more item purchases made by the same user. The data contains 2 columns:

data_post_filter.csv

This data contains the items that were possible for the user to buy at the time of the user's purchase event. The data contains 2 columns:
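
One plausible use of this file is to drop items the user could not buy from a model's ranked output before taking the top k recommendations. The sketch below illustrates that idea only. The data structures (a dict of item scores and a set of allowed items) and the function name `post_filter_top_k` are assumptions made for illustration, not the format or code used by the evaluation scripts.

```python
def post_filter_top_k(scores, allowed_items, k=3):
    """Return the k highest-scoring items that appear in allowed_items.

    scores: dict mapping item id -> model score (hypothetical format).
    allowed_items: set of item ids the user could buy at purchase time.
    """
    candidates = [
        (item, score) for item, score in scores.items() if item in allowed_items
    ]
    # Highest score first; items outside allowed_items were already dropped.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:k]]
```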

Usage

  1. Train and validate the models using
    model_popular.py
    model_conversation.py
    model_session.py
    model_knowledge_distillation.py
    model_generative_imputation_step_1.py
    model_generative_imputation_step_2.py
    model_generative_imputation_step_3.py
    model_neutral_imputation.py
    model_keyword.py
    model_latent_feature.py
    model_relative_representation_step_1.py
    model_relative_representation_step_2.py
    model_relative_representation_step_3.py
  2. Evaluate the models over the test set using
    evaluation_popular.py
    evaluation_conversation.py
    evaluation_session.py
    evaluation_late_fusion.py
    evaluation_knowledge_distillation.py
    evaluation_generative_imputation.py
    evaluation_neutral_imputation.py
    evaluation_keyword.py
    evaluation_latent_feature.py
    evaluation_relative_representation.py
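
The two steps above can be sketched as a small driver that runs the scripts in the listed order. This assumes each script runs without command-line arguments from the repository root and that the numbered steps must run in sequence; the README does not state either. The helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Script names are taken from the usage list; the order follows that list.
TRAIN_SCRIPTS = [
    "model_popular.py",
    "model_conversation.py",
    "model_session.py",
    "model_knowledge_distillation.py",
    "model_generative_imputation_step_1.py",
    "model_generative_imputation_step_2.py",
    "model_generative_imputation_step_3.py",
    "model_neutral_imputation.py",
    "model_keyword.py",
    "model_latent_feature.py",
    "model_relative_representation_step_1.py",
    "model_relative_representation_step_2.py",
    "model_relative_representation_step_3.py",
]

EVAL_SCRIPTS = [
    "evaluation_popular.py",
    "evaluation_conversation.py",
    "evaluation_session.py",
    "evaluation_late_fusion.py",
    "evaluation_knowledge_distillation.py",
    "evaluation_generative_imputation.py",
    "evaluation_neutral_imputation.py",
    "evaluation_keyword.py",
    "evaluation_latent_feature.py",
    "evaluation_relative_representation.py",
]


def build_commands(scripts, python=sys.executable):
    """Return one argument list per script (assumes no extra arguments)."""
    return [[python, script] for script in scripts]


def run_all(scripts):
    """Run the scripts one after another, stopping at the first failure."""
    for command in build_commands(scripts):
        subprocess.run(command, check=True)
```

Calling `run_all(TRAIN_SCRIPTS)` followed by `run_all(EVAL_SCRIPTS)` from the repository root would train every model and then evaluate it on the test set. To work with a single model, pass only its scripts.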