PyTorch implementation for the WWW 2023 paper "Multi-Modal Self-Supervised Learning for Recommendation".
MMSSL is a new multimedia recommender system that integrates generative modality-aware collaborative self-augmentation with contrastive cross-modality dependency encoding. It outperforms existing SOTA multi-modal recommenders.
Start training and inference as:

```
cd MMSSL
python ./main.py --dataset {DATASET}
```
Supported datasets: `Amazon-Baby`, `Amazon-Sports`, `Tiktok`, `Allrecipes`.
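For example: `python ./main.py --dataset tiktok` (the exact dataset key is an assumption; it should match the corresponding folder under `data/`, shown in the tree below).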
```
MMSSL/
├── data/
│   ├── tiktok/
│   ...
```
| Dataset | Modality | Embed Dim | User | Item | Interactions | Sparsity |
|---|---|---|---|---|---|---|
| Amazon-Baby | V, T | 4096, 1024 | 35598 | 18357 | 256308 | 99.961% |
| Amazon-Sports | V, T | 4096, 1024 | 19445 | 7050 | 139110 | 99.899% |
| Tiktok | V, A, T | 128, 128, 768 | 9319 | 6710 | 59541 | 99.904% |
| Allrecipes | V, T | 2048, 20 | 19805 | 10067 | 58922 | 99.970% |
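The sparsity column follows directly from the other rows: sparsity = 1 − interactions / (users × items). A quick sanity check in Python, using the values from the table:

```python
# Verify the sparsity column from the user/item/interaction counts.
datasets = {
    "Amazon-Baby":   (35598, 18357, 256308),
    "Amazon-Sports": (19445, 7050, 139110),
    "Tiktok":        (9319, 6710, 59541),
    "Allrecipes":    (19805, 10067, 58922),
}

for name, (users, items, interactions) in datasets.items():
    sparsity = 1.0 - interactions / (users * items)
    # Matches the table up to rounding, e.g. Amazon-Baby -> 99.961%
    print(f"{name}: {sparsity:.3%}")
```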
2024.3.20 baselines LATTICE and MICRO uploaded: 📢📢📢📢🌹🔥🔥🚀🚀 Because the baselines LATTICE and MICRO require some minor modifications, we provide versions that can be run easily by simply modifying the dataset path.
2023.11.1 new multi-modal datasets uploaded: 📢📢🔥🔥🌹🌹🌹🌹 We provide the new multi-modal datasets Netflix and MovieLens (i.e., CF training data and multi-modal data including item text and posters) from our new multi-modal work LLMRec on Google Drive. 🌹 We hope this contributes to our community and facilitates your research~
2023.3.23 update (all datasets uploaded): We provide the processed data on Google Drive.
2023.3.24 update: The official website of the Tiktok dataset has been closed, so we also provide several other preprocessed versions of Tiktok. We spent a lot of time pre-processing this dataset, so if you use our preprocessed Tiktok in your work, please cite our paper.
🚀🚀 The provided dataset, which includes (1) basic user-item interactions and (2) multi-modal features, is compatible with multi-modal recommender models such as MMSSL, LATTICE, and MICRO, and requires no additional preprocessing.
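A rough sketch of how these two parts can be loaded (the file names `train.json`, `image_feat.npy`, and `text_feat.npy` are assumptions based on common layouts for these models; check the downloaded archive for the actual names):

```python
import json
import numpy as np

# Assumed layout; verify against the downloaded archive.
DATA_DIR = "./data/tiktok"

# (1) user-item interactions, e.g. {user_id: [item_id, ...]}
with open(f"{DATA_DIR}/train.json") as f:
    train = json.load(f)

# (2) multi-modal item features, rows aligned with item ids
image_feat = np.load(f"{DATA_DIR}/image_feat.npy")  # Tiktok visual dim: 128
text_feat = np.load(f"{DATA_DIR}/text_feat.npy")    # Tiktok text dim: 768

print(len(train), image_feat.shape, text_feat.shape)
```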
🌹🌹 Please cite our paper if you use the 'netflix' dataset~ ❤️
We collected a multi-modal dataset using the original Netflix Prize Data released on the Kaggle website. The data format is directly compatible with state-of-the-art multi-modal recommendation models like LLMRec, MMSSL, LATTICE, MICRO, and others, without requiring any additional data preprocessing.
Textual Modality:
We have released the item information curated from the original dataset in the "item_attribute.csv" file. Additionally, we have incorporated textual information enhanced by LLMs into the "augmented_item_attribute_agg.csv" file. (The following three images show (1) the Netflix description on the Kaggle website, (2) textual information from the original Netflix Prize Data, and (3) textual information augmented by LLMs.)
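A minimal way to inspect both files (only the file names come from the description above; the actual column headers may differ):

```python
import pandas as pd

# Raw item attributes curated from the original Netflix Prize Data.
items = pd.read_csv("item_attribute.csv")

# Textual attributes augmented by LLMs.
augmented = pd.read_csv("augmented_item_attribute_agg.csv")

print(items.head())
print(augmented.head())
```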
Visual Modality:
We have released the visual information obtained from web crawling in the "Netflix_Posters" folder. (The following image displays posters acquired by web crawling using item information from the Netflix Prize Data.)
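To use the posters as a visual modality in the models above, one option is to encode them with a pretrained vision backbone; a minimal sketch with torchvision (the ResNet-50 choice and the exact folder contents are assumptions):

```python
import os
import torch
from PIL import Image
from torchvision import models, transforms

# Standard ImageNet preprocessing for a pretrained ResNet-50.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # keep the 2048-d pooled features
encoder.eval()

features = {}
with torch.no_grad():
    for fname in sorted(os.listdir("Netflix_Posters")):
        img = Image.open(os.path.join("Netflix_Posters", fname)).convert("RGB")
        features[fname] = encoder(preprocess(img).unsqueeze(0)).squeeze(0)
```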