HKUDS / MMSSL

[WWW'2023] "MMSSL: Multi-Modal Self-Supervised Learning for Recommendation"
https://arxiv.org/abs/2302.10632

Raw dataset processing details #12

Closed: infusion-zero-edit closed this issue 1 year ago

infusion-zero-edit commented 1 year ago

Can you give details on how you preprocess the raw data into the V/T/A features stored in the *.npy files? Only the textual features are mentioned in your paper.

weiwei1206 commented 1 year ago

The details are as described in the paper; part of the data comes from the original competition. We have separately processed two multimodal recommendation datasets, which will be released with the visual posters/pictures, the original textual information, and the preprocessed interaction and feature data in the current pipeline format, ready for direct use. The textual information is inherent to the datasets themselves, while the visual information was crawled from web pages. The feature data comes in two versions: one produced by regular feature extractors and a ChatGPT-based version. The new data containing the original modal information will be released with our future work. Please stay tuned for updates.
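
For reference, here is a minimal sketch of how crawled item images are commonly turned into an item-aligned `.npy` feature matrix with a pretrained backbone. The backbone choice (ResNet-50), the directory layout, and the file names below are assumptions for illustration, not the exact MMSSL extraction pipeline:

```python
# Sketch: extract per-item visual features and save as image_feat.npy.
# Assumptions (hypothetical): one image per item in raw_images/, named <item_id>.jpg,
# with contiguous item ids starting at 0; ResNet-50 as the feature extractor.
import os
import numpy as np
import torch
from torchvision import models, transforms
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained backbone with the classification head removed -> 2048-d features.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image_dir = "raw_images"  # hypothetical path to the crawled posters/pictures
num_items = len(os.listdir(image_dir))
feats = np.zeros((num_items, 2048), dtype=np.float32)

with torch.no_grad():
    for item_id in range(num_items):
        img = Image.open(os.path.join(image_dir, f"{item_id}.jpg")).convert("RGB")
        x = preprocess(img).unsqueeze(0).to(device)
        feats[item_id] = backbone(x).squeeze(0).cpu().numpy()

np.save("image_feat.npy", feats)  # row i holds the visual feature of item i
```

Textual and acoustic features can be produced the same way (encode each item's raw modality with a pretrained encoder, then stack the vectors into one item-indexed matrix per modality), so that row indices in all the `.npy` files line up with the item ids in the interaction data.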