spotify-research / cosernn

Code for the paper "Contextual and Sequential User Embeddings for Large-Scale Music Recommendation".
Apache License 2.0

User Min Count Question #3

Closed mustfkeskin closed 2 years ago

mustfkeskin commented 2 years ago

Hello

The paper mentions user activity counts: "On average, users in the dataset have 220 sessions during the two-month period, and each session consists of 10 tracks on average."

Is there a reason why sessions with at least 10 tracks, or active users with 220 sessions, were selected? How were these values chosen?

lucasmaystre commented 2 years ago

Hi @mustfkeskin, when building our dataset we sampled a set of users that satisfied some activity constraint. I don't recall precisely what the constraint was, but the idea was to avoid having to deal with cold-start situations (these are important but outside the scope of the paper).

The numbers we report in the paper (220 sessions on average, 10 tracks per session on average) are just empirical averages we observed in the dataset. We didn't explicitly choose them, but they might be linked to the activity criterion we used to sample users.
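To illustrate the distinction between an activity criterion and the averages that result from it, here is a minimal sketch. The actual criterion is not stated in this thread, so the threshold `MIN_SESSIONS`, the toy session log, and the helper names `filter_active_users` and `summarize` are all hypothetical and not taken from the paper or this repository.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy log: (user_id, session_id, number of tracks in the session).
sessions = [
    ("u1", "s1", 12), ("u1", "s2", 8), ("u1", "s3", 10),
    ("u2", "s4", 5),
    ("u3", "s5", 9), ("u3", "s6", 11),
]

MIN_SESSIONS = 2  # hypothetical activity threshold, NOT the one used in the paper


def filter_active_users(sessions, min_sessions):
    """Keep only the sessions of users who have at least `min_sessions` sessions."""
    counts = defaultdict(int)
    for user_id, _, _ in sessions:
        counts[user_id] += 1
    return [s for s in sessions if counts[s[0]] >= min_sessions]


def summarize(sessions):
    """Return (mean sessions per user, mean tracks per session)."""
    per_user = defaultdict(int)
    for user_id, _, _ in sessions:
        per_user[user_id] += 1
    return mean(per_user.values()), mean(n for _, _, n in sessions)


active = filter_active_users(sessions, MIN_SESSIONS)
avg_sessions, avg_tracks = summarize(active)
print(avg_sessions, avg_tracks)
```

Only the threshold is chosen here. The two averages are measured on whatever users remain after filtering, so a different threshold would generally produce different averages.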

mustfkeskin commented 2 years ago

Thanks for the reply. Is there an article about how to analyze this topic?

lucasmaystre commented 2 years ago

You're welcome! I'm not sure what paper to recommend here. You could try searching for papers on the cold-start problem in recommender systems; I think that's closely related.