dartmouth-cs98-23f / project-short-learning

project-short-learning created by GitHub Classroom

Candidate Generation #135

Open linkevin281 opened 7 months ago

Candidate Generation (14+ Hours)

General YouTube DNN for YouTube

Background

Modern recommendation engines have two parts: (1) candidate generation and (2) ranking. Candidate generation typically uses a very fast deep learning (DL) model to narrow the entire video corpus down to ~1000 videos. Ranking then scores the top n of those candidates and returns k videos as the engine's output.

Sometimes the first stage of candidate generation (e.g. TikTok ads infra) relies on basic demographics like age and location that are instantly indexable. For our use case, and for non-ads recommendations, we likely need something a little more specific, but not as powerful as a full-fledged DL model.

If our database has roughly 200 videos, candidate generation should narrow the recommendations down to ~10 options, and the ranker will take it from there. This may involve taking the user's top n topics and finding the 10 not-yet-viewed videos with the nearest complexities.
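
A minimal sketch of that filtering step, to make the idea concrete. Everything named here is an assumption for illustration, not the project's actual schema or API: the `Video` fields, a numeric `complexity` score, a single `user_complexity` value that "nearest" is measured against, and the `generate_candidates` helper itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    # Hypothetical fields; the real video schema may differ.
    video_id: str
    topic: str
    complexity: float  # assumed numeric score, e.g. 0.0 (easy) to 1.0 (hard)


def generate_candidates(videos, top_topics, user_complexity, viewed_ids, k=10):
    """Return up to k unviewed videos from the user's top topics,
    ordered by how close their complexity is to user_complexity."""
    topics = set(top_topics)
    # Keep only videos in the user's top topics that they have not seen yet.
    pool = [
        v for v in videos
        if v.topic in topics and v.video_id not in viewed_ids
    ]
    # Nearest complexity first; Python's sort is stable, so ties keep corpus order.
    pool.sort(key=lambda v: abs(v.complexity - user_complexity))
    return pool[:k]


corpus = [
    Video("a", "algebra", 0.2),
    Video("b", "algebra", 0.5),
    Video("c", "physics", 0.55),
    Video("d", "history", 0.5),
    Video("e", "physics", 0.9),
]

# "b" is already viewed and "d" is outside the user's top topics.
candidates = generate_candidates(corpus, ["algebra", "physics"], 0.5, {"b"}, k=2)
print([v.video_id for v in candidates])  # ['c', 'a']
```

With ~200 videos a linear scan plus sort like this should be fast enough that no index or model is needed at this stage; the ranker would then order the returned candidates.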

Integration

Requirements