Open manisnesan opened 2 years ago
Study Group - Recsys Series Date: 2022-06-12
Build a RecSys project that covers the bare minimum (retrieval, filtering, scoring, ordering) as an operational toy example, similar to the Toy Machine Learning Project by Shreya Shankar (https://github.com/shreyashankar/toy-ml-pipeline)
Public dataset for RecSys algorithms (Spotify Million Playlist Dataset https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge)
Competitions
For real-world MLOps projects (data included) built with real-world tools, check out https://github.com/jacopotagliabue/you-dont-need-a-bigger-boat or the simpler https://github.com/jacopotagliabue/post-modern-stack
For public datasets, the dataset required depends on the use case. Session-based recs? https://github.com/coveooss/SIGIR-ecom-data-challenge Item-to-item? User-to-item?
RecSys Intro from Google Developers
Why is a recsys needed? Easier content browsing: it helps users find content they would not have thought to ask for
Components of a recsys: Candidate generation - Scoring - Reranking
Candidate generation (CG) - narrows a huge corpus (billions of items) down to hundreds or thousands of candidates; must be fast
Scoring - a more precise model scores and ranks the candidates to get the top 10; can use additional queries
Reranking - reranks the top 10 based on filtering and boosting for diversity, freshness, fairness
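The three stages can be sketched as a toy pipeline. This is only a minimal illustration, not how any production system does it: the `Item` fields, the dot-product retrieval, the freshness bonus in the scoring formula, and the per-category cap used for diversity are all assumptions made up for the example.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # All fields are hypothetical features chosen for this toy example.
    item_id: str
    embedding: tuple
    category: str
    freshness: float  # 0.0 (old) to 1.0 (new)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec, corpus, k):
    """Fast, coarse retrieval: keep the k items with the largest dot product."""
    return heapq.nlargest(k, corpus, key=lambda it: dot(user_vec, it.embedding))


def score(user_vec, candidates, n):
    """More precise (here: toy) scoring on the small candidate set."""
    scored = [(dot(user_vec, it.embedding) + 0.5 * it.freshness, it)
              for it in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def rerank(scored, blocked_ids, max_per_category):
    """Filter blocked items and cap items per category for diversity."""
    result, per_category = [], {}
    for _, item in scored:
        if item.item_id in blocked_ids:
            continue
        if per_category.get(item.category, 0) >= max_per_category:
            continue
        per_category[item.category] = per_category.get(item.category, 0) + 1
        result.append(item)
    return result


corpus = [
    Item("a", (1.0, 0.0), "rock", 0.1),
    Item("b", (0.9, 0.1), "rock", 0.9),
    Item("c", (0.8, 0.2), "rock", 0.4),
    Item("d", (0.5, 0.5), "jazz", 0.2),
    Item("e", (0.0, 1.0), "pop", 0.8),
]
user = (1.0, 0.0)

candidates = generate_candidates(user, corpus, k=4)
top = score(user, candidates, n=4)
final = rerank(top, blocked_ids={"a"}, max_per_category=1)
print([item.item_id for item in final])  # ['b', 'd']
```

In a real system the corpus would not be scanned linearly; candidate generation typically relies on some form of approximate nearest-neighbor lookup, which is what makes it fast enough for billions of items.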
Refer to the Deep Neural Networks for YouTube Recommendations paper for the ranking model