A List of Recommender Systems and Resources

List of Recommender Systems

Recommender systems (or recommendation engines) are useful and interesting pieces of software. I wanted to compare recommender systems to each other but could not find a decent list, so here is the one I created. Please help me keep this post up to date by submitting corrections and additions via pull request, or tweet me @grahamjenson.

Software as a Service Recommender Systems

SaaS recommender systems face many development challenges, including having to handle multi-tenancy and to store and process massive amounts of data, as well as softer concerns like keeping a client's sensitive data safe on remote servers.

The benefits of using a SaaS recommender system are that you can pay for value with low overhead rather than making a large upfront investment, that they generally have a clear integration path for you to use, and that they provide continual development and improvement while you use them.

The SaaS recommender systems are:

  1. Peerius is a closed, product and e-commerce focused system for live and email recommendations. It is active and seems very interesting, although little information is available about the actual product and how it works.
  2. Strands is a closed, product and e-commerce focused system. I think it works by including tracking scripts (a la Google Analytics) and recommendation widgets on the website. What I really like about Strands is their publishing of case studies (e.g. Wireless Emporium) and white papers like The Big promise of recommender systems. Although these do not discuss the exact solutions provided, they give a good overview of their vision and goals for providing recommendations.
  3. SLI Systems Recommender is a closed recommender system focused on e-commerce, search and mobile.
  4. Using Hadoop on Google Cloud is an example use of Google Cloud, with benchmarks from a recommender system.
  5. ParallelDots is a tool to relate published content.
  6. Amazon Machine Learning is a machine learning platform to model data and create predictions.
  7. Azure ML is a machine learning platform to model data and create predictions.
  8. Gravity R&D is a company built by some of the winners from the 2009 Netflix prize. They offer a solution that provides targeted, customized recommendations to users of websites. They have some pretty big clients including DailyMotion and a technology page which describes their architecture, algorithms, and a list of publications. (suggested by Marton Vetes)
  9. Dressipi Style Adviser is a clothing-specific recommendation service. It incorporates both expert domain knowledge and machine learning to find outfits for occasions or moods.
  10. Sajari is a search, recommendation and matching (e.g. dating website) service. On their site, they also have aggregated a bunch of useful data-sets.
  11. IBM Watson is available through Watson Developer Cloud, which provides REST APIs (Watson APIs on Bluemix) and SDKs that use cognitive computing to solve complex problems.
  12. Recombee provides a REST API, SDKs for multiple languages and a graphical user interface for evaluating results. Its main features are real-time model updates, an easy-to-use query language for filtering and boosting according to complex business rules, and advanced options such as diverse or rotated recommendations. Recombee offers an instant account with 100k free recommendation requests per month.
  13. Segmentify is a recommendation engine, personalization and real-time analytics tool.
  14. Mr. DLib is a recommender-system-as-a-service for academic organisations such as digital libraries and reference managers. Mr. DLib provides 'related-article' recommendations, is open-source, and publishes most of its data.
  15. Rumo is a flexible SaaS recommendation system adaptable to all entertainment industries (films, music, podcasts, video games, sports, etc.) and based on both content metadata and user behaviors. Rumo's algorithms are transparent and explainable, providing full control over the recommendation process.
  16. Froomle is a modular recommendation platform that focuses on serving news and e-commerce companies. They offer a variety of modules, optimized for their use case (e.g. discovery or related), business goal (e.g. CTR or conversion) and integration type (web, mail or push notifications). Their modules use state-of-the-art machine learning techniques under the hood.
  17. Recommendations AI delivers highly personalized product recommendations at scale. It is part of Google Cloud’s Discovery Solutions for Retail, which provides personalized search and recommendations.

Open Source Recommender Systems

Most of the non-SaaS recommender systems are open source. This may be because recommender systems tend to be tailored to individual clients and so are not easily made into a product.

The open-source recommender systems are:

  1. The Universal Recommender is built on the modern Correlated Cross-Occurrence algorithm, which uses many indicators of user taste and so can target most use cases. The source is on GitHub, built into the Harness ML server or available as a template for the older PredictionIO server (highest-rated template). Active and commercially supported.
  2. PredictionIO is a machine learning server, built on Apache Spark, Apache HBase and Spray, that can be used to create a recommender system. The source is on GitHub. The main repository has been abandoned.
  3. Raccoon Recommendation Engine is an open source Node.js based collaborative filter that uses Redis as a store. It is effectively abandoned.
  4. HapiGER is an open source Node.js collaborative filtering engine, which can use in-memory, PostgreSQL or RethinkDB storage. Reasonably active development (when I have time :)
  5. EasyRec provides Java and REST based recommendations. Abandoned.
  6. Mahout provides Hadoop and linear algebra based data mining.
  7. Seldon is a Java based prediction engine built on technologies like Apache Spark. It provides a demo movie recommendations application here.
  8. Oryx v2 is a large-scale architecture for machine learning and prediction (suggested by Lorand)
  9. RecDB is a PostgreSQL extension to add recommendation algorithms like collaborative filtering directly into the database.
  10. Crab is a Python recommender based on the popular packages NumPy, SciPy and matplotlib. The main repository seems to be abandoned.
  11. predictor is a Ruby recommender gem. It uses the Jaccard or Sørensen-Dice coefficient to provide both item-centric (e.g. "Users that read this book also read ...") and user-centric (e.g. "You read these 10 books, so you might also like to read ...") recommendations. Looks a bit neglected.
  12. Surprise is a Python scikit for building and analyzing (collaborative-filtering) recommender systems. Various algorithms are built in, with a focus on rating prediction.
  13. LightFM is an actively-developed Python implementation of a number of collaborative- and content-based learning-to-rank recommender algorithms. Using Cython, it easily scales up to very large datasets on multi-core machines and is used in production at a number of companies, including Lyst and Catalant.
  14. Rexy is an open-source recommendation system based on a general User-Product-Tag concept and a flexible structure designed to adapt to varying data schemas. Rexy is written in Python 3.5 in an optimized, Pythonic and comprehensive way that keeps it flexible in the face of change. It uses Aerospike, a high-speed, scalable and reliable NoSQL database, as its database engine.
  15. QMF is a fast and scalable C++ library for implicit-feedback matrix factorization models.
  16. tensorrec is a TensorFlow recommendation algorithm and framework in Python.
  17. hermes is a recommendation framework for collaborative-filtering and content-based algorithms in PySpark. Main repository has been abandoned.
  18. Spotlight uses factorization and sequence models in the back end for building a basic recommendation system. It's a well-implemented Python framework.
  19. Implicit is a fast Python collaborative filtering library for implicit datasets. It provides fast Python implementations of several popular recommendation algorithms for implicit feedback datasets.
  20. recommenderlab provides a research infrastructure to test and develop recommender algorithms including UBCF, IBCF, FunkSVD and association rule-based algorithms.
  21. CaseRecommender is a Python implementation of a number of popular recommendation algorithms. The framework aims to provide a rich set of components from which you can construct a customized recommender system from a set of algorithms.
  22. ProbQA is a C++/CUDA recommender system that uses a Bayesian approach to learn how answers to its questions map to the best recommendations of the target being searched for. It is available on GitHub with an example of learning the binary search algorithm. Its application to a video game recommendation system is available on the internet as a demo of the engine.
  23. Microsoft Recommenders contains examples, utilities and best practices for building recommendation systems. Implementations of several state-of-the-art algorithms are provided for self-study and customization in your own applications.
  24. Gorse is an offline recommender system backend based on collaborative filtering, written in Go. It implements multiple rating-based or ranking-based recommenders and multiple tools, ranging from import/export tools to a RESTful recommender server.
  25. Nvidia Merlin is an end-to-end recommender-on-GPU ecosystem composed of many tools, like NVTabular for fast preprocessing / feature engineering and HugeCTR for high-throughput training and inference for large-scale CTR prediction. Transformers4Rec is also part of the Merlin ecosystem, providing TF and PyTorch APIs for sequential and session-based recommendation leveraging contextual features and NLP architectures from the HuggingFace Transformers library.
  26. Alibaba EasyRec is a Python recommender system that implements state-of-the-art deep learning models used in common recommendation tasks: candidate generation (matching), scoring (ranking), and multi-task learning. It improves the efficiency of generating high-performance models through simple configuration and hyperparameter tuning (HPO).
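
As a minimal sketch of the item-centric approach mentioned for predictor above (this is not predictor's actual code; the function names and toy data are hypothetical), the Jaccard and Sørensen-Dice coefficients can be computed in Python over the sets of users who read each item:

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """|A & B| / |A | B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sorensen_dice(a: set, b: set) -> float:
    """2 * |A & B| / (|A| + |B|); defined here as 0.0 when both sets are empty."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def similar_items(reads, target, similarity=jaccard):
    """Rank the other items by how much their reader sets overlap with target's."""
    # Invert user -> items into item -> set of users who read it.
    readers = defaultdict(set)
    for user, items in reads.items():
        for item in items:
            readers[item].add(user)
    target_readers = readers.get(target, set())
    scores = {
        item: similarity(target_readers, users)
        for item, users in readers.items()
        if item != target
    }
    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: user -> set of books read.
reads = {
    "ann": {"dune", "emma", "ulysses"},
    "bob": {"dune", "emma"},
    "cat": {"dune", "ulysses"},
    "dan": {"emma"},
}
ranked = similar_items(reads, "dune")  # "Users that read dune also read ..."
```

Passing `similarity=sorensen_dice` swaps the coefficient. For non-empty sets, both coefficients are 0 when the reader sets are disjoint and 1 when they are identical.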

Non-SaaS Product Recommender Systems

Not many non-SaaS, non-open-source recommender systems seem to exist. Below is a list:

  1. Dato is a company that provides a Python package and servers for business machine learning, including many predictive algorithms for recommendations. They also integrate with Apache Spark and have great blog posts like Why is building custom recommender systems hard? Does it have to be?. Their customers include Pandora and StumbleUpon, so it must be a good product.

Academic Recommender Systems

Recommender systems are a very active area of research in academia, though few of the generated systems make it out of the lab. Here are a few I have found that did:

  1. LensKit is a set of Python tools for experimenting with and studying recommender systems.
  2. Duine Framework is a Java based recommendation system that has been abandoned.
  3. MyMediaLite is a C# based in-memory recommender system that has been abandoned.
  4. Bonus: List of Recommender System Dissertations, a useful list to keep up with the current state of recommendations systems in academia
  5. LibRec is a Java based recommendations engine with loads of implemented algorithms (suggested by Saúl Vargas)
  6. RankSys is a Java recommendation system for novelty and diversity (created by Saúl Vargas)
  7. LIBMF A Matrix-factorization Library for Recommender Systems
  8. proNet-core A general-purpose network embedding framework which provides several factorization-based models for recommender systems
  9. Devooght A repository containing collaborative-filtering algorithms based on sequences.
  10. GRU4Rec The original implementation of the algorithm proposed in Session-based Recommendations with Recurrent Neural Networks and its follow up in Recurrent Neural Networks with Top-k Gains for Session-based Recommendations
  11. Cornac A Python based comparative framework for multimodal recommender systems with a focus on models leveraging auxiliary data (developed by Preferred.AI).

Benchmarking Recommender Systems

It is very difficult to benchmark recommender systems, not only because getting good datasets is hard, but also because different methods and algorithms have different advantages and disadvantages that are difficult to expose.

Here is a list of some benchmarking tools:

  1. TagRec Tag Recommender Benchmarking Framework
  2. RiVaL is an open source toolkit for recommender system evaluation. Some results are posted here.
  3. Idomaar is a reference framework for recommender algorithm testing. It is developed in the framework of the CrowdRec project.
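
To make the difficulty concrete, here is a minimal Python sketch of one common offline measurement, precision@k and recall@k against held-out interactions. It is not necessarily what any of the tools listed above compute; the function name and example data are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: a ranked list from some recommender, and the items
# that were hidden from it (held out) for this user.
recommended = ["a", "b", "c", "d", "e"]
held_out = {"b", "e", "z"}

p3, r3 = precision_recall_at_k(recommended, held_out, k=3)
p5, r5 = precision_recall_at_k(recommended, held_out, k=5)
```

Even on this toy input the two metrics move in opposite directions as k grows (precision rises only slightly while recall doubles), which is one reason a single number rarely exposes an algorithm's trade-offs.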

Media Recommendation Applications

In addition to generic recommender systems, I decided to add a list of applications where recommendations are a core offering, specifically in the domain of media recommendations:

  1. Yeah, Nah Movie recommendations app based on GER
  2. Jinni Movie recommendations site
  3. Gyde Streaming media recommendations
  4. TasteKid movies, books, music recommendations. sent to me by thelinuxlich
  5. Gnoosic music based on bands. sent to me by thelinuxlich
  6. Pandora music recommendations based on likes and dislikes of songs
  7. Criticker Game and movie collaborative recommendations. suggested by ran88dom99
  8. movielens.org End-user movie and book recommendations by the LensKit people. suggested by ran88dom99
  9. MAL Recommendations based only on similar users. suggested by ran88dom99
  10. NewsPortalUserInteractions A large dataset provided by globo.com for news recommendation suggested by guedes-joaofelipe
  11. ContentWise UX management solution for digital media entertainment. suggested by GiovanniPaoloGibilisco

Books

  1. Practical Recommender Systems by Kim Falk (Manning Publications). Chapter 1
  2. Recommender Systems Handbook by Ricci, F. et al.

Best Practices

  1. Recommenders examples and best practices for building recommendations systems by Microsoft.

Common Datasets

| Name | Scene | Tasks | Information | URL |
| --- | --- | --- | --- | --- |
| Amazon Review | Commerce | Seq Rec/CF Rec | This is a large crawl of product reviews from Amazon. Ratings: 82.83 million, Users: 20.98 million, Items: 9.35 million, Timespan: May 1996 - July 2014 | link |
| Amazon-M2 | Commerce | Seq Rec/CF Rec | A large dataset of anonymized user sessions with their interacted products collected from multiple language sources at Amazon. It includes 3,606,249 train sessions, 361,659 test sessions, and 1,410,675 products. | link link-2 |
| Steam | Game | Seq Rec/CF Rec | Reviews represent a great opportunity to break down the satisfaction and dissatisfaction factors around games. Reviews: 7,793,069, Users: 2,567,538, Items: 15,474, Bundles: 615 | link |
| MovieLens | Movie | General | The dataset consists of 4 sub-datasets, which describe users' ratings to movies and free-text tagging activities from MovieLens, a movie recommendation service. | link |
| Yelp | Commerce | General | There are 6,990,280 reviews, 150,346 businesses, 200,100 pictures, 11 metropolitan areas, 908,915 tips by 1,987,897 users. Over 1.2 million business attributes like hours, parking, availability, etc. | link |
| Douban | Movie, Music, Book | Seq Rec/CF Rec | This dataset includes three domains, i.e., movie, music, and book, and different kinds of raw information, i.e., ratings, reviews, item details, user profiles, tags (labels), and date. | link |
| MIND | News | General | MIND contains about 160k English news articles and more than 15 million impression logs generated by 1 million users. Every news contains textual content including title, abstract, body, category, and entities. | link |
| U-NEED | Commerce | Conversation Rec | U-NEED consists of 7,698 fine-grained annotated pre-sales dialogues, 333,879 user behaviors, and 332,148 product knowledge tuples. | link |
| PixelRec | Short Video | Seq Rec/CF Rec | PixelRec is a large dataset of cover images collected from a short video recommender system, comprising approximately 200 million user image interactions, 30 million users, and 400,000 video cover images. The texts and other aggregated attributes of videos are also included. | link |
| KuaiSAR | Video | Search and Rec | KuaiSAR contains genuine search and recommendation behaviors of 25,877 users, 6,890,707 items, 453,667 queries, and 19,664,885 actions within a span of 19 days on the Kuaishou app | link |
| Tenrec | Video, Article | General | Tenrec is a large-scale benchmark dataset for recommendation systems. It contains around 5 million users and 140 million interactions. | link |

This link contains all the datasets regarding RecSys - link