qlands / elasticfeeds

A Python library for managing feeds using ElasticSearch


ElasticFeeds

A Python library to manage notification and activity feeds using Elasticsearch as the back end.

Description

A few years ago I started working on FormShare, a platform built with Python and Pyramid that has social media features, and I had to get my hands dirty with activity feeds. After searching the Internet for possible Python frameworks, I realized that the well-maintained ones, like Django Activity Stream or Stream Framework, were heavily oriented toward Django (which I hate). Furthermore, both frameworks use asynchronous tasks to perform “fan-out on write” operations, which I think is overkill: consider a user like @katyperry with 107,805,373 followers, where every new post would have to be written to each follower's feed.

Later, I came across a post on Stack Overflow, "Creating a SOLR index for activity stream or news feed", which linked to a presentation titled "A news feed with ElasticSearch". The authors explain how to use Elasticsearch to implement “fan-out on read” by “Storing atomic news and compose a news feed at the query time”.
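To make the idea concrete, here is a minimal in-memory sketch of fan-out on read. It is not the ElasticFeeds API and it does not touch Elasticsearch: the `Activity` class, the `compose_feed` function and the sample data are all hypothetical names chosen for illustration. Each activity is stored once, and a user's feed is composed only when it is requested, by filtering on the actors that user follows.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    """One atomic feed entry, stored exactly once (hypothetical model)."""
    actor: str
    verb: str
    obj: str
    published: datetime


def compose_feed(store, following, limit=10):
    """Fan-out on read: build the feed at query time from the shared store."""
    followed = set(following)
    # Keep only activities performed by actors the reader follows.
    matches = [a for a in store if a.actor in followed]
    # Newest first, then cut to the requested page size.
    matches.sort(key=lambda a: a.published, reverse=True)
    return matches[:limit]


# Sample data: three activities written once, regardless of follower counts.
store = [
    Activity("alice", "post", "photo:1", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    Activity("bob", "like", "photo:1", datetime(2024, 1, 2, tzinfo=timezone.utc)),
    Activity("carol", "post", "note:7", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]

feed = compose_feed(store, following=["alice", "carol"])
```

Writing an activity costs the same whether the actor has ten followers or a hundred million; the work moves to the read side, which is the part a search engine is good at.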

After some trial and error, I managed to store feeds in Elasticsearch and perform fan-out on read. Elasticsearch is incredibly fast, even with aggregation operations. The presentation mentions about 40 milliseconds with 140 million feeds on 3 nodes. Elasticsearch also scales, which helps if you want to start small (e.g., 1 node) and progressively add more on demand.

Handling feeds in Elasticsearch and writing aggregation queries is something that could discourage some Python programmers, and that’s the reason for ElasticFeeds. ElasticFeeds encapsulates all these complexities, allowing you to handle activity feeds with a few lines of code while delegating all aggregation operations to Elasticsearch. The user only gets simple arrays of feeds as Python dictionaries.
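For a sense of what such a query looks like when written by hand, the sketch below builds a plain Elasticsearch request body as a Python dictionary. It is an illustration only, not the query ElasticFeeds generates: the function `build_feed_query` and the field names `actor_id`, `verb` and `published` are assumptions, and `actor_id` and `verb` are assumed to be mapped as keyword fields. It filters on the followed actors, groups the matches by verb, and keeps the newest few activities in each group.

```python
import json


def build_feed_query(followed_ids, per_group=3):
    """Build a request body for a grouped, fan-out-on-read feed (hypothetical)."""
    return {
        # Return no raw hits; only the aggregation buckets are of interest.
        "size": 0,
        "query": {
            "bool": {
                # Filter context: match activities by any followed actor.
                "filter": [{"terms": {"actor_id": list(followed_ids)}}]
            }
        },
        "aggs": {
            "by_verb": {
                # One bucket per verb (assumes a keyword mapping).
                "terms": {"field": "verb"},
                "aggs": {
                    "latest": {
                        # Newest activities inside each bucket.
                        "top_hits": {
                            "size": per_group,
                            "sort": [{"published": {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = build_feed_query(["alice", "carol"])
payload = json.dumps(body)  # JSON text that would be sent to a search endpoint
```

Even this small example nests four levels deep, which is the kind of detail a wrapper library can keep out of application code.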

Requirements

Usage

Collaborate

The way you manage feeds will depend on the kind of social platform you are implementing. While ElasticFeeds can store any kind of feed and has some aggregator classes, the way you aggregate feeds will depend on how you want to present them to the end user.

Besides reporting issues, the best way to collaborate with ElasticFeeds is by sharing aggregator classes with others. So if you have an aggregator, fork the project, create a pull request, and I will be happy to add it to the code base :-)