tschellenbach / Stream-Framework

Stream Framework is a Python library which allows you to build news feeds, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:
https://getstream.io/

Duplicated Activities. #153

Open deanblacc opened 9 years ago

deanblacc commented 9 years ago

Hello, I'm finding that after marking activities as read and seen using the mark_all method, a new activity with all the same attributes is added to the aggregated feed. The only difference is that the new one has seen_at and read_at set to None.

As a result, it looks like the user always has a new notification. Any idea under what circumstances this might occur?

Cheers.
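For context, here is a rough sketch of the scenario being reported. The import path, verb, and constructor arguments below are assumptions based on Stream-Framework's documented API of that era, not details taken from this report:

# Rough sketch of the reported scenario; import paths and the verb are assumptions.
from datetime import datetime

from stream_framework.activity import Activity
from stream_framework.verbs.base import Love as LoveVerb
from stream_framework.feeds.aggregated_feed.notification_feed import RedisNotificationFeed

# the notification feed for user 13
feed = RedisNotificationFeed(13)

# user 10 loves object 500
feed.add(Activity(10, LoveVerb, 500, time=datetime.utcnow()))

# mark everything in the notification feed as seen and read
feed.mark_all(seen=True, read=True)

# reported symptom: the feed now also contains a copy of the aggregated
# activity whose seen_at and read_at are None, so the user appears to have
# an unread notification forever
print(feed[:10])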

Atorich commented 8 years ago

+1

P.S. Using redis storage, RedisNotificationFeed

Atorich commented 8 years ago

I did some digging in the code and it seems that https://github.com/tschellenbach/Stream-Framework/blob/master/stream_framework/feeds/aggregated_feed/base.py#L227

this code doesn't work the right way: none of the activities in the to_remove list can be found in the storage, so nothing is actually removed.
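To illustrate the failure mode being described, here is a simplified stand-in (not the actual Stream-Framework aggregated feed code): if the storage identifies activities by their serialized form and serialization is not stable, the "remove old copy, add updated copy" step misses the old copy and both copies end up in the feed.

# Simplified stand-in (NOT the actual Stream-Framework code) for the failure
# mode described above: the storage identifies activities by their serialized
# form, so an unstable serializer makes the "remove" step miss the stored copy.
import copy
import random


def unstable_serialize(activity):
    # Stand-in for a serializer whose output depends on dict iteration order;
    # the instability is simulated here by shuffling the key order.
    items = list(activity.items())
    random.shuffle(items)
    return ";".join("%s=%s" % (k, v) for k, v in items)


class FakeTimelineStorage(object):
    # Keeps serialized activities, roughly like a Redis sorted set would.
    def __init__(self):
        self.serialized_activities = set()

    def add(self, activity):
        self.serialized_activities.add(unstable_serialize(activity))

    def remove(self, activity):
        # Removal only matches if re-serialization reproduces the stored string.
        self.serialized_activities.discard(unstable_serialize(activity))


storage = FakeTimelineStorage()

activity = {"verb": "love", "actor": 1, "seen_at": None, "read_at": None}
storage.add(activity)

# mark_all-style update: remove the old copy, add the updated one
updated = copy.deepcopy(activity)
updated["seen_at"] = updated["read_at"] = "2015-11-20 16:27"
storage.remove(activity)  # usually misses: the re-serialized string differs
storage.add(updated)

# With a stable serializer this is always 1; with an unstable one it is
# usually 2, i.e. two copies of what should be a single activity.
print(len(storage.serialized_activities))

This is only an analogy for the missed-removal effect; the real removal logic lives around the linked line in aggregated_feed/base.py.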

Atorich commented 8 years ago

I've found that this problem is caused by an activity serialization issue: if the same Activity is serialized, deserialized, and serialized again, serialized(1) will not equal serialized(2).

This simple test demonstrates it:

# -*- coding: utf-8 -*-
import itertools
from app.event_feed.manager import get_notification_manager

manager = get_notification_manager(user_id=1)

# there are some existing activities in the feed
activities = manager.all()

storage = manager.feed.timeline_storage
serialize = storage.serializer.dumps
deserialize = storage.serializer.loads

activity = activities[0]
serialized = []

for i in xrange(0, 3):
    v = serialize(activity)
    activity = deserialize(v)
    serialized.append(v)

combinations = itertools.combinations(serialized, 2)

for v1, v2 in combinations:
    assert v1 == v2, "Not equals: \n%s\n%s" % ([v1], [v2])  # using lists to avoid line breaks in serialized data

Result:

AssertionError: Not equals: 
["v32-14480368794430000000150002;;1448036879.443569;;1448036879.443569;;1448040340.245328;;-1;;0,2,150,1,1448036879.443569,(dp0\nS'post'\np1\n(dp2\nVstatus\np3\nI3\nsVupdated_date\np4\nV20.11.2015\np5\nsVdeleted_date\np6\nNsVpublished_date\np7\nV20.11.2015\np8\nsVpublished_time\np9\nV16:27\np10\nsVweight\np11\nF0.0848528137\nsVauthor\np12\n(dp13\nVusername\np14\nVadmin\np15\nsVid\np16\nI1\nsVemail\np17\nVadmin@admin.ru\np18\nssVdeleted\np19\nNsVtext\np20\nV<p>asfasdfasdf</p>\np21\nsVtitle\np22\nVasdfasdfdasf\np23\nsVupdated_time\np24\nV16:27\np25\nsVrate\np26\nF0.0\nsVdeleted_time\np27\nNsVcreated_time\np28\nV16:15\np29\nsVauthor_id\np30\nI1\nsVid\np31\nI604\nsVtags\np32\n(lp33\nVAsfdasf\np34\nasVcreated_date\np35\nV12.11.2015\np36\nssS'earnings'\np37\nF7.0\nsS'post_title'\np38\ng23\ns.;;0"]
["v32-14480368794430000000150002;;1448036879.443569;;1448036879.443569;;1448040340.245328;;-1;;0,2,150,1,1448036879.443569,(dp0\nS'post'\np1\n(dp2\nVstatus\np3\nI3\nsVupdated_date\np4\nV20.11.2015\np5\nsVpublished_time\np6\nV16:27\np7\nsVweight\np8\nF0.0848528137\nsVauthor\np9\n(dp10\nVusername\np11\nVadmin\np12\nsVid\np13\nI1\nsVemail\np14\nVadmin@admin.ru\np15\nssVdeleted\np16\nNsVtext\np17\nV<p>asfasdfasdf</p>\np18\nsVid\np19\nI604\nsVtitle\np20\nVasdfasdfdasf\np21\nsVupdated_time\np22\nV16:27\np23\nsVrate\np24\nF0.0\nsVdeleted_time\np25\nNsVpublished_date\np26\nV20.11.2015\np27\nsVcreated_time\np28\nV16:15\np29\nsVdeleted_date\np30\nNsVauthor_id\np31\nI1\nsVtags\np32\n(lp33\nVAsfdasf\np34\nasVcreated_date\np35\nV12.11.2015\np36\nssS'earnings'\np37\nF7.0\nsS'post_title'\np38\ng21\ns.;;0"]

As you can see above, the difference is in the order of the serialized fields. So the problem is in pickling extra_context (dict field ordering). Example: http://stackoverflow.com/questions/23069908/pickle-order-mystery
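For reference, a minimal self-contained demonstration of that root cause, run on Python 3.7+ where plain dicts preserve insertion order (on Python 2, the order instead depends on the dict's internal hash-table history, which is what the pickle output above reflects): two dicts that compare equal as values can still pickle to different byte strings, so a pickled payload is not a stable identity for an activity.

# Equal dicts, different pickles (Python 3.7+, insertion-ordered dicts).
import pickle

d1 = {'post_title': 'asdfasdfdasf', 'earnings': 7.0}
d2 = {'earnings': 7.0, 'post_title': 'asdfasdfdasf'}

print(d1 == d2)                              # True: equal as values
print(pickle.dumps(d1) == pickle.dumps(d2))  # False: key order differs in the byte stream

Any logic that compares activities by their serialized byte strings can therefore miss matches; comparing deserialized activities, or canonicalizing extra_context before pickling (for example by sorting its keys), would avoid this.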

Atorich commented 8 years ago

Oh, it references #53