ssbc / ssb-replication-scheduler

Plugin to trigger replication of feeds identified as friendly in the social graph
8 stars 1 forks source link

Debounce friend stream #3

Closed arj03 closed 2 years ago

arj03 commented 3 years ago

When adding a lot of messages for a feed, right now you might get a follow and a unfollow or block right after. Would be good if these were debounced a bit so that ebt doesn't start requesting these messages.

staltz commented 3 years ago

I thought about using pull-debounce or pull-throttle, but this issue is a bit more complicated and may require a new pull-stream operator. We can't debounce nor throttle because we can't skip any stream emissions of graphStream or hopStream, we need to keep them all. What we need to do is buffer these emissions and accumulate them (in pull-scan-ish style) as a reduced state of the graph, and then throttle those reduced states. The buffer should be cleared when the throttled result is sent out.

arj03 commented 3 years ago

In a way it's more like a throttle + reduce kind of thing that is very specific to how friends work. I'll try adding a delay function to the priority queue, so not that much of a biggie for now I think. For me at least :)

staltz commented 2 years ago

Oh, look at this old issue, it's basically the same as https://github.com/ssbc/ssb-friends-purge/issues/1 :sweat_smile:

arj03 commented 2 years ago

:D

staltz commented 2 years ago

More weird reports for this issue:

I noticed that my Manyverse was busy replicating or indexing or something, and then I went to check my Storage screen, and it shows deleted records at a total of 91.9 MB. I scrolled through the accounts, and there are several blocked accounts taking a lot of space from my device (in the form of zeroed-out records, i.e. uncompacted). However, I'm sure that I've been at 91.9MB before AND that I have compacted many times.

What's different this time is that I think nothing changed in ssb-friends, the social graph remained the same. So I think it's not a problem with my social graph temporarily showing that I followed an account long ago which today I block.

Maybe it is after all some ssb-ebt bug, but this could only mean that other peers are sending me feeds and ignoring that I don't want those feeds.

QUESTION: if I block Bob privately (i.e. box1-encrypted private contact message with only me as recps), do other EBT peers ignore that and just keep sending me Bob's feed? (Assume that I always first publicly unfollow Bob before privately blocking him) Because I certainly have done a number of private blocks on my device.

arj03 commented 2 years ago

That is one of the things I tested (that after you reconnect, the other end does not know your blocks so will try and tell you about content for feeds that you are blocking).

First:

ok 1 alice:@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519
ok 2 bob:@3XVgQP3VFF4r+YMtJk3QgOSz5zAWvfZXS0zYfqppf14=.ed25519
ok 3 dolores:@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519

Later on bob posts a new message while alice is disconnected:

EBT:recv json (@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519) {
  "previous": "%IRCkWs82SdMSJKGL1nm7+/CKllSt0bZaO3bj07ERum8=.sha256",
  "sequence": 3,
  "author": "@3XVgQP3VFF4r+YMtJk3QgOSz5zAWvfZXS0zYfqppf14=.ed25519",
  "timestamp": 1661280524989,
  "hash": "sha256",
  "content": {
    "type": "post",
    "text": "hello B2"
  },
  "signature": "Lc2gR+5e3P3+ukZQIzlVyazXrQlF3LZ4Bw3Z1r7MFLzD0oYG7NnNzR/HGRF1j6dUa20lrQM8cb6YyNkBJBcVAQ==.sig.ed25519"
}
ok 9 reset indexes
EBT:conn @OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519
EBT:conn @N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519
....
EBT:send notes (@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519) {
  '@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519': { seq: 7, sequence: 3, rx: false },
  '@3XVgQP3VFF4r+YMtJk3QgOSz5zAWvfZXS0zYfqppf14=.ed25519': { seq: 7, sequence: 3, rx: false },
  '@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519': { seq: 7, sequence: 3, rx: false }
}
EBT:recv json (@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519) {
  "@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519": 7,
  "@3XVgQP3VFF4r+YMtJk3QgOSz5zAWvfZXS0zYfqppf14=.ed25519": 7,
  "@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519": 7
}
EBT:send notes (@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519) {
  '@N5FxwG9h3idxDU20DCq4FlUir4iH7/jU9aD9mpmKWeU=.ed25519': { seq: 7, sequence: 3, rx: false },
  '@3XVgQP3VFF4r+YMtJk3QgOSz5zAWvfZXS0zYfqppf14=.ed25519': { seq: -1, sequence: -1, rx: false },
  '@OAiOTCroL1xFxoCKYaZJDTxhLOHaI1cURm/HSPvEy7s=.ed25519': { seq: 8, sequence: 4, rx: true }
}

You can see that when dolores connects with alice (after block, delete and compact) she tells alice that there is new content for bob. But alice responds with a -1, so dolores does not send the new hello message (or any earlier messages from bob).

I updated the tests so that the block is now private. The behavior is the same.

If you enable EBT logging, then we should be able to see: if the problem is in EBT and if so, what happens. Who is sending you the blocked content.

staltz commented 2 years ago

I changed something in ssb-friends-purge and I can't reproduce this bug anymore. Basically, the same kind of pause-resume with getIndexingActive() we do in ssb-replication-scheduler, I put that also in ssb-friends-purge, just to make sure that during reindexing we don't delete feeds that we actually would like to replicate (this would be the case: I used to be enemies with Bob but then we made peace).

However, I think there's another bug somewhere, probably in ssb-friends:

The social graph doesn't seem to be calculated correctly. I have feedlessPub's data and can browse their account like normal on Manyverse, even though I have published a private contact msg blocking them. If I look into ssb-friends, it tells me that feedlessPub is at hops 2 from me, when it should be -1. And, if I call friends.isBlocking, it answers false.

So it might be that this whole issue was a combination of many issues.

staltz commented 2 years ago

The social graph doesn't seem to be calculated correctly.

It's also possible that forcing reindex via ssb.db.reset(cb) is not the same as closing the app, deleting the indexes folder manually, and restarting the app. That could explain why some records might not be processed (processRecord), leading to an incorrect social graph. So that hypothesis means a bug in -db2, not in -friends.

staltz commented 2 years ago

Yeah, so I'm running an experiment here and something smells stinky:

  1. Make sure that feedlessPub is blocked (returns true for friends.isBlocking)
  2. Thus it gets deleted by friendsPurge
  3. Compact, and reindex everything
  4. ssb-friends doesn't process my record referring to feedlessPub, and the graph calculation ends up being hops 2
  5. On the next app restart, feedlessPub is still at hops 2, and will get replicated by ssb-ebt
  6. Force reindex
  7. Go back to step (4)

However, if I rm -rf ~/.config/manyverse/ssb/db2/indexes/contacts, then on the next app restart, feedlessPub will be at hops -1 and will get deleted by friendsPurge. Go back to step (2).

staltz commented 2 years ago

The smell is getting stronger, I think it's really here:

https://github.com/ssbc/layered-graph/blob/e29ff59b8a3f6e3aaabb2ebb86510f47f7958f5b/index.js#L40-L50

We call updatePublicLayer({}) and updatePrivateLayer({}) when we reset ssb-friends before reindexing everything, and it seems that update({}) doesn't actually clean the graph entirely. We probably need a way to fully reset layered-graph in memory.

arj03 commented 2 years ago

Yeah, that could very well be the case, this is probably not something properly tested earlier.

staltz commented 2 years ago

Okay, I'm still puzzled, I hacked layered-graph in Manyverse such that it has a reset() and I made sure that the layered-graph data structure is actually empty before the reindex of contacts starts, and for some mysterious reason my record that blocks feedlessPub was still not processed, and it ends up with hops 2.

In the processRecord, I printed out the seq and it's clear that it's indexing from the beginning of the log, all the way up to recent seqs, so it's not a problem of starting the reindex at the wrong point.

staltz commented 2 years ago

I must also mention that it's fairly common to see 5 or 6 of these errors while post-compaction reindexing is happening:

 code: 'EPARTIALREAD',
  path: '~/.config/manyverse/ssb/db2/log.bipf',
  offset: 994705408,
  size: 65536
staltz commented 2 years ago

I had another hypothesis as well: maybe AAOL's cache still contains old blockBufs after compaction ends. So I hacked AAOL to force-clean all caches and things like that, as soon as compaction is done.

This still didn't fix the issue and I kept getting feedlessPub at hops 2.

I think I'll pause debugging this :dragon: for today and see if sleeping over it helps me investigate better tomorrow.

staltz commented 2 years ago

More mystery:

  1. I closed the app
  2. Deleted rm -rf ~/.config/manyverse/ssb/db2/indexes/contacts
  3. Opened the app
  4. Indexing of contacts happened
  5. My contact msg for feedlessPub was NOT processed and I got hops 2

I repeated this process again and got the same result. Then I repeated it a third time and then I got feedlessPub at hops -1. What is going on? :mag:

staltz commented 2 years ago

We had a video call and after some debugging offline, I think I found the reason for this. Copied from our Signal chat:

in privateIndex, the streaming boolean is true whenever we use log.stream, EVEN IF we're doing a LIVE log.stream. So in the case of live streams, after you compact, if you get a new msg via EBT, that's going to write the msg to the log and all live streams are going to be aware of that msg, and that record gets passed to privateIndex and the offset for that record is really large, which bumps up the latestOffset.value from -1 to super high

as soon as I tweaked this part, I got hops -1 calculated during post-compaction reindex

and about EPARTIALREAD, I think I also figured that one out. after compaction, if there is a new msg written to the log, live streams will be "resumed", but their this.cursor has not been reset, so they're going to use that cursor on the log

staltz commented 2 years ago

SOLVED (see recently linked PRs)