ssbc / ssb-ebt

secure scuttlebutt replication with epidemic-broadcast-trees
MIT License

Difficulties replicating with peers that are friends after doing a feed recovery #38

Closed staltz closed 3 years ago

staltz commented 3 years ago

This issue in manyverse explains it properly: https://gitlab.com/staltz/manyverse/-/issues/1333#note_569151491

Quoting the relevant part:

I suspect this could be due to an update to ssb-ebt. In older versions, we used a hacked fork of ssb-ebt: https://github.com/staltz/ssb-ebt-fork-staltz/commit/7537cb00851e552c765f07621e3774470bccc621 but recently we started using ssb-ebt proper, because it had some updates and fixes.

Potentially related issue: #33

arj03 commented 3 years ago

Strange. I have not seen EBT causing trouble like this in browser core. Browser core has been using EBT 8 for a while, but it drives EBT a bit differently than ssb-ebt does, because it needs EBT to kick in only after partial replication has run for a feed. It would be interesting to see if you can reproduce this, to find out whether it's the change in rpc:connect that is causing it. Then we can have a temporary solution that works until we rework all these modules and can do proper testing (codename: move outside range of the bermuda triangle). Alternatively revert back to your fork for now.

staltz commented 3 years ago

Alternatively revert back to your fork for now.

This would be simple, and I don't see many downsides to it, since we're going to rewrite ssb-replicate, ssb-ebt, and ssb-friends from the ground up soon anyway.

staltz commented 3 years ago

I want to finish this issue once and for all, so I did some extensive testing. First I tried to reproduce this issue #38 and issue #33 using ssb-ebt Tape tests. Couldn't reproduce.

Then I tried to reproduce it between Patchwork and Manyverse in the same LAN, using the typical reproduction steps, i.e.:

If Bob is using Patchwork 3.16.2, then Alice does not get her content back. That version had EBT disabled. If Bob uses Patchwork 3.17.1 or higher, then Alice successfully gets her content. That version had EBT enabled.

I'm not sure what happens when Bob is not following Alice (but Alice follows Bob), because under ssb-conn scheduler logic, Bob would not initiate a connection with Alice. And neither would Alice (because she doesn't know she follows Bob). But I'm okay with that corner case, because it can be considered either a bug in ssb-conn scheduler or a user mistake (if you want your data back, make sure to trigger the connections). Anyway I don't think it's an issue with ssb-ebt.
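To make the corner case concrete, here is a toy model of it. This is only a sketch and not the ssb-conn API: the `Peer` type and the `wouldInitiate` and `canReconnect` functions are hypothetical names. It assumes a peer dials only the peers it knows it follows, and that a freshly recovered peer starts with an empty local follow graph.

```typescript
// Hypothetical model, not the ssb-conn API: each peer only knows the
// follows recorded in its own local database.
type Peer = { name: string; knownFollows: Set<string> };

// Assumption: a peer dials only peers that it knows it follows.
function wouldInitiate(from: Peer, to: Peer): boolean {
  return from.knownFollows.has(to.name);
}

// A connection happens if at least one side is willing to dial the other.
function canReconnect(a: Peer, b: Peer): boolean {
  return wouldInitiate(a, b) || wouldInitiate(b, a);
}

// Alice has just recovered her feed, so her local follow graph is empty,
// even though her lost feed contained a follow of Bob.
const alice: Peer = { name: "alice", knownFollows: new Set<string>() };

// Case 1: Bob follows Alice, so Bob dials her.
const bobFollowing: Peer = {
  name: "bob",
  knownFollows: new Set<string>(["alice"]),
};

// Case 2: Bob does not follow Alice, so neither side dials.
const bobNotFollowing: Peer = { name: "bob", knownFollows: new Set<string>() };

console.log(canReconnect(alice, bobFollowing)); // true
console.log(canReconnect(alice, bobNotFollowing)); // false
```

In this model the second case never connects unless one of the two users triggers the connection manually, which is the user-side workaround mentioned above.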

Sorry for the thorough message, but I'd like to give a definitive end to this issue. I want to stop using my hacky fork ssb-ebt-fork-staltz in Manyverse, and if people still encounter this issue, we can check:

And I don't believe there would be any other action necessary to take other than the 3 options mentioned above.

arj03 commented 3 years ago

Great debugging :)