waku-org / pm

Project management, admin, misc

[Status App] Enabling Waku Store over a multitude of unreliable nodes #21

Open fryorcraken opened 1 year ago

fryorcraken commented 1 year ago

User Story

As a Status Community owner, I want to rely on community members to share their available resources to support a community. I do not want to pay for infrastructure or rely on 3rd parties such as Status.im.

This includes the infrastructure that lets users retrieve messages sent while they were offline.

Problem

The current applicable logic to retrieve offline messages is as follows:

  1. The client (Status Desktop, a.k.a. Alice) uses DNS Discovery to connect to a Status fleet (bootstrap) node.
  2. Alice runs a number of Waku Store queries to retrieve missed messages from the bootstrap node. All queries go to the same node, and some must be sequential (page by page, using a cursor).
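
The sequential, page-by-page part of step 2 can be sketched as follows. This is only an illustration of cursor-based pagination: `FakeStoreNode`, its `query` method, and the integer cursor are hypothetical stand-ins, not the Waku Store API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeStoreNode:
    """Hypothetical in-memory stand-in for a store node (not a real Waku API)."""
    messages: List[str]

    def query(self, cursor: Optional[int], page_size: int) -> Tuple[List[str], Optional[int]]:
        # Here the cursor is simply the index of the next message to return.
        start = cursor or 0
        page = self.messages[start:start + page_size]
        has_more = start + page_size < len(self.messages)
        return page, (start + page_size if has_more else None)


def fetch_all(node: FakeStoreNode, page_size: int = 2) -> List[str]:
    """Retrieve every page in order; each request needs the previous cursor."""
    received: List[str] = []
    cursor: Optional[int] = None
    while True:
        page, cursor = node.query(cursor, page_size)
        received.extend(page)
        if cursor is None:  # the node signalled that this was the last page
            return received
```

Because each request depends on the cursor returned by the previous one, these pages cannot be fetched in parallel from the same node.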

If a client cannot rely on a Status fleet node for Waku Store, then the flow would need to be:

  1. Alice finds node(s) (a.k.a. Bob) running Waku Store for her community (i.e., shard) using discv5.
  2. Alice runs Waku Store queries against Bob.
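
As a rough sketch of step 1, Alice could filter the nodes returned by discovery down to those that both run Store and serve her community's shard. The `PeerRecord` type and its fields are assumptions made for illustration; they do not reflect how discv5 records are actually encoded, and the sketch says nothing about whether a candidate is currently online.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PeerRecord:
    """Hypothetical summary of a discovered node (not an actual ENR layout)."""
    peer_id: str
    supports_store: bool
    shards: FrozenSet[int]


def store_candidates(discovered: List[PeerRecord], shard: int) -> List[PeerRecord]:
    """Keep only peers that advertise Store and serve the given shard."""
    return [p for p in discovered if p.supports_store and shard in p.shards]
```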

A number of challenges arise:

  1. How does Alice find Bob? Alice needs to find one or several Bobs that:

    1.a. are online;

    1.b. store the messages she is interested in. If 1 community = 1 shard, Alice needs to know that Bob stores messages for this shard. If communities share shards, Bob may need a way to express which content topics he serves/stores.

  2. How does Alice know that Bob has a complete history? This is assumed in the current model because nodes run in the cloud and are monitored by Status ops. That assumption no longer holds when the node runs on a laptop that may itself be offline.

  3. How does Bob ensure that he completes his own history, or even become aware of the messages he is lacking?

  4. How does Alice use her available bandwidth efficiently, given that most home connections have more download than upload bandwidth? Note that protocols such as BitTorrent are designed to solve this problem.

  5. How does Bob avoid compromising his home Internet connection by continuously serving Store queries?

  6. How does Bob select which messages to save for the purpose of serving Store queries?
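
One possible direction for challenge 4, sketched under assumptions: Alice splits the time window she is missing into slices and spreads them over several Bobs, so that each Bob uploads only a fraction of the history while Alice downloads from all of them. The function names, the slice size, and the round-robin assignment are hypothetical choices for illustration, not an existing Waku mechanism, and the sketch assumes every chosen Bob holds the slices assigned to him (which is exactly what challenge 2 puts in doubt).

```python
from typing import Dict, List, Tuple

TimeRange = Tuple[int, int]  # (start, end) in seconds, end exclusive


def split_range(start: int, end: int, slice_seconds: int) -> List[TimeRange]:
    """Cut [start, end) into consecutive slices of at most slice_seconds."""
    if slice_seconds <= 0:
        raise ValueError("slice_seconds must be positive")
    slices: List[TimeRange] = []
    cursor = start
    while cursor < end:
        slices.append((cursor, min(cursor + slice_seconds, end)))
        cursor += slice_seconds
    return slices


def assign_round_robin(slices: List[TimeRange], peers: List[str]) -> Dict[str, List[TimeRange]]:
    """Spread slices over peers so each one serves only part of the window."""
    if not peers:
        raise ValueError("at least one peer is required")
    plan: Dict[str, List[TimeRange]] = {peer: [] for peer in peers}
    for index, time_range in enumerate(slices):
        plan[peers[index % len(peers)]].append(time_range)
    return plan
```

Within a slice, queries to a single Bob may still need to be sequential because of cursor-based pagination; the parallelism here is across Bobs.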

Notes/Risks

(Waku Scope) Data Availability Guarantees

Data Availability Guarantees do not fall within the domain of Waku (ephemeral messaging) but within Codex's.

If Data Availability Guarantees are needed, then (an altruistic version of) Codex is likely to be the right solution as a datastore.

(Store Scope) Syncing mechanism != message dissemination

We need to ensure that a sync protocol does not become the default way to disseminate messages in the network, as it would then behave very similarly to a flood-routed network and severely impact our scalability assumptions and modelling.

Application Level Constructs/Efficient use of Store

Not all messages are likely to need syncing. Hence, the application should use application-level information to decide which messages to store and sync.

For example, a number of control messages (such as online presence or community description) may not need to be synced, or may only need to be synced over a different time frame (e.g., only sync 24 hours of messages).
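
A minimal sketch of such an application-level rule, assuming the application can label each message with a kind. The kinds and the retention values below are illustrative assumptions, not Status or Waku settings.

```python
from typing import Dict, Optional

DAY = 24 * 60 * 60  # seconds

# Hypothetical retention policy in seconds; None means "never store or sync".
RETENTION: Dict[str, Optional[int]] = {
    "chat": 30 * DAY,
    "community_description": 1 * DAY,
    "online_presence": None,
}


def should_sync(kind: str, age_seconds: int) -> bool:
    """Return True if a message of this kind and age is worth syncing."""
    max_age = RETENTION.get(kind)  # unknown kinds are treated as "do not sync"
    if max_age is None:
        return False
    return age_seconds <= max_age
```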

Assumptions (to be confirmed)

History Availability

Status Desktop was recently changed to download all 30 days of history for all the users' communities at startup.

Can we rely on the assumption that any community member aims to store 30 days of messages? Or will this be configurable per user, meaning that some users may store less history?

This can help with: (4) If all nodes strive to have a complete history, it simplifies the logic. (5) This would then be covered because Bob always aims to have a complete history. However, further logic may be needed.

jm-clius commented 1 year ago

Thanks for this!

Some related research issues: https://github.com/vacp2p/research/issues/64 https://github.com/vacp2p/research/issues/81 https://github.com/vacp2p/rfc/issues/420

Can we rely on the assumption that any community member aims to store 30 days of messages?

As mentioned elsewhere, the 30 days of history as seen and required by the app (and stored in the app-managed local DB) may look quite different from 30 days of network history as seen by Waku Store (which operates agnostically on still-encrypted, blob-like Waku Messages). There may be ways of consolidating these into a single, unified Store, but this would require closer inspection.

fryorcraken commented 1 year ago

https://github.com/waku-org/research/issues/1#issuecomment-1566695337