plebbit / plebbit-js

A Javascript API to build applications using plebbit
GNU General Public License v2.0

Implement subplebbit.postUpdates #12

Open estebanabaroa opened 1 year ago

estebanabaroa commented 1 year ago

Having each comment create its own IPNS record has many issues. A better design is to store the post updates in an IPFS folder like:

<folderCid>/<post.cid>/update

it can be fetched by doing:

ipfs get <folderCid>/<post.cid>/update

and it gets a regular commentUpdate like before.

To insert files into a folder, use the IPFS files API:

const content = '{...}' // the postUpdate JSON
await ipfsClient.files.write(`/${subplebbitAddress}/postUpdates/${post.cid}/update`, content, {parents: true, truncate: true, create: true})
const {cid: folderCid} = await ipfsClient.files.stat(`/${subplebbitAddress}/postUpdates`)

// get the post update
const file = await ipfsClient.get(`/${folderCid}/${post.cid}/update`)

The problem with this design is that after 1000 or so files in the same folder, files.write becomes very slow, so the post updates must be split across several folders. Folders that receive new files frequently must not contain many files, so the split is based on time, like this:

subplebbit.postUpdates = {
  '86400': <folderCid>, // 86400s, 1 day
  '604800': <folderCid>, // 604800s, 1 week
  '2592000': <folderCid>, // 2592000s, 1 month
  '3153600000': <folderCid>, // 3153600000s, 100 years
}

To decide which folder to put the comment in, the formula for the 1 day folder is comment.timestamp >= (subplebbit.updatedAt - 86400), and the same check applies to the other time amounts. Both the sub owner and the client can decide which folder to use deterministically from this formula.
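
A minimal sketch of that folder selection, for illustration only. The function name selectPostUpdatesBucket is hypothetical, and it assumes a comment goes into the smallest time amount whose window, counted back from subplebbit.updatedAt, still contains comment.timestamp. The issue only states the check for the 1 day folder, so the "smallest first" rule is an assumption.

```javascript
// Sketch: pick the postUpdates time bucket for a comment.
// Assumption (not stated in the issue): the smallest bucket whose window,
// counted back from subplebbit.updatedAt, contains the timestamp wins.
function selectPostUpdatesBucket(postUpdates, commentTimestamp, updatedAt) {
  const buckets = Object.keys(postUpdates)
    .map(Number)
    .sort((a, b) => a - b) // smallest window first
  for (const seconds of buckets) {
    if (commentTimestamp >= updatedAt - seconds) {
      return String(seconds) // key to use in subplebbit.postUpdates
    }
  }
  return undefined // older than every bucket
}

// Example with placeholder folder CIDs
const postUpdates = {
  '86400': 'folderCidDay',
  '604800': 'folderCidWeek',
  '2592000': 'folderCidMonth',
  '3153600000': 'folderCidCentury',
}
const updatedAt = 1700000000
const dayBucket = selectPostUpdatesBucket(postUpdates, updatedAt - 3600, updatedAt)
const weekBucket = selectPostUpdatesBucket(postUpdates, updatedAt - 3 * 86400, updatedAt)
```

Because the function reads only subplebbit.postUpdates, subplebbit.updatedAt and the comment timestamp, the sub owner and a client would arrive at the same folder without any extra coordination.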

If a subplebbit has 1000+ comments per day, more folders can be added, such as one per hour or per half day. The time amounts are not fixed; a sub owner can use any time amounts they want.

So the ipfs files write command would have an extra time subfolder like:

await ipfsClient.files.write(`/${subplebbitAddress}/postUpdates/${time}/${post.cid}/update`, content, {parents: true, truncate: true, create: true})
const {cid: folderCid} = await ipfsClient.files.stat(`/${subplebbitAddress}/postUpdates/${time}`)
subplebbit.postUpdates = {
  [time]: folderCid,
  // ...other times
}

// get the post update
const file = await ipfsClient.get(`/${subplebbit.postUpdates[time]}/${post.cid}/update`)

Fetching reply updates

It is not possible to do const file = await ipfsClient.get(`/${folderCid}/${reply.cid}/update`). Reply updates should not be stored in the top-level folder of the subplebbit, because that folder would hold too many files and would not scale.

Instead, reply updates should be stored nested in their parent's folder, like:

`/${folderCid}/${post.cid}/${parentReply.cid}/${parentReply.cid}/.../${reply.cid}/update`

Fetching a reply update is slow, but it doesn't need to be fast because it's usually only done in the background for a user to get notifications to their own replies. It's rare to directly go to a reply.

The steps to fetch a reply update are:

  1. Get all parent cids recursively until post is reached by calling ipfs get <reply.parentCid>
  2. Get the post.timestamp by doing ipfs get <reply.postCid>
  3. Get the correct postUpdate folderCid, subplebbit.postUpdates[time]
  4. Get /${folderCid}/${post.cid}/${parentReply.cid}/${parentReply.cid}/.../${reply.cid}/update
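
The steps above can be sketched roughly as follows. This is not the plebbit-js implementation: fetchComment(cid) is a hypothetical helper that resolves to the comment JSON stored at a CID (for example by wrapping ipfs get), and the bucket selection assumes the smallest matching time amount is used.

```javascript
// Assumption: smallest time amount whose window contains the post timestamp.
function selectBucket(postUpdates, timestamp, updatedAt) {
  return Object.keys(postUpdates)
    .map(Number)
    .sort((a, b) => a - b)
    .map(String)
    .find((seconds) => timestamp >= updatedAt - Number(seconds))
}

// fetchComment is hypothetical: (cid) => Promise<comment JSON>
async function getReplyUpdatePath(reply, subplebbit, fetchComment) {
  // Step 1: walk parentCid links until the post is reached.
  const chain = [reply.cid]
  let parentCid = reply.parentCid
  while (parentCid !== reply.postCid) {
    const parent = await fetchComment(parentCid)
    if (!parent.parentCid) throw new Error(`broken parent chain at ${parentCid}`)
    chain.unshift(parentCid) // ancestors go in front, oldest first
    parentCid = parent.parentCid
  }
  // Step 2: the post timestamp decides the time folder.
  const post = await fetchComment(reply.postCid)
  // Step 3: look up the folder CID for that time amount.
  const time = selectBucket(subplebbit.postUpdates, post.timestamp, subplebbit.updatedAt)
  const folderCid = subplebbit.postUpdates[time]
  // Step 4: build the nested path of the reply update.
  return `/${folderCid}/${reply.postCid}/${chain.join('/')}/update`
}
```

The returned path can then be passed to ipfsClient.get, as in the post update example. The cost is one fetch per ancestor plus one for the post, which is why this lookup is slow for deeply nested replies.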

Note: replies should not have reply.replies anymore, because the resulting data would be too large for the sub owner to create. Only posts should have post.replies.

Note2: For posts with too many replies to fit into 1 page, like thousands of replies, we should probably have reply.replies in some strategic places to allow fast scrolling/loading of all replies, similar to how Reddit has a "load more replies" button. This shouldn't be implemented yet; for now only posts should have post.replies.

To get replies, you must scroll replyUpdate.lastReplyCid and reply.previousCommentCid using ipfs get.
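
A rough sketch of that scroll, under two assumptions that the issue does not spell out: reply.previousCommentCid links each reply to the previous reply under the same parent, and fetchComment(cid) is a hypothetical helper that resolves to the comment JSON at a CID (for example by wrapping ipfs get).

```javascript
// Walk the linked list of replies, newest first.
// fetchComment is hypothetical: (cid) => Promise<comment JSON>
async function listReplyCids(replyUpdate, fetchComment, limit = 100) {
  const cids = []
  let cid = replyUpdate.lastReplyCid
  while (cid && cids.length < limit) {
    cids.push(cid)
    const reply = await fetchComment(cid) // one fetch per reply
    cid = reply.previousCommentCid // undefined ends the walk
  }
  return cids
}
```

The limit parameter is only there to bound the number of sequential fetches in this sketch; it is not part of the design described in the issue.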

This is slow but acceptable because it's rare to need to fetch a reply directly. Usually it's only used to get notifications in the background.

Rinse12 commented 9 months ago

This issue is implemented, except for reply.replies as we haven't finalized its design yet. Currently, /path/update contains replies whether it's a post or reply.

estebanabaroa commented 1 month ago

Fetching replies should probably be changed for this new design so that fewer CIDs need to be reprovided:

https://github.com/plebbit/plebbit-js/issues/45#issue-2498460495

4. Fetch update (votes/replies/etc) of your own nested reply

  • load reply.postCid
  • load post update
  • based on the reply.timestamp, scroll post.replies.pageCids.newFlat or post.replies.pageCids.oldFlat until you find your own reply.
  • note: this algo is slow, but looking for updates to your own replies is done in the background and the user doesn't wait for it. With the instant replies design, the user can also join a pubsub topic to get updates in real time
  • note: another possible algo for very large posts with thousands of replies could be for reply to include reply.parentCids: string[] and reply.parentTimestamps: number[], which could be used to search for the reply in nested pages
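
A minimal sketch of the flat page scan described in the list above. Several things here are assumptions made for illustration: fetchPage(cid) is a hypothetical helper, each page is assumed to look like { comments: [...], nextCid }, and the rule for choosing between newFlat and oldFlat (compare the reply's age with half the post's age) is only one possible heuristic.

```javascript
// Assumed heuristic: recent replies are closer to the start of newFlat,
// older replies are closer to the start of oldFlat.
function chooseFlatSort(reply, post, now) {
  const replyAge = now - reply.timestamp
  const postAge = now - post.timestamp
  return replyAge <= postAge / 2 ? 'newFlat' : 'oldFlat'
}

// fetchPage is hypothetical: (cid) => Promise<{ comments, nextCid }>
async function findReplyInFlatPages(firstPageCid, replyCid, fetchPage) {
  let pageCid = firstPageCid
  while (pageCid) {
    const page = await fetchPage(pageCid)
    const found = page.comments.find((comment) => comment.cid === replyCid)
    if (found) return found // entry for our own reply, with its update data
    pageCid = page.nextCid // undefined on the last page
  }
  return undefined // reply is not in any page yet
}
```

A caller would pass post.replies.pageCids[chooseFlatSort(reply, post, now)] as firstPageCid. Since this runs in the background, fetching the pages sequentially is acceptable.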