ueberdosis / hocuspocus

The CRDT Yjs WebSocket backend for conflict-free real-time collaboration in your app.
https://tiptap.dev/docs/hocuspocus/introduction
MIT License
1.21k stars 117 forks source link

Add storage queues #749

Open georeith opened 10 months ago

georeith commented 10 months ago

We store the document in multiple ways, we store the CRDT data in one location and a JSON snapshot in another.

We have different throttle requirements for each as one of our stores is rate-limited and the other isn't. Also we don't want one type of storage failing to skip the other.

So to combat this I've added the concept of separate storage queues. Nothing changes by default, but you can optionally assign a storage queue to an extension:

const myExtension = {
    storageQueue: 'myStorageQueue',
    async onStoreDocument() {},
    async afterStoreDocument() {},
}

And configure that storage queue to debounce separately at the top level:

new HocusPocus({
    extensions,
    debounce: 500,
    async onStoreDocument() {},
    storageQueues: {
        myStorageQueue: {
            debounce: 0
        }
    },
});

Now the onStoreDocument at the top level and in your extension run in two separate hook chains at their own throttled rate.

Multiple extensions can share the same storage queue by using the same string identifier. These will be chained off each other like usual.

georeith commented 9 months ago

@janthurau would love to get your thoughts on this.

janthurau commented 9 months ago

Thanks a lot for your PR! I've been thinking about this for some time, as I currently don't really see this fit in the Hocuspocus core, but as an extension. Having multiple different storage services where you need to debounce them separately sounds like a rather unique use case.

I haven't checked it now, but do you think this is possible to implement using an extension? You can probably either set debounce to 0 and then implement your own logic in onStoreDocument, or don't implement the hook at all and use a custom onChange hook? 🤔

georeith commented 9 months ago

@janthurau its of course possible to implement as an extension but it requires opting out of a large portion of the core and reimplementing a lot of the same concepts manually and loses the separation of onStoreDocument and afterStoreDocument, what makes HocusPocus so nice is that you can separate concerns in extensions and not tie things together. Reimplementing that on top would be clunky as it's duplicating the hook registry and chaining code effectively.

I imagine I'm not the only person who stores both CRDT and JSON snapshots of their data.

It is not just about the different debounce rates.

Currently if an earlier onStoreDocument fails then subsequent hooks are aborted which is not always ideal. It also means that if you do store both you can do so in parallel not in series, which is useful if your storage is slow.

It's a completely opt-in API, nothing changes if you don't use it but if you do want to store to multiple locations this gives you much greater control.

I understand where you're coming from but I think there's a lot of value in being able to support multiple storage locations better within the elegant HocusPocus extension API.

I would be open to altering the API to better fit your style, I just couldn't think of a neater way of describing it at the time. I thought about adding the per debounce settings to the extension level but "storage queues" have a one to many extension relationship.