Closed msfstef closed 1 month ago
Name | Link |
---|---|
Latest commit | a2b1fe61aff905b8cd0053d81e53c31a5674c3ef |
Latest deploy log | https://app.netlify.com/sites/electric-next/deploys/66e97387eb94b7000863cd91 |
Deploy Preview | https://deploy-preview-1731--electric-next.netlify.app |
@KyleAMathews the idea is that you can moderate the backpressure any way you want. For example, the developer can maintain a processing queue and provide a processing callback that just adds the messages to that queue. The stream will then continue to load from the network without actually waiting for the messages to be processed, since the callback merely schedules them for processing in another system.
Alternatively, the developer might provide a processing callback that schedules the messages for processing and monitors memory usage; when a threshold is passed, they can do something like `waitForProcessing` within the processing callback to "pause" the stream.
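A minimal sketch of that pattern, assuming the proposed behaviour where the stream awaits each subscriber's promise before fetching more (`waitForProcessing`, `persistBatch`, and the threshold are illustrative names, not part of the client API):

```typescript
// Message shape is simplified for the sketch.
type Message = { key: string; value: unknown }

const queue: Message[] = []
const MEMORY_THRESHOLD = 10_000 // illustrative cap on queued messages

// Hypothetical: hand a batch off to the developer's own system,
// e.g. write it to IndexedDB or SQLite.
async function persistBatch(batch: Message[]): Promise<void> {
  // ...
}

async function waitForProcessing(): Promise<void> {
  while (queue.length > 0) {
    await persistBatch(queue.splice(0, 100))
  }
}

// The processing callback passed to stream.subscribe(...).
async function onMessages(messages: Message[]): Promise<void> {
  // Synchronous scheduling: the stream keeps loading from the network.
  queue.push(...messages)

  // Backpressure: once the queue grows too large, block here, which
  // "pauses" the stream until processing catches up.
  if (queue.length > MEMORY_THRESHOLD) {
    await waitForProcessing()
  }
}
```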
Sure, they can maintain their own buffer. But my point is that it's also a good idea for our client to maintain a buffer of one fetched batch of messages.
Aaah yes indeed - essentially what we can do is have a very thin network precache mechanism, such that the stream operates exactly as it does now but the `fetch` will have loaded the next batch already.
In fact, I think this might fit nicely with another fetch wrapper, like the one we have for backoff retries: it would be custom to Electric, and the rest of the code would be completely oblivious to this buffer.
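Roughly along these lines (a sketch only; `nextChunkUrl` is a hypothetical helper that derives the follow-up request, e.g. from the offset headers of a completed response):

```typescript
type FetchFn = typeof fetch

// A thin prefetching wrapper in the spirit of the backoff-retry one:
// the stream calls it like plain fetch, unaware of the buffering.
function createPrefetchFetch(
  baseFetch: FetchFn,
  nextChunkUrl: (url: string, res: Response) => string | undefined
): FetchFn {
  const cache = new Map<string, Promise<Response>>()

  return async (input, init) => {
    const url = input.toString()

    // Serve a response that was already prefetched, if any.
    const pending = cache.get(url)
    if (pending !== undefined) cache.delete(url)
    const res = await (pending ?? baseFetch(input, init))

    // Eagerly start loading the next batch so it is already in flight
    // (or done) by the time the stream asks for it.
    const next = nextChunkUrl(url, res)
    if (next !== undefined && !cache.has(next)) {
      cache.set(next, baseFetch(next, init))
    }
    return res
  }
}
```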
sounds good 👍
I'm not suggesting by any means that we do another refactor right now, but something to look at and ponder is whether we should pull in https://effect.website/ at some point. Managing all these flows, error states, etc. might get easier.
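For a flavour of what that might look like, a rough sketch of the backoff-retry fetch in Effect (based on its `Effect`/`Schedule` APIs; not a proposal for our actual code):

```typescript
import { Effect, Schedule } from 'effect'

// Retry a fetch with exponential backoff, capped at 5 attempts.
const fetchShapeChunk = (url: string) =>
  Effect.tryPromise(() => fetch(url)).pipe(
    Effect.retry(
      Schedule.exponential('100 millis').pipe(
        Schedule.intersect(Schedule.recurs(5))
      )
    )
  )

// Run it as a regular promise at the edge of the system.
Effect.runPromise(fetchShapeChunk('/v1/shape')).then((res) => {
  console.log(res.status)
})
```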
As discussed with @kevin-dp, the `ShapeStream` currently runs ahead to the end of the stream regardless of what its subscribers do with the messages. This means that a stream will always accumulate all the messages from the network in memory, regardless of how fast its subscribers process them, with no option for developers to create backpressure and limit the rate at which the stream is read. It also has the side effect of the `ShapeStream` reaching an `isUpToDate` state that is misleading, since there is no guarantee that its subscribers are also up to date.

The proposed solution is to get rid of all queuing mechanisms (less code, yay!) and make the `ShapeStream` wait for all subscribers to process the message batch before moving forward. This gives developers the option to regulate backpressure: they can still let the stream run ahead as it does now by providing a synchronous callback that schedules the message processing, e.g. in a queue that they maintain, or they can register a subscriber that asynchronously stores data to disk, so that the stream runs a little slower and avoids memory buildup.
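Concretely, the core of the change amounts to something like this (a simplified sketch; the actual names in the client differ):

```typescript
type Subscriber<T> = (messages: T[]) => void | Promise<T[]> | Promise<void>

class ShapeStreamSketch<T> {
  private subscribers: Subscriber<T>[] = []

  subscribe(callback: Subscriber<T>): void {
    this.subscribers.push(callback)
  }

  // Instead of queuing messages internally, wait for every subscriber
  // to finish with a batch before requesting the next one. A
  // synchronous callback keeps the old run-ahead behaviour; an async
  // callback creates backpressure.
  async start(fetchNextBatch: () => Promise<T[] | undefined>): Promise<void> {
    for (;;) {
      const batch = await fetchNextBatch()
      if (batch === undefined) break // hypothetical end-of-stream signal
      await Promise.all(this.subscribers.map((cb) => cb(batch)))
    }
  }
}
```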
Basically, this API lets developers decide, in a fairly intuitive way, how fast the stream tries to retrieve data from the network.
Of course this should be more clearly documented, both in the code and in the docs, but I want people's opinions on whether this is something we want or not before doing so.