electric-sql / electric

Local-first sync layer for web and mobile apps. Build reactive, realtime, local-first apps directly on Postgres.
https://electric-sql.com
Apache License 2.0
5.37k stars 124 forks source link

Chunk large satellite payloads #1399

Closed robacourt closed 3 days ago

robacourt commented 1 week ago

To prevent the 100MB payload limit:

[Symbol(kError)]: RangeError: Max payload size exceeded
      at Receiver.haveLength (/Users/rob/src/electric-sql/electric/node_modules/.pnpm/ws@8.17.0/node_modules/ws/lib/receiver.js:419:28)
      at Receiver.getPayloadLength64 (/Users/rob/src/electric-sql/electric/node_modules/.pnpm/ws@8.17.0/node_modules/ws/lib/receiver.js:406:10)
      at Receiver.startLoop (/Users/rob/src/electric-sql/electric/node_modules/.pnpm/ws@8.17.0/node_modules/ws/lib/receiver.js:161:16)
      at Receiver._write (/Users/rob/src/electric-sql/electric/node_modules/.pnpm/ws@8.17.0/node_modules/ws/lib/receiver.js:94:10)
      at writeOrBuffer (node:internal/streams/writable:564:12)
      at _write (node:internal/streams/writable:493:10)
      at Writable.write (node:internal/streams/writable:502:10)
      at Socket.socketOnData (/Users/rob/src/electric-sql/electric/node_modules/.pnpm/ws@8.17.0/node_modules/ws/lib/websocket.js:1303:35)
      at Socket.emit (node:events:519:28)
      at addChunk (node:internal/streams/readable:559:12) {
    code: 'WS_ERR_UNSUPPORTED_MESSAGE_LENGTH',
    [Symbol(status-code)]: 1009
alco commented 1 week ago

@robacourt Hey, I don't know if there was a discussion behind this that I missed, but this change doesn't address the issue stated in the PR description. For one, a single INSERT or UPDATE may be quite large, so the limit of 100 ops per SatOpLog does not in itself prevent it from overflowing.

More importantly, a single SatOpLog is treated by the client as a transaction. This change effectively splits a single PG transaction into multiple client-side transactions. This might not be an issue in the current implementation due to the fact that transactions are processed on the client in the same order as they are sent by the server, but any addition of concurrency to client-side processing further down the road could lead to subtle bugs because of this.

A proper solution to the payload size problem requires changing the Satellite protocol to support splitting a single transaction into multiple protocol messages.

robacourt commented 1 week ago

Great find! Would you mind adding a test to the client side (in clients/typescript/test/satellite/client.test.ts) that ensures the transaction is emitted correctly when begin and commit fall into separate oplog messages?

I think we have one already: https://github.com/electric-sql/electric/blob/31ffde3b0912d3b4e16ee38e5b44ae032b77cb48/clients/typescript/test/satellite/client.test.ts#L275

Is that enough?