paritytech / json-rpc-interface-spec

30 stars 3 forks source link

Reflecting on pinning, and an `unpinOldBlocks` suggestion #100

Closed jsdw closed 1 year ago

jsdw commented 1 year ago

I think we've talked about some of this a while ago, but having now just finished a first draft of the implementation of this stuff in Subxt, I wanted to reflect on block pinning.

The approach that I've taken in Subxt is to have a chainHead_follow subscription always running once Subxt connects. This stream automatically sends "unpin" messages to the node for blocks older than some configurable max age N, and will automatically restart if it's stopped. For each block hash provided by the node, the stream will emit a _BlockRef_ to consumers; a reference to that block hash that, when all clones are dropped, will lead to an "unpin" message being sent to the node for that block. The higher level APIs in Subxt then store BlockRefs as needed to keep blocks from being unpinned for as long as possible.

Any blocks that get too old (older than the configured N) are automatically unpinned to try to prevent the entire subscription from being stopped and all blocks being unpinned. In other words, we provide a "sliding window" of blocks that can be pinned. We keep blocks pinned for as long as we think we can, but allow them to be unpinned quickly too (as soon as the user Drops all references to them).

The downside of this approach is that it does not relay to the user any strict guarantee that some block (or a BlockRef here) will always be pinned. If we wanted to provide this guarantee (eg by never automatically unpinning blocks, and emitting "stop" events to the user on some channel), we'd instead run into cases where even blocks the user had just seen could become immediately unpinned and inaccessible. This is why I have a preference for a sliding window approach with a looser guarantee. If the user never holds onto old blocks, it's not an issue either way. If they do, then the old block(s) are unpinned rather than everything being unpinned (and the subscription restarting, possibly at a super inconvenient time and possibly leaving us with a gap in block information).

An issue with my sliding window approach is that I don't actually know how long the server will keep blocks pinned for. It's not specified. So I have to pick my max age N (or let users pick it) to be a value that may be far more conservative than it needs to be.

Because of this, I'd love to have a flag like "unpinOldBlocks" that I could give to the chainHead_follow subscription that would make it automatically unpin blocks rather than stopping the subscription outright. Perhaps it would also emit an event telling me which blocks it had unpinned when this happens. This would allow the guarantee to say "all block hashes given are pinned unless 'stop' event or 'unpin' event with the corresponding hash are received". Further, even if this automatic unpinning were the default behaviour, one could recover the existing behaviour by ending the subscription on either "stop" or "unpin" events, forcing everything to be unpinned.

Anyway:

  1. @josepot I'm curious to know what approach you're taking; it'd be interesting to compare and contrast and generally have your thoughts.
  2. @tomaka I'd love to have your thoughts on this approach in general and such a pinning event; there could be strong reasons against my unpinOldBlocks suggestion or in general that I've completely overlooked!
tomaka commented 1 year ago

The first and foremost downside of the unpinOldBlocks suggestion is that it adds complexity. I'm trying as much as possible to avoid any unnecessary complexity.

It's not specified.

I feel like the core of the issue is actually more about the fact that the limit isn't really specified.

jsdw commented 1 year ago

I feel like the core of the issue is actually more about the fact that the limit isn't really specified.

Yeah, part of the issue is that I don't know what the limit is, and so I have to pick a more conservative value to unpin old blocks on my side, meaning a user can't keep them around as long as they could.

I'm not sure that any sort of limit should be specified though; I can see that light and full nodes might be able to keep very different numbers of blocks pinned (and I could imagine being able to configure this on a full node for instance). Perhaps a "minimum time that a block must be able to be pinned for" could be specified, but the client is still not be able to keep blocks around as long as it maybe could.

So, I still think that the server could help us out here, since it knows best. An unpinOldBlocks like thing would allow the client to keep blocks pinned for the maximum possible amount of time without worrying about the whole subscription being stopped, and so this seems optimal to me. Alternately, an API could exist to tell me what the limit is, but even with this I still have to be a little conservative, and I still have to do more book keeping to keep track of block ages in order to use this.

The first and foremost downside of the unpinOldBlocks suggestion is that it adds complexity. I'm trying as much as possible to avoid any unnecessary complexity.

I understand the desire to limit complexity, but does this really add much? Instead of the server emitting a "stop" event at the relevant time, it unpins the block and emits an unpin event instead.

tomaka commented 1 year ago

So, I still think that the server could help us out here, since it knows best

Well, it doesn't know best actually. The server just picks an arbitrary value, and has no idea whether this value is good enough or not.

I understand the desire to limit complexity, but does this really add much? Instead of the server emitting a "stop" event at the relevant time, it unpins the block and emits an unpin event instead.

Yes it does! Like, a lot. Just put yourself in the head of someone trying to understand the API.

Also, you can't say "the" block. Which block the server would unpin is completely arbitrary and unclear and would vary amongst implementations. This means that when connected to some server implementations the JSON-RPC client would work just fine, and on others it would display errors because the block it was accessing was unpinned by the server.

Perhaps a "minimum time that a block must be able to be pinned for" could be specified

So, originally the way the limit was counted was left at the discretion of the JSON-RPC server.

However, I've realized afterwards that the only way that actually makes sense was to count the number of pinned finalized+pruned blocks and not count the number of non-finalized blocks. See https://github.com/paritytech/json-rpc-interface-spec/blob/7de90b101c6ccaeee257a6e556e1ed321ee64f7f/src/api/chainHead_unstable_follow.md#L305.

Therefore I think that a good way to do it could be to specify a number of finalized+pruned blocks that can be kept pinned (maybe add a function to query the server what this number is).

Another possibility could be for the JSON-RPC server to simply stop emitting events as long as the number of pinned finalized+pruned blocks is above a certain threshold.

jsdw commented 1 year ago

Yes it does! Like, a lot. Just put yourself in the head of someone trying to understand the API.

I am that person :)

In fact, my optimal preference would be that there would be no extra unpinOldBlocks flag at all, and the current behaviour would change from emitting "stop" and unpinning everything when some block(s) can no longer stay pinned, to emitting "unpinned: blockHash" when a block can no longer stay pinned.

For me, this would allow one to reprodeuce the same logic as today (end the subscription when the first "unpin" event is emitted) but would also make it simpler to keep alive a single chainHead_follow subscription without fear. Maybe "stop" events are still emitted under some circumstances, and in those cases I'll have to just restart the subscription, but I expect that also in those cases there would have been nothing I could have done to avoid that.

The unpinOldBlocks flag is just a compromise. I don't personally think it makes anything much harder to understand (once you understand the pinning stuff, this is just a small option to "auto unpin" for you; nothing crazy complicated).

Therefore I think that a good way to do it could be to specify a number of finalized+pruned blocks that can be kept pinned (maybe add a function to query the server what this number is).

This was my alternate idea, and while I think it's a bit less optimal from the client side (I still have to have logic to unpin blocks, and thus track block ages etc), it would be an improvement from what we have now, which is not knowing the number at all and having to be very conservative.

Another possibility could be for the JSON-RPC server to simply stop emitting events as long as the number of pinned finalized+pruned blocks is above a certain threshold.

That sounds scary! Would the server no longer pin newer blocks while it's not emitting events? It would be the opposite of what I'd be aiming for anyway, which is doing my best to keep newer blocks pinned as long as needed and letting old blocks be unpinned (if they haven't already been) as a compromise.

jsdw commented 1 year ago

Closing this in favoru of #109, which removes the idea that this is an opt in behaviour.