solana-labs / solana

Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.
https://solanalabs.com
Apache License 2.0

Widespread read-consistency issue for JSON RPC with load balancers #15705

Closed: ryoqun closed this issue 2 years ago

ryoqun commented 3 years ago

Problem

Validators can process slots at slightly different times. So, if there is a load balancer/reverse proxy in front of the RPC nodes (and there really should be one in any reasonable setup...), web3 developers and web3's downstream libraries are forced to consider possible reversion of any state they read, i.e. an inhumane burden on them. ;)

Fun quiz time! I implemented this wrongly; can you spot the bug? https://github.com/solana-labs/solana-program-library/pull/1364/files#diff-a89195d5c0dbd3f581bf7b62f88fe627d94149501fd9006ea5852eba7f3adda0R573

This is especially pronounced, and so frustrating to end users, under unstable cluster conditions.

Proposed Solution

~Client-side-only idea: track the returned context.slot from every RPC response in web3.Connection, internally, for singleGossip and stronger commitments, and retry safe methods~ => well, this is kind of clumsy; we need to discern which methods are safe to retry and which aren't.

My current favorite idea: introduce a custom universal HTTP header for all JSON RPC methods: X-SOLANA-SLOT: NNNN.

So, this works like this: once web3.Connection gets its first response from the JSON RPC endpoint, it remembers the X-SOLANA-SLOT value, keyed by commitment. Subsequent requests then carry the custom header, and the server checks the slot against the Bank fetched from BankForks. If NNN <= bank.slot: OK, time is moving forward; process the request, and the client remembers the new NNN value. If NNN > bank.slot: nope, return 4XX because the client is from the future; the client then retries some time later (400ms~3000ms, exponentially?).
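
To make the flow concrete, here is a minimal client-side sketch in TypeScript. Everything here is hypothetical per the proposal above (the X-SOLANA-SLOT header, the 4XX response, the backoff window), not an implemented Solana API, and the commitment names are just illustrative:

type Commitment = "processed" | "confirmed" | "finalized";

class SlotTrackingConnection {
  // Last slot observed per commitment, as proposed above.
  private lastSlot = new Map<Commitment, number>();

  constructor(private endpoint: string) {}

  async request(body: object, commitment: Commitment): Promise<unknown> {
    let delayMs = 400; // proposed retry window: ~400ms to 3000ms
    for (;;) {
      const headers: Record<string, string> = { "Content-Type": "application/json" };
      const seen = this.lastSlot.get(commitment);
      if (seen !== undefined) headers["X-SOLANA-SLOT"] = String(seen);

      const res = await fetch(this.endpoint, {
        method: "POST",
        headers,
        body: JSON.stringify(body),
      });

      if (res.status >= 400 && res.status < 500) {
        // The backend's bank is behind a slot we already observed:
        // back off exponentially and retry, hoping to land on a
        // caught-up backend.
        await new Promise((r) => setTimeout(r, delayMs));
        delayMs = Math.min(delayMs * 2, 3000);
        continue;
      }

      // Remember the newest slot the server reported for this commitment.
      const slot = res.headers.get("X-SOLANA-SLOT");
      if (slot !== null) this.lastSlot.set(commitment, Number(slot));
      return res.json();
    }
  }
}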

jstarry commented 3 years ago

I think this is an additional consequence of RPC node instability. It looks like Google Cloud makes a best effort at session affinity, which is broken if a node becomes unhealthy. I expect that if we fix RPC stability, this will no longer be an issue. For that reason, I somewhat prefer your client-side-only idea as a quick fix.

jstarry commented 3 years ago

Sorry forgot to paste the link: https://cloud.google.com/load-balancing/docs/backend-service#session_affinity

ryoqun commented 3 years ago

@jstarry Yeah, load-balancer session affinity is definitely a good consideration. :) I didn't mention it (my bad), but I'd rather avoid relying on it because...:

brianlong commented 3 years ago

Tagging @linuskendall & @stakeconomy who are currently working on HAproxy config for the new RPC pool.

mvines commented 3 years ago

The JSON RPC protocol already offers a solution. You can batch multiple requests in the same HTTP POST (until the load balancers get much smarter).

For example, these two requests are guaranteed to be served by the same backend:

$ curl https://api.mainnet-beta.solana.com -X POST -H "Content-Type: application/json" -d '
  [
    {"jsonrpc":"2.0", "id":2, "method":"getSlot"},
    {"jsonrpc":"2.0", "id":1, "method":"getRecentBlockhash"}
  ]
'
[{"jsonrpc":"2.0","result":67811128,"id":2},{"jsonrpc":"2.0","result":{"context":{"slot":67811128},"value":{"blockhash":"5XaNf8C3KK2gS6Ymn9AdZ6HBZCh412afNtWJfi7jf63o","feeCalculator":{"lamportsPerSignature":5000}}},"id":1}]

ryoqun commented 3 years ago

@mvines Yeah, good point. I again forgot to mention why I don't think JSON RPC batching is an adequate solution for the problem...:

Alternatively, I think this problem could be solved if we used WebSockets for these use cases. But then, we don't provide all functionality via WebSockets (of course, we could). All that said, it still seems odd that we can guarantee read consistency over WebSockets but not over plain old JSON RPC requests.

We have a nice synchronization primitive called slots for confirmed and stronger commitments; why not use it? :)

I think our discussion is maturing... Thanks!

linuskendall commented 3 years ago

I've been setting up a proof-of-concept set of load balancers off of GCP for the public Solana RPC pool. We could potentially support the X-Solana-Slot header directly in the load balancer, with the load balancer keeping track of the last slot a call returned from any given backend (and routing requests accordingly). We already use the slot for health checks in the load balancer, so I think this could be extended to support something like the X-Solana-Slot header.

Affinity is also something I have considered; it can be toggled via a cookie or custom header for those clients who need it. This doesn't help if the backend RPC goes down, but our aim is for that to be a relatively rare occurrence (with the LBs protecting the backends to a greater degree than is currently done under GCP).

mvines commented 3 years ago

Seems like this can still be done fully client side; no new header is needed that all LBs may or may not support.

The client just batches a getSlot request alongside its actual request when it wants to assert a minimal slot value. If getSlot returns a lower slot, the client middleware retries. This could be an opt-in feature of the RpcClient for Rust and the solana-web3 Connection object for JS.
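
A minimal sketch of that opt-in middleware in TypeScript, under the assumptions above (requestWithMinSlot is a hypothetical helper, not part of @solana/web3.js; it relies only on the batching guarantee mvines demonstrated):

async function requestWithMinSlot(
  endpoint: string,
  request: { jsonrpc: string; id: number | string; method: string; params?: unknown[] },
  minSlot: number,
  retryDelayMs = 500,
): Promise<unknown> {
  for (;;) {
    const res = await fetch(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify([
        { jsonrpc: "2.0", id: "slot-check", method: "getSlot" },
        request, // the actual request; the whole batch is served by one backend
      ]),
    });
    const replies = (await res.json()) as Array<{ id: unknown; result?: any }>;
    // Batch replies may arrive in any order, so match by id.
    const slotReply = replies.find((r) => r.id === "slot-check");
    if (slotReply && slotReply.result >= minSlot) {
      return replies.find((r) => r.id !== "slot-check");
    }
    // The backend hasn't reached minSlot yet: retry and hope the load
    // balancer routes us to a caught-up node.
    await new Promise((r) => setTimeout(r, retryDelayMs));
  }
}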

ryoqun commented 3 years ago

Seems like this can still be done fully client side; no new header is needed that all LBs may or may not support.

The client just batches a getSlot request alongside its actual request when it wants to assert a minimal slot value. If getSlot returns a lower slot, the client middleware retries. This could be an opt-in feature of the RpcClient for Rust and the solana-web3 Connection object for JS.

Yeah, this sounds like a nice way to carry the slot information while avoiding the HTTP header. The only downside is sendTransaction, which is rather important.

we need to discern which methods are safe to retry and which aren't.

Actually, I had thought about this idea, but I failed to communicate it well. Let me elaborate on my overly abstract description. ;)

Say a user wants to send a transaction assuming the current slot is at N and the relevant account state is favorable, which can only be determined by a human at the client side by examining the state against a bank at slot N or newer. Also assume the on-chain program enforces some rule to reject/accept such a TX at slots older/newer than N (this will be pretty common for games).

Then the user actually sends the transaction, batched as [getSlot, sendTransaction, confirmTransaction, getAccount] or [getSlot, sendTransaction], (then) [confirmTransaction, getAccount]. Then, out of luck, the request is routed to an API backend that is behind the cluster. The backend simulates the TX at slot N-1. Now consider two cases:

(a) If the TX is a valid action at N-1 but actually isn't a valid action at N+1, the transaction will be recorded into the ledger as a failed TX, which could give an unfair edge to the user's counterparties as a public hint of the user's intention. Also, the user (browser) waits a rather long time for the confirmed commitment from the delayed node's standpoint, only to learn the TX has failed and needs to be retried, resulting in rather critical information disclosure and opportunity cost. This also incurs a wasted TX fee.

If we add X-Slot and the server-side RPC checks it, we can avoid this. Also, because the X-Slot information is on the wire, smart load balancers or our RPC code can do smarter things like re-routing or buffering a bit. (Time-sensitive applications' special API node cluster deployments are likely to employ these practices anyway; so why not support it officially?)

(b) If the TX isn't valid at N-1 but actually is a valid action at N+1, then the user's browser immediately learns the TX failed and retries. But it's completely up to luck whether they reach a healthy node, or whether the delayed node catches up, next time. Meanwhile, another user can take the opportunity. So the original user misses the opportunity and feels sad in this case too. (This assumes disabling TX simulation isn't good, both as a general practice and to avoid needless information disclosure like the above.)

Also, for mobile clients, bandwidth and the client-side parsing budget (especially for JSON) are rather scarce resources. So I don't want to force those browsers to download potentially large Account state (which could be batched for fewer round trips), parse it, and abort & retry, only to find that N < minimal_slot. In comparison, header value lookup and parsing should be instant.

no new header is needed that all LBs may or may not support.

Also, I don't think custom HTTP headers are uncommon; you can see various ones from well-known APIs too. LBs generally pass them through rather than discarding unknown ones, especially if they're prefixed with X-. We just need to white-list them for CORS, which is safe in this case.

I've been setting up a proof of concept set of loadbalancers off of GCP for the public Solana RPC pool. We could potentially support the X-Solana-Slot header directly in the load balancer, with the loadbalancer having an idea of the last slot call returned from any given backend (and routing requests accordingly). We use the slot anyway as a way for health checks in the loadbalancer so I think this could be extended to support something like the X-Solana-Slot header.

@linuskendall Yeah, the possibility of smart load balancing is a nice bonus of X-Slot. Note that a slot usually must be paired with a commitment. Depending on the actual implementation complexity and design compromises, we might or might not add the commitment value to the custom header.

rawfalafel commented 3 years ago

Hey, I'd like to bump this issue since I think it's fairly important. As devs, the problem we're facing is that we can't make sequential requests to an RPC server while guaranteeing that the slot number is monotonically increasing. I want to stress that this is a big problem for apps built on Solana. If not handled, it can lead to users thinking that they've lost their funds. We've seen this happen in the wild.

You guys are best equipped to find the right solution, but my intuition is that this should be solved at the network level and not in the Javascript SDK. I think this is too critical to handle on the client side.

Just my two cents. Thanks :smiley:

mvines commented 3 years ago

the problem we're facing is that we can't make sequential requests to an RPC server while guaranteeing that the slot number is monotonically increasing

If you make requests to a single RPC server, you do have this guarantee. Your problem manifests only when multiple RPC servers are load balanced together. The quick solution is to run application-specific RPC servers so that you have more control over what load balancing, if any, occurs.

It would be good to better understand the assumptions your client makes that break when you aim it at a load-balanced RPC endpoint. Perhaps there are client changes we can make that would be quicker than rebuilding RPC (which I'd love to do one day, but that's a large project). Happy to take this to Discord #orca as well!

ryoqun commented 3 years ago

I'm always trying to find some time to work on this, but have failed so far...

Anyway, I've fixed/found read-consistency issues: https://github.com/solana-labs/solana-program-library/pull/1528, https://github.com/solana-labs/solana/issues/13987#issuecomment-813404371

ryoqun commented 3 years ago

Another sad fact I recently noticed: https://dex.projectserum.com (and a bunch of other downstream projects, if they're using the same API backend) is slightly affected by this as well. I actually observed the orderbook's state revert for a moment: an order's remaining amount increased for a bit while otherwise steadily decreasing.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] commented 2 years ago

This stale issue has been automatically closed. Thank you for your contributions.

jstarry commented 2 years ago

I would like to add an RPC param called "minimumContextSlot" to have RPCs return an error if they are not caught up. Many RPC requests are made with the expectation that the node has progressed to a particular slot at some commitment. So I think that setting a minimum context slot, and getting back an error code indicating the node isn't ready yet and the client should retry later, will be a better dev experience.
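
A minimal sketch of how a client might use such a parameter, assuming the name from the proposal above and a generic JSON RPC error object for a lagging node (both hypothetical until implemented; the final API shape may differ):

async function getBalanceWithMinSlot(
  endpoint: string,
  pubkey: string,
  minimumContextSlot: number,
): Promise<{ context: { slot: number }; value: number }> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "getBalance",
      // Hypothetical config field carrying the proposed minimum context slot.
      params: [pubkey, { commitment: "confirmed", minimumContextSlot }],
    }),
  });
  const reply = await res.json();
  if (reply.error) {
    // Proposed behavior: the node hasn't reached minimumContextSlot yet,
    // so the caller should back off and retry rather than trust a stale read.
    throw new Error(`node not caught up, retry later: ${reply.error.message}`);
  }
  return reply.result; // { context: { slot }, value }
}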