ava-labs / hypersdk

Opinionated Framework for Building Hyper-Scalable Blockchains on Avalanche
https://hypersdk.xyz/

[executor] Parallelization Strategy: To State Key or Not to State Key #736

Open aaronbuchwald opened 7 months ago

aaronbuchwald commented 7 months ago

The HyperSDK currently requires transactions/actions to pre-specify the state keys (StateKeys) they may touch throughout their execution, both so that the state can be efficiently pre-fetched and to enable efficient parallel transaction execution (https://github.com/ava-labs/hypersdk/blob/968ecec17a8ebb4a3067513d4bab951f1949243d/README.md#parallel-transaction-execution).
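For concreteness, here is a rough sketch (in Go) of what pre-specifying state keys looks like for a simple transfer-style action. The names (`Transfer`, `StateKeys`, `balanceKey`) and types are illustrative only, not the actual hypersdk Action interface:

```go
package main

import "fmt"

// Hypothetical sketch: an action declares every state key it may touch
// before execution, so the runtime can prefetch those keys and schedule
// non-conflicting transactions in parallel.
type Transfer struct {
	To    [20]byte
	Value uint64
}

func balanceKey(addr [20]byte) string {
	return "balance/" + fmt.Sprintf("%x", addr)
}

// StateKeys returns the full set of keys this action may read or write:
// the sender balance (debited) and the recipient balance (credited).
func (t *Transfer) StateKeys(actor [20]byte) []string {
	return []string{
		balanceKey(actor),
		balanceKey(t.To),
	}
}

func main() {
	var alice, bob [20]byte
	alice[0], bob[0] = 0x01, 0x02
	tx := &Transfer{To: bob, Value: 10}
	fmt.Println(tx.StateKeys(alice))
}
```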

The executor package uses the pre-specified state keys to build a conflict graph on the fly, so that a transaction is executed only after all conflicting transactions that precede it have finished execution. https://github.com/ava-labs/hypersdk/blob/968ecec17a8ebb4a3067513d4bab951f1949243d/executor/executor.go#L22
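A minimal sketch of the scheduling rule this implies, assuming a hypothetical `tx` type that carries its declared keys. This is not the actual executor implementation; a real scheduler would also distinguish read from write permissions so that read-only overlaps can still run concurrently:

```go
package main

import "fmt"

// tx is a hypothetical transaction with its pre-declared state keys.
type tx struct {
	id   int
	keys map[string]struct{}
}

// conflicts reports whether two transactions share any declared key.
func conflicts(a, b tx) bool {
	for k := range a.keys {
		if _, ok := b.keys[k]; ok {
			return true
		}
	}
	return false
}

// dependencies returns, for each transaction, the earlier transactions
// it must wait on before it is allowed to execute.
func dependencies(txs []tx) map[int][]int {
	deps := make(map[int][]int)
	for i, t := range txs {
		for _, prev := range txs[:i] {
			if conflicts(t, prev) {
				deps[t.id] = append(deps[t.id], prev.id)
			}
		}
	}
	return deps
}

func main() {
	keys := func(ks ...string) map[string]struct{} {
		m := make(map[string]struct{}, len(ks))
		for _, k := range ks {
			m[k] = struct{}{}
		}
		return m
	}
	txs := []tx{
		{id: 1, keys: keys("balance/alice", "balance/bob")},
		{id: 2, keys: keys("balance/carol", "balance/dave")}, // no overlap with tx 1: can run in parallel
		{id: 3, keys: keys("balance/bob", "balance/carol")},  // must wait for both tx 1 and tx 2
	}
	fmt.Println(dependencies(txs)) // map[3:[1 2]]
}
```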

This creates a potential UX problem for HyperSDK devs/users. Users need a way to either deterministically generate the set of state keys their transaction will touch (easy for simple interactions such as basic transfers) or an estimation method that provides a best guess of the state keys that will be accessed. If the estimation method produces an incorrect set of state keys, the user could either overpay for their transaction or see it fail when it attempts to touch state whose key was not pre-specified.

There are a number of different ideas that have been proposed to optimize for state key estimation:

1) Program provides its own state key estimation function aware of its own internal logic
2) API simulates a transaction to offer a relatively naive estimate (a sketch of this idea follows the list)
3) Specify a "global" state key and charge accordingly
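As a sketch of option (2), one naive way to estimate keys is to dry-run the transaction against a state view that records every key it touches. `recordingView` and its methods below are hypothetical, not an existing HyperSDK API, and the estimate is only as good as how closely the simulated state matches the state at actual execution time:

```go
package main

import "fmt"

// recordingView wraps a backing state map and records every key that a
// simulated execution reads or writes.
type recordingView struct {
	backing map[string][]byte
	touched map[string]struct{}
}

func newRecordingView(backing map[string][]byte) *recordingView {
	return &recordingView{backing: backing, touched: make(map[string]struct{})}
}

func (v *recordingView) Get(key string) ([]byte, bool) {
	v.touched[key] = struct{}{}
	val, ok := v.backing[key]
	return val, ok
}

func (v *recordingView) Put(key string, val []byte) {
	v.touched[key] = struct{}{}
	v.backing[key] = val
}

// TouchedKeys returns the recorded set, which becomes the state key
// estimate attached to the real transaction.
func (v *recordingView) TouchedKeys() []string {
	keys := make([]string, 0, len(v.touched))
	for k := range v.touched {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	view := newRecordingView(map[string][]byte{"balance/alice": {100}})
	// Simulate the action: read the sender, write sender and recipient.
	if bal, ok := view.Get("balance/alice"); ok {
		view.Put("balance/alice", []byte{bal[0] - 10})
		view.Put("balance/bob", []byte{10})
	}
	fmt.Println(view.TouchedKeys())
}
```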

The other alternative is to use optimistic concurrency control instead of pessimistic (state keys approach). For example, BlockSTM provides one option to use OCC to execute transactions in parallel, detect conflicts on the fly, and potentially rollback and re-execute transactions as needed.

There are two main drawbacks to the BlockSTM/OCC approach.

1) The potential to rollback and re-execute transactions
2) There are no state keys to pre-fetch state

For (1), this is the core trade-off between OCC and PCC performance and depends on the exact workload, which in turn depends on what you're building on top of the HyperSDK. Typically, low-conflict workloads perform better under optimistic concurrency control, while higher-conflict workloads perform better under pessimistic concurrency control.

The bigger optimization from state keys comes from pre-fetching state in parallel to ensure that executing programs never need to interrupt execution to read from disk. One major downside of the BlockSTM/OCC approach is that you don't know which state keys must be fetched from disk until reaching that point in the execution, so it's much more difficult to efficiently parallelize disk reads during program execution.
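To illustrate the pre-fetching benefit, here is a minimal sketch (hypothetical `prefetch` helper, simulated disk reads) that warms an in-memory cache for all declared keys concurrently before execution begins, so execution itself never blocks on disk:

```go
package main

import (
	"fmt"
	"sync"
)

// prefetch fetches every declared key concurrently and returns an
// in-memory cache that execution can read from without touching disk.
func prefetch(db func(key string) []byte, keys []string) map[string][]byte {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		cache = make(map[string][]byte, len(keys))
	)
	for _, k := range keys {
		wg.Add(1)
		go func(k string) {
			defer wg.Done()
			val := db(k) // stands in for a disk read
			mu.Lock()
			cache[k] = val
			mu.Unlock()
		}(k)
	}
	wg.Wait()
	return cache
}

func main() {
	slowDB := func(key string) []byte { return []byte("value-for-" + key) }
	cache := prefetch(slowDB, []string{"balance/alice", "balance/bob"})
	fmt.Println(len(cache), "keys warmed before execution")
}
```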

However, if the goal is to optimize for absolute maximum performance, then maintaining a capped state size entirely in memory and eliminating pre-fetching should outperform either approach regardless.

kpachhai commented 7 months ago

Thank you for your detailed insight into the parallelization strategy and the thoughtful considerations around the use of StateKeys in HyperSDK.

I personally lean towards maintaining the current use of StateKeys.

The StateKeys system offers a structured and predictable way to manage state access. It enables efficient parallel execution by allowing transactions to pre-specify state access, thereby minimizing conflicts and optimizing performance. This predictability is not just a technical advantage but also a user experience benefit, making it easier for developers to reason about their transactions and how they will behave in the system.

I understand the concerns around user experience, especially when it comes to more complex transactions where predicting state access can be challenging. However, I believe the solution lies in improving the tools and methodologies for state key estimation, rather than moving away from StateKeys altogether. Enhancements in tooling could include more sophisticated static analysis tools or runtime simulations that can help predict state access more accurately, thus reducing the risk of failed transactions or inefficiencies.

The idea of incorporating elements of optimistic concurrency control (OCC) is intriguing and certainly has its merits. However, I’m concerned that such a shift could introduce significant complexity into the system, both from a development and operational perspective. The potential for transaction rollbacks and the lack of pre-fetching state in OCC could lead to unpredictability in transaction execution times and performance, which might negate some of the benefits of parallel execution.

aaronbuchwald commented 7 months ago

That's great feedback.

Optimistic should outperform pessimistic on some workloads, so it's primarily a question of whether to be opinionated and choose one approach (neither option clearly dominates imo, but there's huge value in being opinionated at the very least to reduce maintenance burden) or whether different workloads and expected configurations (like holding the entire state in memory) justify building/maintaining both options.

kpachhai commented 7 months ago

Thanks for highlighting the nuanced trade-offs between optimistic and pessimistic concurrency controls. I agree that OCC has the potential to outperform pessimistic approaches like StateKeys in specific scenarios, particularly in workloads characterized by low conflict rates. The decision seems to boil down to whether we should adopt a more opinionated stance for the sake of simplicity and reduced maintenance, or whether the diversity of potential use cases justifies the complexity of supporting both concurrency models.

Considering the wide range of applications and workloads HyperSDK might cater to, I'm inclined to think that there might be merit in exploring a more flexible approach that does not strictly commit to one concurrency model. However, I also recognize the importance of maintaining a lean and manageable codebase, as well as providing a clear and straightforward path for developers using the SDK.

Perhaps a middle ground could be found in offering a default, opinionated setup that uses StateKeys, while also providing the infrastructure for OCC to be utilized in cases where developers are dealing with low-conflict, high-performance workloads and are willing to manage the additional complexity. This approach could allow us to cater to a broad spectrum of needs without overly complicating the core SDK.

We could also consider developing a set of guidelines or a decision-making framework to help developers choose the most suitable concurrency model for their specific use case. This might involve detailed documentation, performance benchmarks, and tooling that can assist in evaluating the trade-offs of each approach based on the characteristics of the workload.

aaronbuchwald commented 7 months ago

Ya it would be nice to make it pluggable, choose a good default for now, and come back to it if it becomes a bottleneck or it becomes clear that it poses a significant UX/DevX problem in the future.

The more difficult part to handle is that the state keys are part of the transaction format as well, so it wouldn't be a simple drop-in replacement, but require changes at another part of the stack too.

richardpringle commented 5 months ago

@aaronbuchwald we could be a lot more detailed than just having "global" as described in (3). If a program/contract is built in a way that leads to high variance in state-key access, it should, IMO, be possible to specify a subset of keys. Something like <ProgramId>/*, where that particular program becomes completely bottlenecked, but transactions that don't touch this subset of the state are unaffected.

I could see a world where programs/contracts are developed iteratively. The naive approach is deployed first where most users would use the <ProgramId>/* state-key, then the author could work on a V2 that adds further optimizations for determining state-keys.

aaronbuchwald commented 5 months ago

> @aaronbuchwald we could be a lot more detailed than just having "global" as described in (3). If a program/contract is built in a way that leads to high variance in state-key access, it should, IMO, be possible to specify a subset of keys. Something like <ProgramId>/*, where that particular program becomes completely bottlenecked, but transactions that don't touch this subset of the state are unaffected.
>
> I could see a world where programs/contracts are developed iteratively. The naive approach is deployed first where most users would use the <ProgramId>/* state-key, then the author could work on a V2 that adds further optimizations for determining state-keys.

Absolutely. We could have transactions specify a range of keys or a particular prefix. This would be very powerful to make it easier for people to reason about their transactions.

I'd expect that it's very rare for a user transaction to interact with a completely unexpected program, so if we supported something like <ProgramId>/*, then I'd expect users could at least do that to interact with all of the applications they interact with.
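A sketch of what mixing exact keys with `<ProgramId>/*`-style wildcards could look like for conflict detection. The `keySpec` type and overlap rules below are illustrative, not an existing transaction format:

```go
package main

import (
	"fmt"
	"strings"
)

// keySpec is a hypothetical declared key set mixing exact keys and
// prefix wildcards (e.g. "program42/" stands in for <ProgramId>/*).
type keySpec struct {
	exact    []string
	prefixes []string
}

// conflictsWith reports whether any key or wildcard in a overlaps with
// any key or wildcard in b.
func (a keySpec) conflictsWith(b keySpec) bool {
	matches := func(spec keySpec, key string) bool {
		for _, e := range spec.exact {
			if e == key {
				return true
			}
		}
		for _, p := range spec.prefixes {
			if strings.HasPrefix(key, p) {
				return true
			}
		}
		return false
	}
	for _, k := range a.exact {
		if matches(b, k) {
			return true
		}
	}
	for _, p := range a.prefixes {
		for _, k := range b.exact {
			if strings.HasPrefix(k, p) {
				return true
			}
		}
		for _, q := range b.prefixes {
			// Two wildcards overlap if either prefix contains the other.
			if strings.HasPrefix(p, q) || strings.HasPrefix(q, p) {
				return true
			}
		}
	}
	return false
}

func main() {
	a := keySpec{prefixes: []string{"program42/"}}
	b := keySpec{exact: []string{"program42/owner"}}
	c := keySpec{exact: []string{"program7/supply"}}
	fmt.Println(a.conflictsWith(b), a.conflictsWith(c)) // true false
}
```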

This could also complicate the pre-fetching and pricing logic in the HyperSDK though. Currently, the HyperSDK pre-fetches all specified state keys, so it would need to iterate all keys under a <ProgramId>/* or other prefix in order to fetch all of the relevant keys. This may be wasteful or prohibitively expensive for contracts with a large number of storage keys.

I also assume the HyperSDK charges for the number of state key reads up front, i.e. before reading from storage. If we need to iterate a storage range, then we may not know how many reads to charge for until after we've completed the operation.

aaronbuchwald commented 5 months ago

I suppose we could support a range that specifies a prefix and a maximum number of keys to charge for, so that we can charge up front.
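A sketch of that idea, using a hypothetical `prefixRange` that carries a `MaxKeys` bound so the fee can be charged up front and execution can refuse to exceed what was paid for (constants and names are illustrative):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixRange declares a key prefix plus an upper bound on how many keys
// under that prefix the transaction may touch.
type prefixRange struct {
	Prefix  string
	MaxKeys int
}

const feePerRead = 5 // illustrative unit cost per read

// chargeUpFront bills for the declared worst case before any state is read.
func chargeUpFront(r prefixRange) int {
	return r.MaxKeys * feePerRead
}

// iterate collects keys under the prefix but refuses to exceed MaxKeys,
// keeping execution within what was already paid for.
func iterate(state map[string][]byte, r prefixRange) ([]string, error) {
	var keys []string
	for k := range state {
		if strings.HasPrefix(k, r.Prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	if len(keys) > r.MaxKeys {
		return nil, fmt.Errorf("prefix %q touches %d keys, only %d paid for", r.Prefix, len(keys), r.MaxKeys)
	}
	return keys, nil
}

func main() {
	state := map[string][]byte{
		"program42/owner":  nil,
		"program42/supply": nil,
		"program7/owner":   nil,
	}
	r := prefixRange{Prefix: "program42/", MaxKeys: 4}
	fmt.Println("upfront fee:", chargeUpFront(r))
	keys, err := iterate(state, r)
	fmt.Println(keys, err)
}
```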

0xduality commented 4 months ago

Wondering if it would be possible to support both early binding (all keys specified upfront) and late binding (no keys upfront). What would that mean? No keys specified basically means the transaction conflicts with everything, and nothing can be prefetched. The non-prefetched reads/writes should be metered way more aggressively. The point is to make the behavior possible but much more expensive. For example, suppose we run a workload of exclusively late-binding transactions and observe a 100x drop in throughput compared to early binding. Then one late-binding transaction should cost 100 times more than an early-binding transaction.
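A tiny sketch of that pricing rule, with purely illustrative constants (the 100x factor would ideally come from measured throughput differences, as suggested above):

```go
package main

import "fmt"

const (
	baseReadFee       = 5   // illustrative unit cost for a prefetched (early-binding) read
	lateBindingFactor = 100 // illustrative multiplier; ideally derived from observed throughput loss
)

// readFee charges the base rate for reads covered by pre-declared keys
// and a heavy multiplier for late-binding reads with no declared keys.
func readFee(lateBinding bool) int {
	if lateBinding {
		return baseReadFee * lateBindingFactor
	}
	return baseReadFee
}

func main() {
	fmt.Println("early-binding read:", readFee(false))
	fmt.Println("late-binding read: ", readFee(true))
}
```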

The rationale behind this suggestion is that the lack of support for late binding on Solana has been cited by the LayerZero team as one of the reasons it is taking them a long time to support it.

github-actions[bot] commented 2 months ago

This issue has become stale because it has been open 60 days with no activity. Adding the lifecycle/frozen label will exempt this issue from future lifecycle events.