jakmeier opened this issue 1 year ago
I've looked at what mainnet traffic currently looks like. Very roughly, the summary is:

- `token.sweat` makes up more than 50% of TPS today; `record_batch` calls by `oracle.sweat` make up about 20% of that but are gas heavier than the average tx
- `sweat_welcome.near` makes up about 15%: `ft_transfer` receipts, triggered by claims or other sources

Based on that rough analysis, I want to create a benchmark with 20M daily transactions distributed like this:

- 5M SocialDB
- 5M `token.sweat`
- 5M other FTs
- 3M aurora
- 2M nearcrowd
This tries to project how traffic could grow from today (300k daily tx) to 20M daily tx. One assumption we established at the start of the quarter was that 5M of that should come from SocialDB; the remainder I have now defined.
I allocate 10M to FT use cases because Sweat today already takes >50% of TPS. But instead of forcing it all onto a single account, I want to spread the load across shards, so I split it evenly between Sweat and "other" FTs. That could also simulate more loyalty-program-like use cases coming to NEAR, which isn't unlikely given the success of Sweat.
Then I thought aurora and nearcrowd shouldn't be missing either, given they have >10% of the tx volume each. Aurora is slightly larger today, so I think 3M aurora & 2M nearcrowd makes sense. Note that aurora transactions also consume much more gas than nearcrowd transactions.
Notably, DeFi is missing completely. We could add that later, but for now I want to focus on the more common and more performance-critical use cases.
cc @akhi3030 @bowenwang1996 please let me know if you have concerns regarding my proposed benchmark workload
This seems like a good approach to me.
Currently the plan is to:
I have been running with Sweatcoin batches for a while now and I managed to make it work. However, there is one remaining annoying issue: `record_batch` can only be called by a registered oracle account, and registering an oracle can only be done by the owner of the contract account.
Currently I create one oracle for each worker. The problem then is that all users of the worker share one oracle, which results in nonce conflicts. Sometimes this just means a retry and things work fine after that. But the most annoying case is when the RPC node accepts the transaction, because the nonce is still valid at that time, but the validator later filters it out of its tx pool. Then the user ends up waiting forever (the cutoff is currently at 30 min) and never sees a useful error.
I think I can resolve it by using multiple keys per oracle, but I'm not sure I'm willing to spend the extra effort on this just yet.
For the TPS benchmark, we know that we want some SocialDB workload (see #9095), but we want to combine it with some other "typical" workload.
For this, we probably want to mimic the workload observed on mainnet. Ideally it should include traffic from our largest users atm (sweat, aurora, ...) but also non-smart-contract workload (creating accounts, NEAR transfers, etc.).
All of this should be defined in the locust setup.