Consider increasing fees for writable accounts

aeyakovenko commented 2 years ago

Problem

Bots spam the network, often with failed transactions. Well behaving senders are able to avoid failing txs if the senders are signing a transactions that are simulated against recent state. Malicious senders simply flood the chain with pre-formed transactions as fast as possible.

The flood of transactions can occur if there is a known opportunity that is scheduled, like Raydium IDO, or NFT mint. Or opportunistically during high volatility in markets. Programs can't just charge a large flat fee for small trades or all transactions, because attacker can write a custom program to check if there is liquidity and only then execute, but send the TX flood anyways. The flood will take write locks on state, and all other users will be starved. The program can't defend itself against being simulated.

Proposed Solution

This proposal is in addition to the market driven additional_fee signed by users to prioritize access to state. #23211

get_active_account_write_lock_fee system call: return the current activated write lock fee for this account
update_account_write_lock_fee system call: A system call that allows a program to set a write lock fee for accounts that it owns. The fee is applied at the start of TX processing to the original write locked accounts such that the write lock account data remains immutable, but the additional lamports are committed even when the TX fails. Reducing the fee can be activated immediately, increasing the fee has to wait for at least 240 blockhashes, or the current system wide blockhash timeout.

It is up to the program to distribute the lamports back to successful callers at the end of the call. Program would need to guard that it's not being called twice within the same transaction, and only refund on the first call. A re-entry safe helper function should be provided to the program so it refund the current fee to the transaction fee payer.

total_failed_lamports += max(0, current_lamports - (rent_exempt_lamports + write_lock_fee))
refund = current_lamports - total_failed_lamports
- total_failed_lamports would need to be tracked by the program's state.

Program can implement an eth eip1559 like mechanism by periodically setting the write_lock_fee with the MIN additional_fee paid by callers. MIN would be the min price a caller had to be included in the block to take this write lock.

Implementation needs to have an activation slots longer than the 240 slot timeout for a blockhash. so users know the expected fee they are signing. Transactions using durable nonces may need to specify the maximum fee they are willing to pay including write lock fees.

Why it’s better then only `additional_fee`

The huge advantage to this, is that it’s possible for a program to build its own congestion control and capture fees and punish misbehaving senders and refund well behaving senders.

Example NFT drop

Candy machine is deployed
Artist sets the write_lock_fee for the candy machine mint to 0.1 sol
candy machine is configured to refund the caller on a successful mint
any bots that try to spam the network prior to the mint fail and get charged 0.1 sol

with fewer bots bidding, there is less load on the leaders and UX improves.

Example defi market

high volatility causes prices to move and creates tons of arbs
Market continuously sets the write_lock_fee to the MIN additional_fee over the last 10 slots
Every epoch, market does a buy/burn of its marketplace token with the collected fees

Market created the demand for state, and market is now capturing value that otherwise would go to the L1 only.

tag @taozhu-chicago @sakridge @jackcmay

jstarry commented 2 years ago

Why does the "unused" part matter? Why not charge increasing fees for every write?

aeyakovenko commented 2 years ago

@jstarry why didn't the sender mark it as RO then? The example above illustrates how programs can use this to reduce traffic from aggressive senders that do not observe the state often enough before they sign.

jstarry commented 2 years ago

I don't think that focusing on unused writable data accounts adequately covers the issues with write-lock contention. It just needs to be expensive in general to write-lock an account that many bots / protocols / users want to use.

aeyakovenko commented 2 years ago

@jstarry it does. The issue with "invalid" transactions is that they are created against a stale state. It's a much better environment if senders who are sending against stale state can't increase fees on senders who are sending against valid state.

By “much better” is that it allows the developer/Program capture more of the fees. If the network fees are increased indiscriminately that takes away value capture from the layer above.

sleueth commented 2 years ago

@aeyakovenko - Good post, and interesting solution. I think it does solve the current problem. But I have a concern: I can imagine lots of valid transactions where an account will be marked as writable but whose data will not change. Any transaction that contains some sort of optional routing will have this feature. The base case:

I invoke a Program that sends funds to account[1] XOR account[2] based on the result of some simple function. In this case, you need to mark both accounts as writable in the tx, but the data will change on only one.

Perhaps the N parameter above can be chosen in a way that blocks high-speed bots and allows human users with this valid use case through. I think you're correct insofar as this is an elegant solution to our current bot problem.

But I'm concerned that this limits the application architecture space for any high-speed application. Once N is chosen, all high-speed applications that run above some baseline k*N frequency are somewhat limited in architecture. Any address marked as writable should always be written to. This translates to a loss in idempotency and idempotency is a crucial architectural attribute available to application developers.

Maybe this is all fine and worth it. And/or maybe this is a decent patch until a more robust solution comes along. Still, worth understanding the tradeoff.

aeyakovenko commented 2 years ago

@sleueth @aeyakovenko - Good post, and interesting solution. I think it does solve the current problem. But I have a concern: I can imagine lots of valid transactions where an account will be marked as writable but whose data will not change. Any transaction that contains some sort of optional routing will have this feature.

Why can't the router simulate ahead of time and only lock the right accounts? If it's using stale data, or wants a fast path, it needs to back off or pay a higher fee. Especially if its locking a ton of accounts. Think about it, all the accounts it locks but doesn't use prevent another valid user from using that program.

So under normal conditions, the compute limit would allow some of these go through without causing a spike in fees, but if there is a flood it will force the router to slow down, or simulate more accurately. I am not sure the cap should be set at 25% or 50%, but at some level such that we can segregate optimistic and failed traffic from accurate.

This translates to a loss in idempotency and idempotency is a crucial architectural attribute available to application developers.

100%, but it's not "nonces". Bumping the fee just more accurately shares resources on the network.

sleueth commented 2 years ago

@aeyakovenko I see, yes, you can choose to make the accounts writable before submitting the tx by running a simulation first. Reasonable solution. This still limits throughput insofar as you're dealing with maybe stale information by the time your tx executes. But I do agree that it's probably required... any finite resource (here, writable accounts) needs some fee to discourage effectively squatting on accounts.

So never mind me, carry on! Thank you as always for building!

jstarry commented 2 years ago

What does "signing state" mean?

aeyakovenko commented 2 years ago

What does "signing state" mean?

changed to "if the senders are signing a transaction that is simulated against recent state."

buffalu commented 2 years ago

my rambling comment...

My initial thought is to KISS. Hotter accounts + write locks -> higher fees. It's almost like account based gas fees.

The "weight" of a block is essentially proportionate to the number of entries + shreds it produces along with the compute used, so it seems like there should be a higher charge for anything that causes that. Perhaps better packing transactions could help. On that note, perhaps trying to pack TXs looking outside of the PACKETS_PER_BATCH would help get less-hot accounts process faster and would kinda cause account-based queues inside the banking stage? need to think more if ideal property or not.

The thing I need to think about more wrt simulation in the aggregator case is MEV. If I see aggregator TX but know it's only going to hit exchange X, can I sandwich them easier? If they're hitting multiple exchanges, that might become a little harder, but I supposed the sandwicher can simulate and know the outcome either way.

We still need dynamic global fees on write locks. But, the huge advantage to this, is that it’s possible for a program to build its own congestion control and capture fees. A market can track the recent volume of trades and increase its own fees on small txs. Bots can’t avoid these fees via simulation, and they can’t spam optimistically looking for a cheap trade and fail.

can you explain how this would work? a market validators set or enforced app/protocol side? how would this be enforced? i guess dapps want a good experience for their users, validators want to make more money. need to balance that somehow.

@jstarry it does. The issue with "invalid" transactions is that they are created against a stale state. Eth forces every TX to update a nonce, which throttles bots. It's a much better environment if senders who are sending against stale state can't increase fees on senders who are sending against valid state.

By “much better” is that it allows the developer/Program capture more of the fees. If the network fees are increased indiscriminately that takes away value capture from the layer above.

for "binary" bots where outcome is 0 or 1, it seems like they could just mutate the state a tiny bit then you end up back at square 1 where you have the same spam + run against compute limit and need to increase fees.

Eth forces every TX to update a nonce, which throttles bots.

this is a non-problem for any bots; its how they prevent double spends (like blockhash). similar to nonce, any skilled bot writer could keep a deque of blockhashes and do multiples signs + spam.

i'd also want to see access patterns for these things - some data we'll hopefully have soon. these periods of degraded performance last several hours, but there are also probably quick bursts during volatility.

tl;dr: kiss. access patterns that degrade parallelism of system -> charge more.

buffalu commented 2 years ago

also, we've talked a bit about not even executing txs; just do sigverify + pack blocks.

in that case, it becomes a replay stage/tvu problem. seems like you come back to execution speed/parallelism in that case too.

ryoqun commented 2 years ago

here's my two cents:

some thoughts for the above proposal (= exponential increase fee for unused writable data accounts)

there's risk for normal users: victim's stalled transactions (now triggering slippage tolerance yet within recent_blockhash expiration) could be exploited for selfish validators right after 2^n-ing the fee.
not fan of burdening these congestion problem to higher level (i.e. dapp devs)
the inevitable N-slot-weighted triggering mechanism means some feedback latency to the bursts

instead how about introducing the merciless transaction execution mode, which takes advantage of parallelizable nature of error transactions (thanks to the state rollback/aborted execution from the very definition of tx).

so, when spam activity is going, leaders start to proactively try to execute transactions with the end of previous block state without write lock at the maximum concurrency. and filter out any failed transaction, to be packed into the newly gutter block entry later, which is multiplexed into turbine along side the normal entry shreds.

for successful txes from the proactive execution, leader re-executes normally with write lock with tip of the current block state and pack them into normal entries. until the tick is full, leader packs as much as possible these failed transactions (only with fee payer debiting) into gutter block entry while packing normal transactions into normal block entry.

pros

less complex then the above idea? and generic-purpose and fee isn't fiddled dynamically.
gpu friendly?
pretty quick to react any spikes/peaks.
bots experience hefty tx fee burden because the cluster can eat all of 'em without serialized execution.

cons

normal transaction needs to be executed twice
bankless leader unfriendly
dapps should be written to fail first, rather than no-op success tx (this prevents some possibly valuable accounting?) risk: normal transaction is susceptible to false positive, if it fails at the parent block's state (otherwise it executes successfully at the tip of the state)
- (i think this is tolerable?)

aeyakovenko commented 2 years ago

@buffalu

Kinda rambling

what's confusing? i write like C engineer, structs at the top :)

can you explain how this would work? a market validators set or enforced app/protocol side? how would this be enforced? i guess dapps want a good experience for their users, validators want to make more money. need to balance that somehow.

Fees need to double when the block is full, like total gas is > average load for N blocks, fees double for the next 2*N blocks

for "binary" bots where outcome is 0 or 1, it seems like they could just mutate the state a tiny bit then you end up back at square 1 where you have the same spam + run against compute limit and need to increase fees.

so that program that takes no action will get fifo ordering on whatever lands. That's totally up to the dev. That's the point! Only way to avoid the platform fee is to force control to the program so a program like serum can do its own congestion control and capture value from bots.

tl;dr: kiss. access patterns that degrade parallelism of system -> charge more.

yea, write locks that are not necessary degrade system perf. That's what this does.

this is a non-problem for any bots; its how they prevent double spends (like blockhash). similar to nonce, any skilled bot writer could keep a deque of blockhashes and do multiples signs + spam.

yea, eth nonce thing is a bit of a non sequitur.

aeyakovenko commented 2 years ago

there's risk for normal users: victim's stalled transactions (now triggering slippage tolerance yet within recent_blockhash expiration) could be exploited for selfish validators right after 2^n-ing the fee.

They know the max they would pay, blockhash signs it, we would need to dump all the old blockhashes for that write account and have users resign. Which is why 8,16 seems reasonable retry time.

@ryoqun, your proposal is rather complex from the runtimes perspective. speculative execution that doesn't mutate state is hard. This is just messing with the fee governor.

The. main innovation here is that applications can now do their own congestion control. SRM can charge in SRM to bot traffic. more value captured into serum.

buffalu commented 2 years ago

The. main innovation here is that applications can now do their own congestion control. SRM can charge in SRM to bot traffic. more value captured into serum.

ah okay, so you're suggesting that on-chain apps will change fees and those fees go to platform instead of validator? very interesting, need to think about this more. would want to make sure incentives are aligned with validator operators for them to process less txs that don't allow them to earn as much from tx fees

do you have a rough formula for where you see tx fees being derived from? something like: tx_fee = {number of sigs} + {protocol write lock increasing fee} + {app congestion control}?

if a validator is running MEV software, does that change anything?

askibin commented 2 years ago

Important news and large price swings are often come with high market activity, not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if someone values extra fees more than market share. This might apply to blockchain as a whole.

Maybe it makes sense to keep track of the most actively write-locked accounts (both succeeded and failed), let's say top-100 (I bet they represent 98% of the activity), and have an extra fee that is a linear function of account usage frequency. The extra fee is applied to every such account included in the transaction, but only if the transaction has been failed.

This way, there is no adverse fee effects and no need for additional transaction simulation step for well-behaving clients.

It wouldn't help against the intentional attack, though - someone could just send 1 lamport(or token) to the most active raydium pools hundred times per sec, cheap, and transactions succeed. As a second layer of defense, per account fees could be applied to all transactions if account usage frequency stays too high for too long. It will also force protocol developers to break down their stuff when it becomes too popular.

aeyakovenko commented 2 years ago

Important news and large price swings are often come with high market activity, not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if someone values extra fees more than market share. This might apply to blockchain as a whole.

if it hits capacity limits, then the chain has to increase fees for that write lock aka dapp/market. so users will pay sol instead of what the program wants. giving control to the program means that they can rebate users, or holders, or do whatever they want.

Maybe it makes sense to keep track of the most actively write-locked accounts (both succeeded and failed), let's say top-100 (I bet they represent 98% of the activity), and have an extra fee that is a linear function of account usage frequency. The extra fee is applied to every such account included in the transaction, but only if the transaction has been failed.

That's effectively what this does. linear doesn't work to force bots to back off though. If the program wants to force a linear fee on usage it can do so because all the transactions that will land will succeed. All a well behaving bot/user has to do is correctly simulate against recent state.

jstarry commented 2 years ago

Important news and large price swings are often come with high market activity, not just bots, but legitimate broad activity. If some markets started increasing fees during such periods, they would risk losing users to competitors who don't do it. Not sure if someone values extra fees more than market share. This might apply to blockchain as a whole.

if it hits capacity limits, then the chain has to increase fees for that write lock aka dapp/market. so users will pay sol instead of what the program wants. giving control to the program means that they can rebate users, or holders, or do whatever they want.

There are a few issues with the current proposal (there's some overlap here with @ryoqun's comments above too):

It's difficult for programs to detect and monitor congestion themselves. Programs only have control over this matter if a transaction which invokes the program is actually processed. We can only process conflicting transactions sequentially so we need a way to inform the program of how much back pressure there actually is so they know the difference between perfectly tuned processing and heavily congested processing. Programs need to adjust fees quickly enough to prevent DOS attacks on programs from lasting too long but they are hamstrung by only getting info about congestion from what they are actually processing
It doesn't take read-locks into account. Legitimate transactions which are fairly incentivized for a given program can still be read-starved. For example, consider two popular programs which always write lock their own state but read lock their complement's state. These transactions cannot be processed sequentially and so this proposal doesn't give a good generic solution to this problem

Alternate Proposal

What I suggest is that we do the inverse of what you're proposing. Instead of increasing fees for unused writable accounts, always increase fees for used accounts but send a portion of the increased fees to the program so that they can do the rebate. This still gives programs a lot of control over how to incentivize proper behavior but at the same time gives the runtime a general approach for limiting the amount of contentious transactions.

To combat read-lock starvation and other scheduling issues, the runtime could track higher level heuristics about the transactions it has processed and the relationships between processed transactions and even include a proof about transactions it couldn't schedule.

t-nelson commented 2 years ago

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

aeyakovenko commented 2 years ago

It's difficult for programs to detect and monitor congestion themselves. Programs only have control over this matter if a transaction which invokes the program is actually processed.

this mechanism forces the bots that invoke the program that fail to back off, because all failures become exponentially more expensive. If it's possible for us to change for successful write locks, it means program logic is being run, so I am not sure how the suggested approach helps.

We can only process conflicting transactions sequentially so we need a way to inform the program of how much back pressure there actually is so they know the difference between perfectly tuned processing and heavily

TX can do a system call on the cost model for the Account, or it can do it sown metering. Trades per slot, Slots since liqudation/oracle update, etc...

. Programs need to adjust fees quickly enough to prevent DOS attacks on programs from lasting too long but they are hamstrung by only getting info about congestion from what they are actually processing

Yea, they can do this logic on their own. if an oracle update is expected every 8 slots, they can 10x the fees immediately.

It doesn't take read-locks into account. Legitimate transactions which are fairly incentivized for a given program can still be read-starved. For example, consider two popular programs which always write lock their own state but read lock their complement's state. These transactions cannot be processed sequentially and so this proposal doesn't give a good generic solution to this problem

We have a ton more flexibility in read lock contention. There is no promise that a write isn't inserted before or after any sequence of reads.

What I suggest is that we do the inverse of what you're proposing. Instead of increasing fees for unused writable accounts, always increase fees for used accounts but send a portion of the increased fees to the program so that they can do the rebate

how do we know how much to increase fees by, how fast to do it and how fast to back off? what if it's not enough for oracles, but too much for traders? Also, the program can control what token the fees are implemented in, and may not want to distribute sol to its users, may want to distribute the market token, or the projects token instead.

With the "failed policy" I feel like we can be fairly aggressive in how fast they are forced to back off, and setting the cap to 25% of the total gives the program a guarantee that some non failed TXs land to talk to it.

ryoqun commented 2 years ago

@ryoqun, your proposal is rather complex from the runtimes perspective. speculative execution that doesn't mutate state is hard.

@aeyakovenko we can just use the same mechanism used by the simulateTransaction rpc method.

put differently, my proposal is exceptionally legalizing including many (probably bot-initiated) erroneous transactions more cheaply, if the transactions fail with the same (frozen) parent bank state. so this should avoid write_lock contention to execute them at once and state mutation is irrelevant (only fee debiting must be done, specially, also exponentially if solana want to punish hard). this is only triggered when under cluster-wide congestion.

This is just messing with the fee governor.

yeah, but shifting this onto dapps?

The. main innovation here is that applications can now do their own congestion control. SRM can charge in SRM to bot traffic. more value captured into serum.

needless to say, this is cool if done correctly.

i'm a bit skeptical for whether being able to do this right on-chain or not. i'm too naive. i just want to retain solana's simple selling point that is is infinitely fixed-cost/fast/cheap from the dapps perspective.

ryoqun commented 2 years ago

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

haha, good call. in that respect, how each competing daap is incentivized to implement their own congestion mechanism properly (without disadvantage to others)? it's like any given service (here congestion controll) must be done from private/public sector in politics. xD

ryoqun commented 2 years ago

also, if targeted program implements any such system, how to prevent bot transaction's calling program from peeking the state and avoid CPI if the cost is too high? then, botters can be rest assured and hit the cluster as much as possible?

jstarry commented 2 years ago

also, if targeted program implements any such system, how to prevent bot transaction's calling program from peeking the state and avoid CPI if the cost is too high? then, botters can be rest assured and hit the cluster as much as possible?

I think this is actually the main issue that @aeyakovenko brought up in the issue description. If bots want to conditionally CPI into a program, they still need to write lock the state. If they don't end up doing the CPI because the conditions aren't right, they will still be penalized for not writing to any state accounts owned by the target protocol.

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

Thanks for this @t-nelson, sorry if I've detracted from the specific problem this issue is trying to solve. Can you elaborate on what you meant by the issue of contention "spilling out as a negative externality on the rest of the cluster"? I don't really understand.

In my own words, I would describe the current issue to be a lack of penalty to write locking state that you don't actually mutate. As long as there is no way for programs to balance usage / availability of their state either directly or indirectly, they could be subject to congestion.

this mechanism forces the bots that invoke the program that fail to back off, because all failures become exponentially more expensive. If it's possible for us to change for successful write locks, it means program logic is being run, so I am not sure how the suggested approach helps.

The current proposal incentivizes bots to change their tx behavior to 1) have their transaction succeed and 2) always make some change to the writable accounts they locked to avoid the unused account penalty. Once they figure out a cost effective way to do that, they can carry on spamming the program without any fee increases. Of course, the program can observe the contentious behavior and modify the protocol to prevent any cheap mutations to state.

We have a ton more flexibility in read lock contention. There is no promise that a write isn't inserted before or after any sequence of reads.

If this is the case, then I'm onboard. But doesn't this imply that we can interleave reads with writes? If the max compute units are reached for a single writable account doesn't that imply that access to that state in a block is exhausted? Is the solution allowing transactions to read from stale state to avoid conflicts with writes? (I'm happy to draft up a separate proposal for this, if so)

TX can do a system call on the cost model for the Account, or it can do its own metering. Trades per slot, Slots since liquidation/oracle update, etc...

Cool, a system call like that would probably do the trick.

how do we know how much to increase fees by, how fast to do it and how fast to back off? what if it's not enough for oracles, but too much for traders?

Good points here, maybe we should give even more control to the programs here so that they can set fees directly for the state they manage?

Also, the program can control what token the fees are implemented in, and may not want to distribute sol to its users, may want to distribute the market token, or the projects token instead.

With the "failed policy" I feel like we can be fairly aggressive in how fast they are forced to back off, and setting the cap to 25% of the total gives the program a guarantee that some non failed TXs land to talk to it.

The failed policy is nice and simple but maybe not granular enough and too complicate and burdensome to protocols to tune correctly. I get that you want protocols to be able to charge fees in their preferred tokens but what if we didn't do that and just let programs directly set the read or write cost of each account in SOL? I think this would help protocols separate business logic from congestion control because it would be more easily tunable and automatically enforced by the runtime. If those lock fees go back to the program, the program could then rebate fees in whichever currency they want to the user.

aeyakovenko commented 2 years ago

@jstarry @taozhu-chicago given this model, I think what makes the most sense is prioritization of write locks > read locks when the cost model has an option. But we need to think about it. I think the optimal would be to put all the reads for a contentious write lock at the start of the block. So the rest of the block the writes go without interruption.

Write A = 100 CUs Write B read A = B should inherit 100 CUs

aeyakovenko commented 2 years ago

The failed policy is nice and simple but maybe not granular enough and too complicate and burdensome to protocols to tune correctly

@jstarry why? programs now can assume that all transactions succeed, and can do whatever. Control flow is purely in the programs hands.

I get that you want protocols to be able to charge fees in their preferred tokens but what if we didn't do that and just let programs directly set the read or write cost of each account in SOL?

Why is that better though? How would a program know how much to set read costs and how would it be able to do so if attacker is spamming the account with failed writes? There is a ton of simplicity gained for the program developer if the assumption they operate under is that all txs that call the program succeed.

mschneider commented 2 years ago

Increasing fees on the dex level poses severe challenges for traders. Can't arb well if you can't estimate fees. Can be circumvented by using an on-chain program (with the exact behaviour we want to prevent), but now we just made it 100x harder to start trading arb strategies on the dex.

aeyakovenko commented 2 years ago

@jstarry i really like the idea of depositing the sol into the writable account that is being spammed

aeyakovenko commented 2 years ago

@mschneider It wouldn't be an issue under normal operation. fees only go up if the account is being writelocked with failure > 25% of the total compute budget (maybe that should be 50%). ARBs shouldn't be optimistically spamming the network to the point of eating up normal users.

mschneider commented 2 years ago

Arbs will place ioc limit orders which will produce failed transactions in fast moving markets, especially if there are multiple takers competing for liquidity. You might just produce a lot of failed tx, because someone else beat you to place an order. We see a very similar dynamic with liquidations. You can call it optimistic spamming, or just simply the result of 20 bots trying to execute the same operation, when only the first one will succeed.

Will a dynamic congestion based fee model incentivize those traders to stop posting more orders, yes w/o doubt.

For me the real questions here are:

1) Should we punish them for being slow and increase the gap to the high speed traders even more?

2) Should we ask them to build complex software to be able to account for dynamic fees and make it even harder to start trading?

askibin commented 2 years ago

@mschneider, if bots can spam the network to the point that it is unusable to anyone, something should be done about it. I think the proposed solution can address that - unused writable accounts mean you are either using a stale state or your tolerance to change too low, and you are likely a bot. But I would prefer seeing bots pay the full price for unused accounts instead of charging everyone and then hoping for a rebate mechanism to work. I think there is no way to avoid dynamic fees in general unless they are always high, so bot builders will need to adjust and write more responsible software.

aeyakovenko commented 2 years ago

@mschneider In this proposal, there is some capacity for failure, and we are always working on increasing the capacity. So as the software gets better and better hardware is deployed on the network there is more room for error.

tomland123 commented 2 years ago

Hello, the main problem I see with this proposal is that it doesn't solve any long term problem and just adds complexity for devs everywhere in the ecosystem.

Right now there is still only about 10-15 protocols on solana that are actively botted. In twelve months, there will likely be a hundred that have similar usage to the ones today and those 10 protocols will also have more users. In two years, if all goes well, that number will likely be a thousand bottable protocols. So let's say you increase the fees exponentially on a couple of writeable accounts--if you have thousands of protocols that are doing/strive to do huge volume, did any solution actually happen or did we just increase the complexity for devs everywhere?

Moreover, what engineering design does this fee change incentivize (if its truly exponentially more expensive in a meaningful way)?

It will encourage builders to deploy as many accounts as possible in order to get around this fee structure and reclaim them later.

I also question the notion that this hurts bots more than normal users. Bots dont use free rpc nodes and rarely have a stale state. Normal users do and if they are paying 100x fees because the website told them to use genesysgo its just not a good look and will hurt adoption significantly.

aeyakovenko commented 2 years ago

did any solution actually happen or did we just increase the complexity for devs everywhere?

The solution is described in the proposal. users that correctly simulate transactions prior to sending them are unaffected, bots that flood a single market have to pay higher fees.

It will encourage builders to deploy as many accounts as possible in order to get around this fee structure and reclaim them later.

That would maximize parallelism in the chain, which would allow more transactions to go through. so a good thing.

I also question the notion that this hurts bots more than normal users. Bots dont use free rpc nodes and rarely have a stale state. Normal users do and if they are paying 100x fees because the website told them to use genesysgo its just not a good look and will hurt adoption significantly.

Where do you see anything to do with RPCs in the proposal? It doesn't matter where the transaction is coming from

tomland123 commented 2 years ago

That would maximize parallelism in the chain, which would allow more transactions to go through. so a good thing.

I think you are potentially underestimating the second order effects that will happen if people start creating thousands of accounts every two weeks all at the same time.

Moreover, I feel pretty strongly that it should be the DApp developers responsibility to write good software rather than to add more complexity to Solana.

For example, we are all making an assumption that an orderbook like Serum needs to use SPL tokens, however because of that assumption many of their interactions are significantly more expensive than they theoretically need to be.

Same with Raydium. Users can IDO in a number of ways (a lot of people have been using the IDO pool that donderper wrote), but they choose to use Raydium because the bot spam is very hype and devs love the instant gratification. There are no write-lock complaints with the IDO pool.

Mango write lock global accounts and that just blocks everything in their code if its invalid. They could optimize this more and most of their issues would just disappear.

The idea that we are awarding protocols for writing less than ideal software at the cost of everyone else also makes me upset and it doesnt strike me as a great fix because its either going to be completely useless after its deployed and every dev will be forced to support this logic for all of eternity despite it being obsolete. Or just as bad, this will be so effective that everyone will try to optimize how to make money from this at the cost of the ecosystem. And at this point we didnt create a long term solution, but rather a quick hack that has made everyone work significantly harder.

I agree with Jstarrys original suggestion, if for example, you just charged 5x more transaction fees to use protocol X more then protocol Y because protocol Y uses 1/5th of the writable accounts, I think a lot of the congestion would disappear naturally and it would incentivize devs to write more aligned software that work in Solana. I dont know if this change by itself is enough, but it would be a small change in what I feel is the right direction.

Where do you see anything to do with RPCs in the proposal? It doesn't matter where the transaction is coming from

RPC nodes that simulate a stale block hash and then send a faulty txn to the leader node with stale data? I don't understand this assumption where RPC nodes work during congestion. Public ones are hit much harder than private ones.

The solution is described in the proposal. users that correctly simulate transactions prior to sending them are unaffected, bots that flood a single market have to pay higher fees.

I dont want to publicly share why I feel this assumption is wrong. But I feel this is naive.

t-nelson commented 2 years ago

@tomland123 I think you are here :point_down:

As I understand this proposal, it's not intended to prevent contention on a given program/program-owned-resource, but rather to prevent this contention from spilling out as a negative externality on the rest of the cluster. Detractors here seem to be trying to solve the other problem?

This isn't a one or the other thing. Both problems need to be solved eventually. This proposal doesn't reward dapps for writing bad code. It attempts to isolate their mess so the rest of the cluster doesn't have to deal with it. Whether they want to fix their code or charge a fee is up to them. If the experience is bad, some one will build a competing product and eat their lunch.

tomland123 commented 2 years ago

Just to be clear my detraction is that this complicate things a lot for devs and has spill-over for things that might cause huge problems which are being covered up right now because there is no financial benefit to do those things. The only reason I mentioned the other thing is because I think small improvements to incentivize devs to build in alignment with the network will make this problem go away naturally (if I am wrong then I have no issues with this change though).

Moreover, I cant imagine market making on foresight without knowing what i have to pay for a txn.

t-nelson commented 2 years ago

It does nothing to the dapp developer other than reduce their ability to blame the chain for poor contract performance. Whether/how they want to solve that problem is up to them.

tomland123 commented 2 years ago

yes it affects dapp developers. A dapp going from 2000 accounts to 200,000 accounts so that botters can avoid paying fees is a huge increase that most dapps are not prepared to handle.

t-nelson commented 2 years ago

It's not the fee-payer account that's going to get rate limited at the protocol level. It's the market accounts, order books, event queues, etc which the dapp dev fully controls

tomland123 commented 2 years ago

there are usually only two or three different bots spamming right now per market at a time so I am questioning that assumption but maybe it would only increase to 100, instead of 20000 idk it depends on the implementation I guess. Really depends on how the dapp was built too

t-nelson commented 2 years ago

All of these get hot according to this proposal

Screenshot from 2022-01-19 00-49-30

tomland123 commented 2 years ago

OK. I like the change

aeyakovenko commented 2 years ago

@tomland123

Mango write lock global accounts and that just blocks everything in their code if its invalid. They could optimize this more and most of their issues would just disappear.

The problem is that Mango can't defend against an arb bot that takes write locks on mango accounts and exits without giving control to the mango program. There isn't much mango can do to fix this form of starvation.

tomland123 commented 2 years ago

The problem is that Mango can't defend against an arb bot that takes write locks on mango accounts and exits without giving control to the mango program. There isn't much mango can do to fix this form of starvation.

I dont understand this. If they made it so that write locks only affected users after fifteen minutes, you would need to spam very hard for 15 straight minutes and pray that none of the mango txn get through. This seems to be a pretty good defense I think? But like I said, I do like the change and it will allow for faster software.

t-nelson commented 2 years ago

I dont understand this. If they made it so that write locks only affected users after fifteen minutes, you would need to spam very hard for 15 straight minutes and pray that none of the mango txn get through. This seems to be a pretty good defense I think? But like I said, I do like the change and it will allow for faster software.

Dapp developers have no control over write locks taken on their accounts, no guarantees that write locks on accounts owned by their program are being taken by transactions that reference their program's instructions and no ability to prevent later instructions in the transaction from intentionally failing based on unfavorable outcomes. This must be implemented in the runtime

buffalu commented 2 years ago

it seems to make sense to me to do fees/gas similar to ETH with the solana spice on top.

you have max gas usage (compute) + then gas price (priority) as new parameters in message.

max gas usage determined by running tx simulation with additional overhead (10-20%, handled by wallet automatically). if max gas usage too low at runtime, tx fails but all fees subtracted.

the extra spice in this case would be something relating to account locking + contention + whatever else is deemed necessary to accurately price blockspace.

dafyddd commented 2 years ago

I wrote this in a thread elsewhere and I'm copying over here: Right now you can just access all the most popular writable accounts in one tx and essentially make the network single threaded. That should be very expensive to do. Solana should charge tx fees based on how often an account is read from or written to. There should be a multiplier for accessing a popular account as writable over readonly. The price for each account can be adjusted dynamically with some pricing formula (maybe some EMA of usage per slot? pricer should be exponential and not linear i think). Then compute used in the tx can be multiplied by the sum of the base charges for each account to give a full charge for the tx.

e.g. I send a tx using accounts A, B as read only and C as writable. The pricer says A = 1, B = 4 and C = 2 and writable has a 5x multplier. And my tx costs 10k compute. So total cost = (1 + 4 + 10) * 10k = 150k lamports. Then you can update the pricer for each account.

I don't know how feasible it is to implement my idea but I like it for a few reasons:

It is still somewhat deterministic on fee pricing. Nobody likes choosing how much gas they want to pay. The mental complexity there really sucks
It comes from the principle that you should be charged the larger your externality to the network. The more popular a resource and the longer you hold it, the more you're charged. The intuition is that someone taking all Serum books, raydium, orca, mango and drift accoutns into one tx and doing complex calculations for 1m compute is doing a lot more damage than someone doing 1m compute on one writable account. This also protects against the problem @mschneider was talking about where 20 liquidators all go for a liquidation and the first guy gets it. The other 19 liqors won't pay much because they'll exit before using much compute.
It incentivizes devs to develop more parallelizable and less spammy mechanisms. Right now, devs are handing out economic value to people who spam a lot and are first. This leads to a lot of congestion that affects other network participants but is still profitable to the spammer. I have a theory that what you really need is distribution of writable accounts to be varied enough that entries created by leader are large.

NorbertBodziony commented 2 years ago

e.g. I send a tx using accounts A, B as read only and C as writable. The pricer says A = 1, B = 4 and C = 2 and writable has a 5x multplier. And my tx costs 10k compute. So total cost = (1 + 4 + 10) * 10k = 150k lamports. Then you can update the pricer for each account.

I wonder how impactful can this pricing model be since it will turn validators into big Knapsack problem solvers. With large quantities of accounts and potentially each have different cost we might end up wasting performance looking for potential Pareto points.

IMO both charging for write access or unused write access are fine solutions and rather simple to implement. More elaborate ways of pricing might be better, but it seems like we kinda need a solution sooner than later.

aeyakovenko commented 2 years ago

@dafyddd

wdyt of

Market continuously sets the write_lock_fee to the average additional_fee over the last 100 slots (40 seconds)

this will automatically increase the fees when account is congested.

solana-labs / solana