anza-xyz / octane

Octane is a gasless transaction relayer for Solana.
Apache License 2.0

Mutex used to lock out spamming attacks is not likely to work as Vercel infra scales to handle more traffic #2

Open · steveluscher opened this issue 2 years ago

steveluscher commented 2 years ago

📝 Note: Implementing #1 would pretty much solve this too, with some light tweaks.

Preamble

Consider this code that's designed to prevent attackers from spamming Octane's generous offer to sign and simulate a transaction:

https://github.com/solana-labs/octane/blob/master/src/api/transfer.ts#L23-L32

That code uses a module-local Set to track which source accounts have in-flight transactions:

https://github.com/solana-labs/octane/blob/master/src/api/transfer.ts#L7
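Roughly, that pattern looks like this (a paraphrase of the linked lines, not the exact contents of transfer.ts):

```ts
// Paraphrase of the in-memory lock pattern (names approximate).
// The Set lives in module scope, so it only exists within a single serverless instance.
const sourceAccountsInFlight = new Set<string>();

export async function handleTransfer(sourceAccount: string): Promise<void> {
    if (sourceAccountsInFlight.has(sourceAccount)) {
        // Reject: a transaction for this source account is already being processed.
        throw new Error('A transaction for this source account is already in flight');
    }
    sourceAccountsInFlight.add(sourceAccount);
    try {
        // ... sign, simulate, and broadcast the transaction ...
    } finally {
        sourceAccountsInFlight.delete(sourceAccount);
    }
}
```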

Problem

Consecutive requests from an attacker are likely to hit the same thread/instance of the serverless function, but from what I understand this is not guaranteed.

From the Vercel docs:

> For example, a Serverless Function handles an incoming request, runs some computation, and responds. If another request comes to the same path, the system will automatically spawn a new isolated function, thus scaling automatically.
>
> In contrast, processes and containers tend to expose an entire server as their entrypoint. That server would then define code for handling many types of requests.
>
> The code you specify for handling those requests would share one common context and state. Scaling becomes harder because it is difficult to decide how many concurrent requests the process can handle.

If consecutive requests from the attacker hit different instances of our serverless API function, each instance will have its own Set of source accounts, and each will permit the simulation/broadcast, rendering the defence mechanism ineffective.

Possible solution

In general, with horizontally scalable systems of lambdas, you can rely on a shared state service to implement a mutex. We could consider building in support for something like Upstash, a hosted Redis service. We could implement the source account lock as a [Redis distributed lock](https://redis.io/topics/distlock).

https://docs.upstash.com/redis/howto/vercelintegration
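A minimal single-node sketch of that lock, assuming ioredis and a `REDIS_URL` environment variable (both assumptions here, not something Octane ships today):

```ts
import Redis from 'ioredis';

// Hypothetical config: REDIS_URL would point at an Upstash (or any other) Redis instance.
const redis = new Redis(process.env.REDIS_URL!);

const LOCK_TTL_MS = 30_000; // auto-expire so a crashed instance can't hold the lock forever

// Try to acquire a per-source-account lock. SET ... NX only succeeds if the key is absent,
// so concurrent serverless instances agree on a single winner.
export async function acquireLock(sourceAccount: string, token: string): Promise<boolean> {
    const result = await redis.set(`lock:${sourceAccount}`, token, 'PX', LOCK_TTL_MS, 'NX');
    return result === 'OK';
}

// Release only if we still own the lock (compare-and-delete via Lua, as in the distlock docs).
export async function releaseLock(sourceAccount: string, token: string): Promise<void> {
    const script = `
        if redis.call('get', KEYS[1]) == ARGV[1] then
            return redis.call('del', KEYS[1])
        else
            return 0
        end`;
    await redis.eval(script, 1, `lock:${sourceAccount}`, token);
}
```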

jordaaash commented 2 years ago

Yeah, good points. It's true that it's not guaranteed to hit the same thread. We also probably lose these lockouts and the txids (#1) if the function becomes idle. The IP-based rate limiting is subject to the same problem.

We could use an external store for all this persistent info. A downside is that it makes this more complex to deploy. Perhaps we can find a way to make it optional/modular with some kind of drop-in Store interface.
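Sketching that idea, the interface could be as small as the following (names are hypothetical; nothing here exists in Octane yet):

```ts
// Hypothetical drop-in interface for the lock/rate-limit state.
export interface LockStore {
    // Returns true if the lock was acquired, false if another request already holds it.
    acquire(key: string, ttlMs: number): Promise<boolean>;
    release(key: string): Promise<void>;
}

// Default implementation: in-memory, fine for a single long-lived process,
// but not shared across serverless instances.
export class MemoryLockStore implements LockStore {
    private locked = new Set<string>();

    async acquire(key: string, _ttlMs: number): Promise<boolean> {
        if (this.locked.has(key)) return false;
        this.locked.add(key);
        return true;
    }

    async release(key: string): Promise<void> {
        this.locked.delete(key);
    }
}
```

A Redis-backed implementation of the same interface could then be swapped in by deployments that opt into an external store.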

jordaaash commented 2 years ago

> 📝 Note: Implementing #1 would pretty much solve this too, with some light tweaks.

I'm not sure using txids solves this problem, because a spammer could build transactions with a bunch of different recent blockhashes that all target the same source token account and would thus have different signatures. Maybe you have another solution in mind though?

A related issue: assuming there are several instances of Octane (let's say, run by different parties), a spammer could hit all of them and make them race for it, wasting gas for everyone but the winner.

steveluscher commented 2 years ago

> I'm not sure using txids solves this problem, because a spammer could…

Oh yeah; I just meant that if we stood up a persistent store in #1 we could use it to store these locks.

> Perhaps we can find a way to make it optional/modular with some kind of drop-in Store interface.

I like this. Optional, with a build-time warning that you should really consider standing up a for-real persistent store.
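Detecting this at build time is harder, but a startup-time approximation could be as simple as the following (`REDIS_URL` is a hypothetical config name):

```ts
// Hypothetical startup check: warn loudly when no shared store is configured.
if (!process.env.REDIS_URL) {
    console.warn(
        'Octane is using an in-memory lock store. Locks and rate limits will NOT be shared ' +
            'across serverless instances; configure a persistent store (e.g. Redis) for production.'
    );
}
```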

I feel like there's a startup somewhere in here.

jordaaash commented 2 years ago

> spammer could hit all of them and make them race

This is probably solvable with captchas, but I think that reduces the chance of getting integrated with wallets. In general it would be nice to not hurt the UX like that.

jordaaash commented 2 years ago

> make it optional/modular with some kind of drop-in Store interface

Turns out https://github.com/BryanDonovan/node-cache-manager#store-engines already does this.

#3 implements the default memory store:

https://github.com/solana-labs/octane/blob/8479af3e78a4a613006405c535ba052aaac5b2f7/src/core/cache.ts#L1-L3

We could provide config for the Redis store, maybe along with https://github.com/wyattjoh/rate-limit-redis, and add it to the docs or examples or something.
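As a sketch under those assumptions (cache-manager-redis-store as the Redis adapter, an optional `REDIS_URL` env var, illustrative `max`/`ttl` values; none of this is wired up in Octane yet), the cache could be made configurable along these lines:

```ts
import cacheManager from 'cache-manager';
// cache-manager-redis-store is one possible adapter; any store engine from the
// node-cache-manager README could be swapped in the same way.
import redisStore from 'cache-manager-redis-store';

// Fall back to the existing in-memory store when no Redis URL is configured.
export const cache = process.env.REDIS_URL
    ? cacheManager.caching({ store: redisStore, url: process.env.REDIS_URL, ttl: 60 })
    : cacheManager.caching({ store: 'memory', max: 1000, ttl: 60 });
```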