Using SipHash with private node key

gorgos commented 2 years ago

I came across this section in the documentation:

Solidity takes the keccak 256 hash of the key and the storage slot, and simply uses that to find the entry. There are no hash collision chains. This scheme is simple and avoids “hash flooding” attacks where the attacker chooses data which hashes to the same hash collision chain, making the hash table very slow; it will behave like a linked list.

In order to implement mappings in memory, a new scheme must be found which avoids this attack. Usually this is done with SipHash, but this cannot be used in smart contracts since there is no place to store secrets. Collision chains are needed since memory has a much smaller address space than the 256 bit storage slots.

Any suggestions for solving this are very welcome!

But it makes me wonder, why not use SipHash and have each node create its own private key? As long as they process transactions from the beginning without changing this private key, that should work, right?

Maybe I'm missing something though.

seanyoung commented 2 years ago

Hi @gorgos thanks for that.

As I understand it, a smart contract runs on-chain, and it is stored on the chain (on solana, substrate, and ethereum at least). The program code and storage is completely public. There is no place to store a private key.

What do you mean by node?

gorgos commented 2 years ago

@seanyoung Yeah, I mean don't store it in the smart contract, but fundamentally all code in a smart contract will be executed on a node, usually a validator for staking blockchains which is a computer that's participating in the consensus mechanism. Each node can have its own private key for usage with SipHash making “hash flooding” attacks impossible, because an attacker won't know where each node hashes the values to.

One effect of this would be that each node hashes the value to a different spot, since they all have different private keys, but that shouldn't be a problem.

seanyoung commented 2 years ago

@gorgos this would require that the blockchain node provides such a hash datastructure, but no blockchain provides that. solang is a compiler and not a blockchain.

Now you could propose this datastructure for each blockchain, but this is unlikely to succeed.

gorgos commented 2 years ago

@seanyoung How are the blockchains protecting themselves from hash flooding in their native languages?

seanyoung commented 2 years ago

As far as I know, they just avoid hash tables with collision chains.

gorgos commented 2 years ago

@seanyoung Then wouldn't those chains suffer the same flooding attack scenario even in their native language?

seanyoung commented 2 years ago

Can you give an example please?

gorgos commented 2 years ago

@seanyoung After speaking to @xlc from Acala, he mentioned that Substrate Wasm smart contracts are by default using on-chain storage on the native layer which is using the standard Rust hash table functionality that includes calculating a random value. In other words at least for Substrate is should be possible, not sure about the other chains.

seanyoung commented 2 years ago

Substrate/Ethereum storage is a key-value store, so in the case the problem is handled by the runtime. The problem description talks about mappings in memory, so that does not help there.

gorgos commented 2 years ago

Ah, didn't see that, mappings in memory should be small enough to not suffer from this I think.

gorgos commented 2 years ago

@seanyoung This is from the original Solidity docs:

Mappings can only have a data location of storage and thus are allowed for state variables, as storage reference types in functions, or as parameters for library functions. They cannot be used as parameters or return parameters of contract functions that are publicly visible. These restrictions are also true for arrays and structs that contain mappings.

Why does Solang require mappings in memory?

seanyoung commented 2 years ago

Why does Solang require mappings in memory?

Two reaons:

You might want to use temporary hashmap in solidity, and declare a mapping as a local variable.
Solana uses a blob of memory as account storage (up-to 10MiB). So Solana has the same problem for contract variables.

xlc commented 2 years ago

If the in memory hash map is small, the performance difference should be tolerable .

seanyoung commented 2 years ago

If the in memory hash map is small, the performance difference should be tolerable .

On the other hand, an adversary could try to create long collision chains as a denial-of-service attack. With 10MiB space this could be significant.

xlc commented 2 years ago

To the smart contract platform, all those execution time will be metered and charges gas accordingly so no issues here. To smart contract developers, it is their responsibility to ensure their own contract cannot be manipulated to create big hash map in memory.

seanyoung commented 2 years ago

@xlc many contracts like uniswapv2 allow hashmap entries to be created, where the key is controlled by a potential adversary. For example here: https://github.com/Uniswap/v2-core/blob/master/contracts/UniswapV2Factory.sol#L10

On Solana keys can be selected which would make the hashmap into a linked list.

Saying "To smart contract developers, it is their responsibility to ensure their own contract cannot be manipulated to create big hash map" is not a solution

xlc commented 2 years ago

How about using other map implementation instead of HashMap?

gorgos commented 2 years ago

@seanyoung Your referenced contract is a state (not a memory) hash map (as all mappings in Solidity for Ethereum are) and so it shouldn't be a problem, right?

seanyoung commented 2 years ago

@seanyoung Your referenced contract is a state (not a memory) hash map (as all mappings in Solidity for Ethereum are) and so it shouldn't be a problem, right?

On Solana state is stored in a single contiguous piece of memory; this is implemented so that it can be much faster than Ethereum (for myriad of reasons).

So for Solana we have a problem.

seanyoung commented 2 years ago

How about using other map implementation instead of HashMap?

Yes, we can use a BTreeMap instead. This is not as fast though.

gorgos commented 2 years ago

If it's only Solana, then my original idea might be worth considering and talking to Solana devs for support: https://github.com/hyperledger-labs/solang/issues/581#issuecomment-981054011

seanyoung commented 2 years ago

If it's only Solana, then my original idea might be worth considering and talking to Solana devs for support: #581 (comment)

What you're essentially proposing is a syscall on solana which functions like the key-value store on Ethereum. This would require a lot of infrastructure and also would come with quite some overhead. Simply using a BTreeMap would be better.

hyperledger-solang / solang

Using SipHash with private node key #581