ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.16k stars 5.75k forks source link

Support for immutable in function parameters #12303

Closed Nielsbishere closed 2 years ago

Nielsbishere commented 2 years ago

Abstract

Consider something like EIP-2535 (Diamonds) that allows for swapping out functions for upgrades. This introduces some trust required from the user to make sure functions stay the same as they were before (or at least don't turn malicious). Without immutable as a variable modifier that's hard because every storage or memory variable passed can be modified by it; regardless of if it's needed. I think reducing this possible attack surface could be good for upgradable contracts.

If I'm missing a keyword or solution that does this already, I'd be glad to hear about it.

Motivation

function mySwappableCheckFunction(System storage system, uint256 userId) external returns(bool) {
  //Do good stuff, we're only reading system here, not modifying
  return ...;
}

Let's consider the example above, but assume that somehow the owner intentionally or unintentionally swapped out this function for a malicious one;

function mySwappableCheckFunction(System storage system, uint256 userId) external returns(bool) {
   //Do evil stuff, We can do anything we want to system; all members are exposed
   return ...;
}

This would mean that even though mySwappableCheckFunction only reads from system, it will still maintain the access to it. Meaning that intentionally or not someone can write to it. Marking this as immutable would solve the problem while also adding the nice syntax sugar of not allowing modification to something that you indicated shouldn't be modified.

Solution:

function mySwappableCheckFunction(System immutable storage system, uint256 userId) external returns(bool) {
   //My evil function is now limited, it can only return bad results; 
   //We can't modify system here, so our exploit surface is limited
   return ...;
}

Specification

// SPDX-License-Identifier: MIT
pragma solidity >=0.8.9;

contract Test {

    function test(uint256 immutable myTest) external returns(bool) {
        myTest = 1;
        return myTest == 0;
    }

}

Should not compile, while not assigning to myTest should. I'm not sure about the EVM if it should block this access at bytecode level, but I think having support for this at Solidity level is already very nice (since the user can verify generated bytecode with the open source version to see if it has been altered).

Backwards Compatibility

N/A. Older versions would just remain as they are

cameel commented 2 years ago

Related issue: #715.

I agree that some kind of const mechanism for parameters/variables is something we should really have in the language. The devil lies in the details :)

Some general remarks:

Overall, this issue is very underspecified. There are a lot of important questions that need to be answered to make it implementable:

  1. Can you pass a mutable variable into an immutable parameter? Or the other way around?
    • If you can't then what about built-in functions? Should they ignore mutability?
    • Should it depend on whether the parameter is a copy (i.e. will be ABI-encoded or is a value type) or a reference? Note that the same parameter of the same function can sometimes be both, depending on how the function is called (i.e. public function called internally vs externally).
  2. Should there be a mechanism to convert an immutable value back to a mutable one? Something like const_cast<> in C++?
  3. For memory/storage references is the reference itself immutable or only the value it points at? Or both?
  4. Is the parameter always immutable as a whole or can you have e.g. an array of immutable structs? Can a mapping have a mutable key and an immutable value?
  5. Can you override a virtual function having a mutable parameter with one that has an immutable one? Similar to how you can for example override a view function with pure one.
  6. Can you overload functions on immutability? I.e. can you define both foo(uint) and foo(uint immutable) and have the compiler choose the right one?
    • If you can, immutability will have to be encoded in function signature so it will affect the ABI and interfaces.
  7. Can you do using ... for uint immutable to get a library function attached only to immutable values?
  8. Is a function with an immutable parameter implicitly convertible to a function type with a mutable one? Or the other way around?
  9. Can modifiers have immutable parameters?
  10. Can return variables be immutable?
  11. Can you have immutable local variables? Can you assign immutable parameters to mutable local variables?
    • If you want have them, we'll need to add support for dynamic array (#11879) and struct (#4037) literals first. Otherwise you won't be able to initialize such variables.
  12. Can you have immutable state variables (in your sense, i.e. not modifiable but not embedded in the bytecode)? Or can you assign immutable parameters to mutable state variables?
  13. Can errors and events have immutable parameters? Should mutability be ignored when passing values to errors/events?
  14. Should we disallow mutable calldata parameters since calldata is inherently immutable? Or should they implicitly get the same semantics even without the keyword?
Nielsbishere commented 2 years ago

@cameel interesting points, you're absolutely right about sstore/mstore. In OSes this is generally solved by something like VirtualProtect, which allows specifying ranges as read/write. Though something like that would require EVM changes that are probably too big of a change, but I don't see a good alternative (something like PushProtectRange and PopProtectRange). I thought about a "noassembly" function modifier that'd disallow inline assembly calls to prevent people from mstore/sstore, but that won't prevent people from doing it anyways and pretending it's generated with the noassembly function modifier. When resolving the function; params constness should be factored in, to make sure that if it's swapped someone can't remove them (so it's part of the function signature). I'd say that this applies to memory, storage and calldata (always const). Allowing this on ints/uints might be neat but personally I don't see the use for this.

  1. I'd say you can pass non-const and const variables to a const parameter, but you can't remove const-ness. For backwards compatibility this would mean that functions taking memory or storage would be non const, thus not invokable with const params.
  2. I think that kind of defeats the point of this as a security feature. In C++ it mostly exists as a way to save on code afaik.
  3. I think it should apply to the memory/storage and be prefixed (so uint256[] const memory myVal). For uints/ints it's nice syntax sugar but imo it doesn't add too much, since they're copied anyways. If this was added anyways, it'd probably look like uint256[] const memory const myVal if both the memory and the variable itself are const.
  4. Is definitely something to discuss about further. In one of my projects I want function extensibility but don't want the security hole of diamonds. My solution for this would be to pass a const version and two other params from the same struct that can be modified. However I think this could be solved in a better way, though I'm not completely sure how.
  5. Since I think constness would be part of the function signature, I think inheritance should implement it the same way. So function test(string const memory a) should be overriden like that and const can't be added or removed.
  6. Since it'd be part of the function signature, it technically could but I'd advise against it, since it could be error prone.
  7. Sounds good.
  8. Non const could be implicitly casted to const imo.
  9. Modifiers can have const as memory or storage specifier.
  10. Only on memory/storage variables that are returned.
  11. I think that'd be good and I think those two issues are important anyways. I've run into the problem that I can't C-style initialize my dynamic arrays multiple times (so much more practical).
  12. I think since there is already a keyword for that, there's no use in specifying const on state/member variables. Only locals, params and returns.
  13. Can errors/events have memory/storage params passed to it? Regardless I'd say constness would be ignored, since events are always constant and don't risk modification to the state.
  14. I'd say it could be good practice to specify const to calldata, since it indicates to new developers that it can't be modified. Calldata does indicate that, but it might not be obvious from the start.
cameel commented 2 years ago

Thanks for a detailed answer. This gives a much better picture of what you're thinking of. I was hoping it was more about a general concept of constness of parameters but looks like do mostly care about protecting specific areas of memory/storage.

I have more to say on specific points but before we get deep into that maybe let's talk about the most important point - security - because I really don't think this can ever work as a security feature without support at EVM level. Can you give a more detailed scenario where this would protect your contract? It seems to me that you're thinking about it in the context of a proxy that makes a delegatecall to a contract that it doesn't fully trust? In such a scenario the bottom line is that any syntax-level protection can be defeated because the contract you delegate to does not have to be written in Solidity at all. It can be written in another language or even hand-crafted in assembly/Yul giving full access to the delegated storage.

Also, even if EVM had something like PushProtectRange, protecting specific ranges is very problematic due to how the compiler allocates storage. Value types are easy because they're stored contiguously starting at zero. Problem starts when you add reference types. A dynamically-sized array only has one slot in the contiguous range (length is stored there) and the actual data is at a "random" storage location (computed by hashing the position of the length slot). A dynamic array of dynamic arrays will have every sub-array at a different "random" storage location. And, worst of all, a mapping not only spreads its values all over storage but also does not store the keys so the compiler cannot even tell where they are. Trying to protect a deeply nested dynamic array would be expensive and in case of mapping it would be simply impossible.

So really, I only see any const feature as a good practice that lets you detect honest bugs but not as something that will let you protect your storage/memory against code you do not trust.

Nielsbishere commented 2 years ago

@cameel That's a shame. Basically the idea would be that this would give more of a guarantee to users that functions can't randomly be swapped out to something that can for example steal their funds; providing a better trust between devs and users. If for example only 2 out of all variables have to be modified, then exposing the entire storage struct as readwrite could mean that if the dev was either hacked or willingly wanted to steal users funds, they could. Or if multiple structs are passed and only one should be modified. But I understand that it's not really possible, perhaps this should be fixed at the contract implementation instead. For example by having a version id and letting the user specify a version id that resolves the function and not allowing functions by version id to be changed, only new ones to be added.

mudgen commented 2 years ago

@Nielsbishere

There are ways to minimize security issues or trust with upgradeable contracts. Here are some ways:

  1. Make upgrades only possible with on-chain votes by users or governance token holders. DerivaDEX does this. https://medium.com/derivadex/the-diamond-standard-a-new-paradigm-for-upgradeability-569121a08954
  2. It is possible to use a multisignature contract to do upgrades. This makes approvals from multiple people needed for an upgrade to occur.
  3. Make all upgrades transparent and recorded. The diamond helps this by emitting an event about every upgrade.
  4. It is possible to remove the ability to upgrade a diamond by removing the diamondCut function. So the ability to upgrade can be removed after some time has passed.
  5. Upgrades can be time-locked. Meaning an upgrade can be submitted to a diamond but only go into effect after some set time has passed. This give people time to decide whether they still want to trust and use the contract/diamond.
  6. You can make it so that upgrades are only possible after certain intervals of time. For example a diamond can only be upgraded during 1 day every six months.
mudgen commented 2 years ago

@Nielsbishere

perhaps this should be fixed at the contract implementation instead. For example by having a version id and letting the user specify a version id that resolves the function and not allowing functions by version id to be changed, only new ones to be added.

This does not solve the problem. If new functions can be added then new functions can be added that can manipulate any of the state in the contract. For example a new function can be added that steals all users` funds.

Nielsbishere commented 2 years ago

@mudgen

  1. Governance through tokens is a debated topic though, because wealthy investors can just buy everything and rule it. But I don't see a good way to do this without requiring KYC (and even that could be manipulated).

The rest sounds good and you're right about the adding new functions thing. It's a bit complicated to wrap my head around making it in a way that doesn't expose everything to new functions. Can't I copy something from storage to memory and then pass it to a function if it doesn't need to read it? Or is copying heavy or impossible.

Because otherwise I could make it that it's not invoked through the fallback function but two functions that invoke an address you've added. One is for modifying critical data and the other for only reading it and modifying some other data that isn't as sensitive.

mudgen commented 2 years ago

@Nielsbishere

Can't I copy something from storage to memory and then pass it to a function if it doesn't need to read it? Or is copying heavy or impossible.

I'm not really understanding what the point of this is. Can you clarify?


Because otherwise I could make it that it's not invoked through the fallback function but two functions that invoke an address you've added. One is for modifying critical data and the other for only reading it and modifying some other data that isn't as sensitive.

With that you made me think of a way to have a diamond with immutable functions that read/write state that can't be messed with by upgradeable functions or new functions. This is how:

  1. Add immutable functions directly to the diamond proxy contract. This is so that it is not possible to remove or replace these functions. The EIP-2535 standard allows doing this. But remember to emit these in the DiamondCut event and include them in what the Loupe functions return.
  2. Create and deploy a second contract which is where immutable function state variables are stored. Add state variables and setter functions and getter functions to this contract and add authentication to the setter functions so that only the immutable functions from the diamond proxy can call them. Anyone or any contract can call the getter functions.

That's it. Pretty simple.

With this strategy it is possible for a diamond to have immutable functions with state variables that can only be modified by the immutable functions, and at the same time have upgradeable functions and have the ability to add new functions.

Nielsbishere commented 2 years ago

@mudgen the point is basically not passing a storage struct but instead a memory struct, so if modifications are done it won't affect anything on chain.

That sounds pretty good, I'll look into it.

mudgen commented 2 years ago

@Nielsbishere Okay, thanks.

mudgen commented 2 years ago

Governance through tokens is a debated topic though, because wealthy investors can just buy everything and rule it. But I don't see a good way to do this without requiring KYC (and even that could be manipulated).

That's a good point. Thanks.

cameel commented 2 years ago

As I stated above, it's not really possible for us to implement the feature with desired properties so I'm going to close this. Feel free to further discuss the idea (and other ideas on how to extent the language) on the forum.