ethereum / fe

Emerging smart contract language for the Ethereum blockchain.
https://fe-lang.org
Other
1.62k stars 187 forks source link

Require taking self as argument for contract methods that interact with self #96

Open cburgdorf opened 4 years ago

cburgdorf commented 4 years ago

Proposed Change

Currently methods on contracts have implicit access to self as seen in the following snippet.

contract Something:
    x: u256 = 1
    pub def read_x() -> u256:
        return self.x

This issue proposes to explicitly require self as method argument when a method wants access to self:

contract Something:
    x: u256 = 1
    pub def read_x(self) -> u256:
        return self.x

Motivation

One of the core motivations is to make self less magical (where does it come from?, Why is it a special case?). However there are other compelling reasons why such a change would be beneficial.

Self-less methods as class methods

If a method does not interact with self, it can leave out self and becomes a static method of the contract.

contract Something:
    x: u256 = 1
    pub def does_not_read() -> u256:
        return 1 + 1

This is good for readability and auditing as it becomes easy to spot which methods do and do not interact with contract storage. Open Question: Would such a method still be callable as self.does_not_read() or only as Something.does_not_read() ?

Making mutability explicit

We can go one step further and say that self becomes immutable by default. If one wants to have a method that updates a storage variable, one has to explicitly take self with the mut keyword as shown below.

contract Something:
    x: u256 = 1
    pub def update_x(mut self) -> u256:
        self.x = 2

This again is great for readability and auditability. We can compare that to the inverse of view in solidity but I would argue that the nice aspect here is that immutability is the default and one has to add a keyword to make things mutable rather than to add a keyword to declare a method as being view only or pure.

pipermerriam commented 4 years ago

I REALLY like mut self and that entire mechanism for preventing state mutability.

I think it's worth exploring whether this can serve as re-entrancy protection as well. If you went the rust route and were able to enforce there only ever being one mutable handle to self then it at least seems like that provides strong re-entrancy protection since a call frame holding the mutable handle to self would make any re-entrant calls fail as soon as they attempted to modify storage.

cburgdorf commented 4 years ago

I think it's worth exploring whether this can serve as re-entrancy protection as well. If you went the rust route and were able to enforce there only ever being one mutable handle to self then it at least seems like that provides strong re-entrancy protection since a call frame holding the mutable handle to self would make any re-entrant calls fail as soon as they attempted to modify storage.

Yep, in rust a method that takes mut self can not even be called recursively (Because the second call would try get a mutable handle which isn't possible because the first call is still holding it). But if I understand correctly, this is a zero cost compile time guarantee that the compiler can make because it is in total control. In the Ethereum re-entrency case, other potentially unknown code can call into us which is something we can not prevent free of cost at compile time. We could still compile mut self down to something that enforces re-entrency protection, but it wouldn't be a zero cost abstraction. But maybe that is fine. E.g. the compiler can be smart enough to only add re-entrency protection to public methods that need it and leave it out for any other private methods that take mut self (it will still enforce the basic rules e.g. wouldn't be callable recursively).

I personally think that this is a very interesting idea.

g-r-a-n-t commented 4 years ago

Open Question: Would such a method still be callable as self.does_not_read() or only as Something.does_not_read() ?

I think using Self.does_not_read() or self.does_read() makes sense.


Are there instances where reentrance is desirable? If so, how should we allow it in the language (assuming it's disabled by default)?


Something else worth thinking about here is how storage pointers would play in with this. Say for example we have this code:

contract Foo:
  my_array: u256[100]

  def bar(self, mut sto_array: Storage<u256[100]>):
    sto_array[26] = 42

  pub def baz(mut self):
    self.bar(self.my_array)

note: storage/memory modifiers have not been introduced to Fe yet, but it has been discussed.

In the example above, bar is supposed to be a non-state-modifying function, since self is immutable. However, this has been bypassed by passing in a separate parameter referencing some mutable piece of data in storage. Assuming we added support for explicit storage pointers (like Solidity), there would need to be an extra compiler check that makes sure any storage pointers passed into non-state-modifying functions are also immutable.

In general, I'm hesitant to add storage and memory modifiers to Fe for reasons like this. It complicates compiler implementation and makes things more confusing for devs. I think that in Fe, mutating state should always happen in a statement that begins with a mutable reference to self. This makes the behavior of contracts very clear.

For now, I think the simplest thing we can do is require that all parameters be passed by value or memory pointer. So rewriting the initial example:

contract Foo:
  my_array: u256[100]

  def bar(self, mut mem_array: u256[100]):
    mem_array[26] = 42

  pub def baz(self):
    self.bar(self.my_array) # my_array is copied to memory

This way, state is not modified within bar, as you would expect given the immutability of self.

Further down the road, we could introduce more gas-efficient behavior based on mutability. For example, say we have this code:

contract Foo:
  my_array: u256[100]

  def bar(self, some_array: u256[100]) -> u256:
    return some_array[42]

  pub def baz(self):
    my_num: u256 = self.bar(self.my_array) # my_array is passed by storage pointer since its an immutable reference

In cases where a storage value is being passed into some function as an immutable parameter, we wouldn't need to copy it because we know that that storage value will never be modified.

Even in cases where we're passing an immutable storage value into a pure function, we could skip copying it since we know that the function will not modify the value.

pipermerriam commented 4 years ago

@g-r-a-n-t

I have a question about this example:

contract Foo:
  my_array: u256[100]

  def bar(self, mut mem_array: u256[100]):
    mem_array[26] = 42

  pub def baz(self):
    self.bar(self.my_array) # my_array is copied to memory

I would expect the compiler to throw an error on self.bar(self.my_array) because self should be immutable in that context, and thus, I would expect self.my_array to also be immutable, and since the function signature of bar specifies that it expects a mutable array type, that the type system would complain. What does rust do in a case like this?

g-r-a-n-t commented 4 years ago

Ah, yeah good point. This is tricky.

In Rust, you would get the error you've described (assuming you don't have ownership issues).

We could enforce the same rules as Rust, but it is of course unnecessary since there is an implicit copy happening. I suppose we could consider adding some sort of copy method so it's more clear that you're not actually passing a mutable storage value.

contract Foo:
  my_array: u256[100]

  def bar(self, mut mem_array: u256[100]):
    mem_array[26] = 42

  pub def baz(self):
    self.bar(self.my_array.to_mem()) # my_array is copied to memory
g-r-a-n-t commented 4 years ago

Also, I'm thinking in terms of ownership here, so self.my_array.to_mem() would be moved into bar where mutability is set by the parameter declaration.

This on the other hand would fail:

pub def baz(self):
  mem_array: u256[100] = self.my_array.to_mem()
  self.bar(mem_array) # error: bar would mutate an immutable reference
  self.do_something_else(mem_array)
cburgdorf commented 4 years ago

Are there instances where reentrance is desirable?

Good question. I have a hard time thinking of any but it's probably worth reaching out to people that regularly write smart contract code.

As a comparision, in Vyper one would use the @nonreentrant(<unique_key>) decorator to protect against reentrancy.

But, I like the fact that we wouldn't need something like a special decorator for this because it would keep the language simpler and safer after all.

If so, how should we allow it in the language (assuming it's disabled by default)?

One idea would be to introduce something like RefCell which could give more fine grained mutability via something like borrow_mut. But honestly, that's a route of complexity that I'd rather like to avoid.

Another idea would be something like a decorator @allow_reentrancy but I'm still skeptical about any valid use cases. I would like to see a case that would really need a state mutating function to be reentrant and couldn't be rewritten another way.

For now, I think the simplest thing we can do is require that all parameters be passed by value or memory pointer

Yeah, I think I'm on board :+1:

cburgdorf commented 4 years ago

Here's a case from Aragon: https://mobile.twitter.com/izqui9/status/1317826431527886849

cburgdorf commented 3 years ago

Ignoring all the mutability stuff for now but I think we should prioritize making self mandatory for non-static functions sooner than later. The reason for that is that it may proof very useful related to struct support which is coming with #203.

One of the planned features for structs is to allow functions on structs (just like in Rust). Now, functions on structs could be very useful for both as instance methods or static functions as demonstrated with the following example.

struct Rectangle:
    width: u256
    height: u256

# Static method to create a rectangle that happens to be a square    
def square(size: u256) -> Rectangle:
    Rectangle(width=size, height: size)

# An instance method to check if the rectangle can hold another rectangle
def can_hold(self, other: Rectangle) -> bool:
    return self.width > other.width and self.height > other.height

Taking self as the first parameter is what would decide whether something is a static method or an instance method (just like in Rust)

Usage:

big = Rectangle(width=10, height=20)
square = Rectangle::square(5)
big.can_hold(square)
sbillig commented 3 years ago

Basic self parameter support was implemented in #520. I guess I'll leave this open for now because of the other ideas in the discussion thread.