gnolang / gno

Gno: An interpreted, stack-based Go virtual machine to build succinct and composable apps + Gno.land: a blockchain for timeless code and fair open-source.
https://gno.land/
Other
892 stars 371 forks source link

Passing a pointer to another realm that assigns it allows to steal objects ownership #974

Closed n0izn0iz closed 3 months ago

n0izn0iz commented 1 year ago

Passing a pointer to another realm that assigns it allows to steal objects ownership

Description

If you pass a pointer to an object x in realm A to a method of a realm B that stores this pointer, the ownership is set to B

Your environment

Steps to reproduce

package steal_ownership

var ptr *uint32

func Steal(xptr *uint32) {
    ptr = xptr
}
package mis_ownership

import "gno.land/r/demo/bugs/steal_ownership"

var (
    x   = uint32(42)
    ptr = &x
)

func init() {
    steal_ownership.Steal(ptr)
}

func MutateDeref() {
    *ptr = 21
}

func MutatePtr() {
    y := uint32(21)
    ptr = &y
}

func MutateValue() {
    x = 21
}

Expected behaviour

Mutations run correctly

Actual behaviour

Mutations error out

--= Error =--
Data: cannot modify external-realm or non-realm object: object pkg path: gno.land/r/demo/bugs/steal_ownership, realm: gno.land/r/demo/bugs/mis_ownership

Additional info

You can find the full code here https://github.com/TERITORI/gno/pull/4

I discovered it while passing interfaces in this way so this is not limited to raw pointers

I edited the gnovm code to print realm package path in the error, you can see the code here https://github.com/TERITORI/gno/pull/4/files#diff-7041eca77b078f22ee334239995c403673f040b3726bcffa5a8a053b68a951a7

thehowl commented 1 year ago

I'm curious to know if this happens also when moving around slices. They are actually "mutable" data types, so they probably suffer the same problem for the underlying pointer...

n0izn0iz commented 1 year ago

huh, there is something weird, my test on values and pointer now fail at the steal_ownership.Steal(ptr), maybe something was merged that changed the behavior or I didn't check where the error came from properly

but I added the same test for slices and it fails on slice element mutation after successful steal I updated the test in #1074 to reflect my findings

I double checked in https://github.com/gnolang/gno/pull/1072 and assigning foo20 to the avl tree in grc20_registry is allowed and prevents further usage of foo20, maybe it's because foo20 contains slices, I'm not sure

Also, the way cannot modify external-realm or non-realm object is handled is annoying, it can't be recovered in tests so I can't make all cases run at the same time and have to comment stuff or run individual tests to find out which line triggered the panic

piux2 commented 1 year ago

I believe that both #974 and #1072 reflect intended behavior, these are two different use cases.

n0izn0iz commented 1 year ago

these are two different use cases.

No, both do the same thing, A passes a pointer to it's state to B, when B stores this pointer, A then cannot modify it's own state (because it loose ownership of the state) Where respectively, A is foo20 or mis_ownership and B is grc20_registry or steal_ownership

It happens on interfaces and slices for sure. For raw pointers I thought it was the case but I couldn't reproduce afterwards so maybe not

The grc20_registry version is just more complicated, because the storing is done by an avl, but it's the same thing at the core, I opened grc20_registry only to show the pattern that was affected by this bug.

Just to reassess because I'm not sure we understand each other:

On the other hand, Realm A should allow Realm B to modify A's variable state through A's own defined methods. This holds true even if the method is attached to a structure of an interface, which is then passed to Realm B (as seen in https://github.com/gnolang/gno/pull/1072). In this scenario, no unintended side effects arise, as A retains full control within its own methods.

I understand that this is the intended behavior

This is not true currently, this is the bug I'm trying to describe:

(zomg, didn't mean to close the issue -_-)

petar-dambovaliev commented 7 months ago

@deelawn do you have some local work on this? I've talked with @zivkovicmilos and it seems we won't ever need to keep this as a possible feature, the transfer of ownership. That means we can end execution/panic, if that happens.

deelawn commented 7 months ago

I think the issue is that we shouldn't be transferring ownership in the first place. I think there is a deeper issue here. In this case, ownership was transferred inadvertently when it shouldn't have been. mis_ownership should have remained the owner of the underlying data with steal_ownership simply having a pointer to that data. I was debugging this this week and found something interesting -- the block that is the parent object of the pointer saved as part of the steal_ownership realm still has a realm path of the mis_ownership realm. Something strange is happening here.

Then there is the question of how pointer references are maintained by a realm that doesn't have ownership of the data its pointer references. What is the expected behavior if the owner realm (mis_ownership) no longer maintains any references to this data? The ref count would still be one because of the reference from steal_ownership, but should it be? If we get to the point where realms pay rent for the amount of data stored, then a realm might be stuck paying for data it no longer references itself -- it's only referenced by other realms.

Given that, I'm not sure we should disable realms from saving pointer variables that reference data in other realms; this could be very convenient versus having to retrieve a large object directly. I can see two obvious options:

  1. make it abundantly clear in documentation that realms that pass pointers to their own data may incur the cost of never being able to "free" that storage of the realm they pass the pointer to persists the address
  2. modify realm storage so that when a realm no longer maintains any references to data it owns, other realm pointers that reference this data get set to nil

Those are just a few observations and ideas we can discuss before we make a decision.

piux2 commented 6 months ago

We need to specs out the behaviors too

From: /r/mytoken /p/grc20 /r/foo
Import package Y Y
Call exported method Y Y
Modify exported package variable Y N?
Modify package variable through exported function Y Y?
From: /p/grc20 /r/mytoken /p/bar
Import package N Y
Call export method N Y
Modify export package variable directly N ?
Modify package variable through export method N ?
deelawn commented 6 months ago

I think I was wrong in my earlier comment and maybe we have been misunderstanding the issue. This only appears to be an issue during realm initialization. I think this is because the pointer that is passed to the steal realm has not yet been persisted in the originating realm. So when it exits the steal realm, it persists a pointer that has not yet been persisted. Perhaps this is just an edge case we must address; not something larger. For example, the following txtar test passes with no issues because the steal function is not called during initialization.

loadpkg gno.land/r/steal_ownership $WORK/steal
loadpkg gno.land/r/mis_ownership $WORK/mis

gnoland start

gnokey maketx call -pkgpath gno.land/r/mis_ownership -func Steal -gas-fee 1000000ugnot -gas-wanted 2000000 -broadcast -chainid=tendermint_test test1
stdout OK!

gnokey maketx call -pkgpath gno.land/r/mis_ownership -func MutateDeref -gas-fee 1000000ugnot -gas-wanted 2000000 -broadcast -chainid=tendermint_test test1
stdout OK!

-- steal/steal.gno --
package steal_ownership

var ptr *uint32

func Steal(xptr *uint32) {
    ptr = xptr
}

-- mis/mis.gno --
package mis_ownership

import "gno.land/r/steal_ownership"

var (
    x   = uint32(42)
    ptr = &x
)

/*
func init() {
    steal_ownership.Steal(ptr)
}
*/

func Steal() {
    steal_ownership.Steal(ptr)
}

func MutateDeref() {
    *ptr = 21
}

func MutatePtr() {
    y := uint32(21)
    ptr = &y
}

func MutateValue() {
    x = 21
}

Edit: fixing what is described above is not the definitive solution. I edited the original realm's Steal function and it results in a VM panic of should not happen.

func Steal() {
    n := uint32(10)
    ptr = &n
    steal_ownership.Steal(ptr)
}

Or maybe it is right to fail in this case. If so, we can make it return a more detailed error message. I will dig a bit deeper on this.

jaekwon commented 6 months ago

Looking at the init function... that init function which calls "steal_ownership.Steal(ptr)" must fail because ptr refers to the slot of x = uint32(42) of the mis_ownership realm (or should anyways), but the Steal() call makes the steal_ownership realm point to it too, and there is no clear ownership.

before init:

mis_ownership:
  x: uint32(42) // a slot in the mis_ownership realm.
  ptr: &x
steal_ownership:
  Ptr: nil

after init:

mis_ownership:
  x: uint32(42) // a slot in the mis_ownership realm.
  ptr: &x
steal_ownership:
  Ptr: &misownership.x // now two realms are pointing to the same slot. 

desired invariant: after every transaction (and also after init() functions to initialize a realm), there must not be any pointers that cross the realm boundary; every value/slot has a clear owner realm, and references cannot pass the realm boundary, except possibly temporarily during transactions.


still looking...

deelawn commented 3 months ago

Just confirmed this was fixed by #2255