What special features make sense for a smart-contract oriented VM ?

mvertes commented 1 year ago

In relation with the prototype byte-code VM showcased in gnolang/parscan#1, the goal of this issue is to get answers to the following question:

What fundamental problems of smart-contract platforms would benefit (or not) from being tackled at the VM level?

Or, to be more specific, what features should a VM have in the context of a distributed smart-contract platform:

checkpoint / restart (pausing / resuming from snapshot) ?
persistence (of what, and how is it defined exactly) ?
serialization of data? code?
code debugging means ?
code integrity checking?
merkleization (of what exactly ?)
backward/forward replay (i.e for speculative execution ahead of consensus) ?
parallelism / concurrency constraints ?
resource accounting (memory storage, instruction counting, gas cost)
deterministic execution?
other?

thehowl commented 1 year ago

So, let me try to answer some of these:

checkpoint / restart (pausing / resuming from snapshot) ?

persistence (of what, and how is it defined exactly) ?

Right now we have persistence of global variables of realms (currently defined as packages whose import path starts with "gno.land/r/", though this is subject to change). I think this is also connected to the first question about checkpointing: we need the ability to save global variable state and to use the VM to invoke specific registered calls (this is a prerequisite for smart contract calls through RPC, i.e. gnokey); however, we don't need the ability to stop execution after any given statement.
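
To make the "save globals, run one call to completion, save again" model concrete, here is a minimal sketch in plain Go. It is not gnovm code: `realmState`, `store`, `call`, `increment` and the path "gno.land/r/example" are hypothetical names chosen for illustration, and JSON stands in for whatever encoding the node really uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// realmState stands in for the global variables of one realm.
// Hypothetical: a real VM stores typed values, not a map of ints.
type realmState map[string]int

// store stands in for the node's key-value store, keyed by realm path.
type store map[string][]byte

// call loads the realm's globals, runs one registered function to
// completion, then saves the globals back. Execution is never paused
// in the middle of fn: the only checkpoints are between calls.
func call(s store, realm string, fn func(realmState)) error {
	state := realmState{}
	if raw, ok := s[realm]; ok {
		if err := json.Unmarshal(raw, &state); err != nil {
			return err
		}
	}
	fn(state)
	raw, err := json.Marshal(state)
	if err != nil {
		return err
	}
	s[realm] = raw
	return nil
}

// increment is a hypothetical registered function of the realm.
func increment(st realmState) { st["counter"]++ }

func main() {
	s := store{}
	for i := 0; i < 3; i++ {
		if err := call(s, "gno.land/r/example", increment); err != nil {
			panic(err)
		}
	}
	fmt.Println(string(s["gno.land/r/example"]))
}
```

The point of the sketch is that the checkpoint granularity is a whole call, which is much cheaper to support than snapshotting the VM after an arbitrary statement.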

parallelism / concurrency constraints ?

So we are currently not supporting goroutines or channels. Trying to use goroutines or channels will simply lead to a panic right now.

Multi-threading leads to non-deterministic behaviour, which is why, if we were to implement something like this in Gno, it would probably have to be coroutines, with very specific and deterministic scheduling behaviour. On this, I think we should follow the coroutines implementation/proposal for Go and possibly implement it in our standard libraries; though that's probably a matter for future discussion.

But actually, I think this topic is another Pandora's box that we need to tackle as an issue of its own; specifically, trying to answer what would be the use-cases for a goroutine "escaping" execution of a smart contract and what it could potentially allow on the blockchain.

serialization of data? code?

merkleization (of what exactly ?)

Relating also to the persistence question above, and setting aside the need to serialize channels, most data types can be serialized quite easily. Specifically, in the gnovm we serialize data using Amino marshaling of the internal representations of the values (see gnovm/pkg/gnolang/values.go).

Of course, function values are something we also need to serialize, and that is a bit harder: as you rightly pointed out, this is about serializing code. In gnolang/op_call.go, we call GetBodyFromSource; so ultimately the gnovm doesn't serialize the statements themselves, but rather just where to get them back from if we need them again.

A VM implementation also needs to correctly handle pointer values and allow, for instance, for circular data structures.
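
One common way to handle pointers and cycles is to replace each pointer with the ID of the object it targets before writing anything out. The sketch below shows that idea in plain Go; it is an assumption about how this could be done, not a description of the gnovm's actual object model, and `node`, `record`, `flatten` and `rebuild` are hypothetical names.

```go
package main

import "fmt"

// node is a hypothetical heap object that may point to another node.
type node struct {
	Name string
	Next *node
}

// record is the flat, serializable form of a node: the pointer is
// replaced by the index of the target record (-1 means nil).
type record struct {
	Name string
	Next int
}

// flatten walks the chain from root and gives each distinct node an
// index the first time it is seen, so a cycle terminates the walk.
func flatten(root *node) []record {
	ids := map[*node]int{}
	var order []*node
	for n := root; n != nil; n = n.Next {
		if _, seen := ids[n]; seen {
			break
		}
		ids[n] = len(order)
		order = append(order, n)
	}
	recs := make([]record, len(order))
	for i, n := range order {
		next := -1
		if n.Next != nil {
			next = ids[n.Next]
		}
		recs[i] = record{Name: n.Name, Next: next}
	}
	return recs
}

// rebuild restores the object graph, including cycles, from records.
func rebuild(recs []record) *node {
	if len(recs) == 0 {
		return nil
	}
	nodes := make([]*node, len(recs))
	for i, r := range recs {
		nodes[i] = &node{Name: r.Name}
	}
	for i, r := range recs {
		if r.Next >= 0 {
			nodes[i].Next = nodes[r.Next]
		}
	}
	return nodes[0]
}

func main() {
	a := &node{Name: "a"}
	b := &node{Name: "b", Next: a}
	a.Next = b // a -> b -> a
	recs := flatten(a)
	fmt.Println(recs)
	root := rebuild(recs)
	fmt.Println(root.Next.Next == root)
}
```

Indices are assigned in traversal order rather than from memory addresses, so the serialized form is the same on every node.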

I've done some digging on the inner workings of the VM for my native bindings work; but there's much I still don't know about merkleization, for instance. As far as I can understand, merkleization doesn't happen on the vm level, but rather on the tm2 node level. I'll leave @moul to elaborate on this and whether it's useful to do merkleization work on parscan.

deterministic execution?

Not clear what you're asking here; the short answer is yes.

In essence, since we are in a blockchain context with potentially thousands of validators, each making sure that all transactions (individual VM executions) run correctly, determinism is of utmost importance. This is why we cannot have things such as "real" cryptographic randomness, fs access, network access, and so on.

I think this also leaves us with many opportunities for optimization, as we are much closer to being a functional language than Go can be. Many functions can be memoized or lazily evaluated if static analysis deems it worthwhile.

In terms of language features, I don't think this changes much from Go, though, as I think most of the changes will have to be in the standard libraries. Language-wise, I think there are only a few things we need for deterministic execution:

(u)int have a fixed, machine-independent size; I would suggest 64 bits (but they are not aliases for (u)int64).
unsafe.Pointer and uintptr cannot be implemented, unless we want to attempt giving access to virtual memory whilst enforcing the restrictions we need for the language (i.e. code can't modify the unexported global variables of another package/realm).
map ranges are deterministic (the order could still be pseudo-random, but based on a fixed, spec-defined algorithm and a seed possibly originating from the block time or number).
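
As an illustration of the map-range point, the sketch below shows the simplest deterministic order one could specify: ascending key order. This is a swapped-in choice for the example, not the seeded pseudo-random scheme mentioned above, and `rangeSorted` is a hypothetical helper, not part of any Gno API.

```go
package main

import (
	"fmt"
	"sort"
)

// rangeSorted visits the entries of m in ascending key order, so every
// validator observes the same sequence regardless of how the host Go
// runtime happens to order its map iteration.
func rangeSorted(m map[string]int, visit func(k string, v int)) {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		visit(k, m[k])
	}
}

func main() {
	balances := map[string]int{"carol": 7, "alice": 3, "bob": 5}
	rangeSorted(balances, func(k string, v int) {
		fmt.Println(k, v)
	})
}
```

A VM that owns its map implementation could instead iterate in insertion order or in a seeded permutation; what matters is that the order is fixed by the spec and not by the host runtime.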

resource accounting (memory storage, instruction counting, gas cost)

So yes, we need resource accounting; although I think gas cost is something that we should define at the blockchain level rather than at the language or vm level.

Memory and instruction counting would likely also have to be deterministic if we want to envision a future where we can have multiple implementations of Gno, even on a single blockchain. So what counts as a "cpu cycle", or how much memory is allocated by each type of value, is something we need to answer at the spec level rather than at the implementation level.

While writing this I'm realizing that if we are to develop multiple VMs that can run gno code, and we want to see a future where they interoperate, the time may have come to think about writing a formal language specification, probably forking Go's. This is also because many of your questions here arise from the fact that we currently lack one, and the answers are scattered around the gno monorepo issues and PRs.
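
A spec-level cost table could look roughly like the sketch below. All names and numbers here (`meter`, `opCost`, the costs for "add", "call" and "store") are made up for illustration; the point is only that two independent implementations charging from the same table abort at exactly the same operation.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errMaxCycles = errors.New("max cycles exceeded")
	errMaxAllocs = errors.New("max allocation exceeded")
)

// opCost is a hypothetical spec-defined table: every implementation
// must charge exactly these amounts, whatever its real CPU cost is.
var opCost = map[string]int64{"add": 1, "store": 5, "call": 10}

// meter tracks deterministic cycle and allocation counters.
type meter struct {
	cycles, maxCycles int64
	bytes, maxBytes   int64
}

// charge accounts for one operation, refusing it if the limit is hit.
func (m *meter) charge(op string) error {
	c, ok := opCost[op]
	if !ok {
		return fmt.Errorf("unknown op %q", op)
	}
	if m.cycles+c > m.maxCycles {
		return errMaxCycles
	}
	m.cycles += c
	return nil
}

// alloc accounts for n bytes of spec-defined (not host-measured) size.
func (m *meter) alloc(n int64) error {
	if m.bytes+n > m.maxBytes {
		return errMaxAllocs
	}
	m.bytes += n
	return nil
}

func main() {
	m := &meter{maxCycles: 12, maxBytes: 64}
	for _, op := range []string{"call", "add", "store"} {
		if err := m.charge(op); err != nil {
			fmt.Println(op, "->", err)
			break
		}
	}
	fmt.Println("cycles used:", m.cycles)
}
```

Converting cycles and bytes into a gas price can then stay a blockchain-level decision, layered on top of these counters.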

code debugging means ?

I think this is useful but an implementation feature rather than a necessary feature?

I would love it if we had interactive ways of debugging gno code, for instance breakpoints and step-by-step execution. I don't think, however, that this brings intrinsic value to the blockchain context specifically; it is simply something useful to have in a programming language.

code integrity checking?

Can you elaborate?

backward/forward replay (i.e for speculative execution ahead of consensus) ?

This would be cool and useful, I think, although I also think it comes as a simple consequence of state saving/persistence of global variables. Currently, transactions can abort because they (a) panic without a defer-recover, (b) run out of gas, or (c) go beyond the max cycles or allocs allowed. Additionally, all non-transactional executions of code (vm/qeval, i.e. what powers the Render() function and, as a consequence, the gno website) work as if implicitly doing a rollback at the end of executing a function, so even if they modify realm state, the change is not and cannot be persisted.

So yes, I think forward replay is what we're implicitly doing on the gnoland nodes with the vm/qeval ABCI call. Backward replay would just mean rolling back, even temporarily, to a previous block on the blockchain. If implemented, both would mostly follow from being able to control the state data externally before execution, and from the blockchain node allowing it.
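
The "implicit rollback" behaviour can be sketched as an overlay that buffers writes and only applies them on commit. This is a hypothetical illustration in plain Go (`state`, `overlay` and `run` are invented names), not how the gnoland node is actually implemented.

```go
package main

import "fmt"

// state stands in for persisted realm state.
type state map[string]int

// overlay buffers writes on top of a base that it never mutates
// until commit is called.
type overlay struct {
	base  state
	dirty state
}

func (o *overlay) get(k string) int {
	if v, ok := o.dirty[k]; ok {
		return v
	}
	return o.base[k]
}

func (o *overlay) set(k string, v int) { o.dirty[k] = v }

// commit applies the buffered writes to the base state.
func (o *overlay) commit() {
	for k, v := range o.dirty {
		o.base[k] = v
	}
}

// run executes fn against an overlay. Writes reach base only when
// persist is true (a transaction) and fn does not panic; a query
// (persist == false) or an aborted call leaves base untouched.
func run(base state, persist bool, fn func(*overlay)) (err error) {
	o := &overlay{base: base, dirty: state{}}
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("aborted: %v", r)
		}
	}()
	fn(o)
	if persist {
		o.commit()
	}
	return nil
}

func main() {
	base := state{"n": 1}
	inc := func(o *overlay) { o.set("n", o.get("n")+1) }

	_ = run(base, false, inc) // query: speculative, discarded
	fmt.Println(base["n"])
	_ = run(base, true, inc) // transaction: committed
	fmt.Println(base["n"])
}
```

Speculative execution ahead of consensus is then the same mechanism with the commit deferred until the block is final, and backward replay amounts to choosing an older base.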

other?

Nothing off the top of my head, but I'll be sure to write here if I think of anything.

mvertes commented 1 year ago

Thanks for your detailed and helpful answer.

code integrity checking?

Can you elaborate?

I mean ensuring that the produced byte code really conforms to the original source code. See Ken Thompson's famous Turing Award lecture, "Reflections on Trusting Trust". It is not trivial, but must be addressed in some way.

ajnavarro commented 1 year ago

Just a tiny addition to @thehowl's great explanation: we need a way to lazy-load/unload chunks of memory while the program is being executed. A realm can use data structures that might eventually take up GBs of memory, making it impossible to fit them in most nodes' memory. I think it can be done fairly easily.
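
A rough sketch of what lazy loading with a bounded resident set could look like, using a least-recently-used eviction policy. Everything here is an assumption for illustration: `backing`, `lazyStore` and the LRU policy are not taken from the gnovm, and writing modified objects back before eviction is left out.

```go
package main

import (
	"container/list"
	"fmt"
)

// backing stands in for the node's on-disk KV store (hypothetical).
type backing map[string][]byte

type entry struct {
	key  string
	data []byte
}

// lazyStore keeps at most limit objects in memory, loading them on
// first access and evicting the least recently used one when full.
type lazyStore struct {
	disk  backing
	limit int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element holding *entry
	loads int                      // disk reads, for illustration only
}

func newLazyStore(disk backing, limit int) *lazyStore {
	return &lazyStore{
		disk:  disk,
		limit: limit,
		order: list.New(),
		items: map[string]*list.Element{},
	}
}

// get returns the object for key, loading it from disk if needed.
func (s *lazyStore) get(key string) ([]byte, bool) {
	if el, ok := s.items[key]; ok {
		s.order.MoveToFront(el)
		return el.Value.(*entry).data, true
	}
	data, ok := s.disk[key]
	if !ok {
		return nil, false
	}
	s.loads++
	s.items[key] = s.order.PushFront(&entry{key: key, data: data})
	if s.order.Len() > s.limit {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(*entry).key)
	}
	return data, true
}

func main() {
	disk := backing{"a": []byte("A"), "b": []byte("B"), "c": []byte("C")}
	s := newLazyStore(disk, 2)
	for _, k := range []string{"a", "b", "a", "c", "b"} {
		s.get(k)
	}
	fmt.Println("disk loads:", s.loads, "resident:", len(s.items))
}
```

Since eviction only affects what is resident in memory and not the values the program observes, the policy can differ between nodes without breaking determinism, as long as any gas charged for loads is defined independently of the cache state.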

serialization of data? code?

We need to define how the state is serialized into the chain. I had a look at the serialized data in the KV storage, and I had the impression that the payload could be reduced.

other?

Something that came to mind is versioning. I think we should apply versioning to the source code (to make it possible to update to Gno v2 when needed) and to the state serialization (to allow future improvements or new OPs if needed, as an example).
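
For the state-serialization half of this, a minimal sketch is a one-byte version tag in front of every payload, with the decoder dispatching on it. The two formats below (a fixed 8-byte integer for version 1, a varint for version 2) are invented for the example and do not describe the current encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encode writes v in the hypothetical "version 2" format:
// one version byte followed by an unsigned varint.
func encode(v uint64) []byte {
	buf := make([]byte, 1+binary.MaxVarintLen64)
	buf[0] = 2
	n := binary.PutUvarint(buf[1:], v)
	return buf[:1+n]
}

// decode accepts every known version, so state written by an older
// release stays readable after the encoding is improved.
func decode(b []byte) (uint64, error) {
	if len(b) == 0 {
		return 0, errors.New("empty payload")
	}
	switch b[0] {
	case 1: // legacy: fixed 8-byte big-endian integer
		if len(b) != 9 {
			return 0, errors.New("v1: bad length")
		}
		return binary.BigEndian.Uint64(b[1:]), nil
	case 2: // current: varint, smaller for small values
		v, n := binary.Uvarint(b[1:])
		if n <= 0 || n != len(b)-1 {
			return 0, errors.New("v2: bad varint")
		}
		return v, nil
	default:
		return 0, fmt.Errorf("unsupported state version %d", b[0])
	}
}

func main() {
	legacy := []byte{1, 0, 0, 0, 0, 0, 0, 1, 44} // 300 in the v1 format
	current := encode(300)
	a, _ := decode(legacy)
	b, _ := decode(current)
	fmt.Println(a, b, len(legacy), len(current))
}
```

The same tag also gives a cheap way to shrink the payload later without migrating all existing state at once.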

mvertes commented 1 year ago

Something that came to mind is versioning. I think we should apply versioning to the source code (to make it possible to update to Gno v2 when needed) and to the state serialization (to allow future improvements or new OPs if needed, as an example).

Definitely an important topic to be assessed prior to the launch (even if we decide to do nothing in the short term).

gnolang / hackerspace

What special features make sense for a smart-contract oriented VM ? #17