filecoin-project / FIPs

The Filecoin Improvement Proposal repository

Add a VM to Filecoin (EVM, WASM, SES, LLVM, etc) #113

Closed jbenet closed 2 years ago

jbenet commented 3 years ago

Full smart contract capabilities will come to Filecoin; this has been in the plans since the beginning. Many people ask about this, so I'm starting this issue to track the conversation. We should submit proper FIPs to add the capabilities.

jbenet commented 3 years ago

There are many folks looking at this, and I hear many people want to build this before the end of 2021. I would love to see this happen, and would be very supportive of such a FIP.

rjan90 commented 3 years ago

+1. Would love to see full smart contract capabilities in Filecoin.

likhita-8091 commented 3 years ago

+1, but I want to ask: if smart contracts are added, does that mean Filecoin will have the same functions as Ethereum? Isn't there already too much competition? After all, there are many public chains competing with Ethereum, such as Polkadot, NEAR, Binance, Huobi, etc. Can they catch up? And if smart contracts are added, will the economic model of the FIL token also change?

Fatman13 commented 3 years ago

What is the estimated increase in chain size if EVM-like smart contracts are implemented?

deltazxm commented 3 years ago

Start with the EVM. Do it.

expede commented 3 years ago

It sounds like the EVM is the likely choice already, but I've been pointed at this thread so I should probably give my two cents. I previously worked on improving the design of the EVM, and have given the topic of blockchain execution some thought. These are just my opinions, and hope that they’re helpful or at least thought provoking. I haven’t been directly in the Ethereum space the past year-and-a-bit, but from what I can tell, all of the below is still relevant.

One of the biggest takeaways from my time in Ethereum was that it’s impractical to retire a public network’s VM. Even with the eventual switch to ETH 2.0, there’s an expectation that existing contracts will live on. Not doing so breaks the implicit contract with users. Clearly the choice of VM for Filecoin has not been taken lightly so far, but I’d put a cautionary double underline under the fact that the one(s) you choose will be with you for a very long time.

TL;DR Recommendation

Alternately, consider work being done on “enhanced” EVM-compatible VMs, such as IELE (semantics)

Opportunity

Why should someone use Filecoin instead of <insert blockchain here>? There are quite a few chains that provide very similar solutions to each other. I believe that Filecoin has a strong differentiator having started with storage built-in. Most blockchains can be seen as extremely high availability data stores, but where the cost of storage is very high. Filecoin very explicitly builds an open market for data, and can efficiently provide both storage in volume and potentially efficient execution.

Content addressed distributed storage coupled with an open compute platform has a ton of potential beyond what we see today: we can store the output of any function (including partial applications), run adaptive optimization (hotspot) across all SCs globally & collaboratively(!), embarrassingly parallel computation on large datasets, and so on. IPFS bridges to the web and is already deeply embedded in this ecosystem.

Multiple Engines

Multiple engines can be introduced; even the wording used in Filecoin today (actors) suggests a well-isolated approach.

If code from multiple engines can interact, two things happen: you inherit the weaker invariants between the systems, and new unspecified behaviour can emerge from the interaction of the two. Many properties do not compose, and security in particular can break in unexpected ways.

The EVM

It is very clear now that the EVM has become THE smart contracts standard.

Agreed. The EVM leaves quite a lot to be desired (see below), but from a social standpoint it has won. In many cases it’s a required checkbox to say that you have a smart contract platform at all (true story).

The EVM is surprisingly low level for a platform that so heavily handles finance. It’s a difficult balance to strike, so I see how it got here, but it’s also not ideal. It’s possible but very difficult to formally verify EVM bytecode, largely due to the unstructured nature of the ISA (in the “structured programming” sense), and you really do need to verify the bytecode output as a source of truth.

It’s also quite inefficient for common classes of computation due to unusual word size. This isn’t all bad: having 256-bit words actually makes it more efficient than Wasm for a lot of cryptographic applications. I think that this mostly speaks to the advantages of tuning your ISA to the use case. Further, being a (more) special-purpose smart contract focused VM means being able to add instructions for the use cases that this community cares about without having to contend with the very broad use cases that the Wasm WG is (rightly) concerned with.
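To make the word-size point concrete, here is a minimal Python sketch of what a machine with 64-bit words (such as Wasm's `i64`) has to do to add two 256-bit values: four limb additions with explicit carry propagation. A VM with native 256-bit words exposes the same operation as a single wrapping add. The helper names (`to_limbs`, `from_limbs`, `add256_limbs`) are illustrative assumptions, not taken from any EVM or Wasm implementation.

```python
MASK64 = (1 << 64) - 1
MASK256 = (1 << 256) - 1


def to_limbs(x):
    """Split a 256-bit integer into four 64-bit limbs, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(4)]


def from_limbs(limbs):
    """Reassemble four 64-bit limbs into a single integer."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def add256_limbs(a, b):
    """Add two 256-bit values limb by limb, as a 64-bit machine would.

    The final carry is discarded, so the result wraps modulo 2**256.
    """
    out = []
    carry = 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & MASK64)  # keep the low 64 bits in this limb
        carry = s >> 64         # pass the overflow to the next limb
    return out
```

For example, `from_limbs(add256_limbs(to_limbs(a), to_limbs(b)))` should equal `(a + b) & MASK256` for any two 256-bit inputs. Multiplication and modular reduction multiply this bookkeeping further, which is where a wide native word tends to pay off for cryptographic code.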

Compiler Backend for Solidity > EVM Bytecode

Historically there’s a perceptual conflation between Solidity support and the EVM itself in the Ethereum community. There are many languages that target the EVM, but the overwhelming majority of code is in Solidity. When folks say that they want EVM-compatibility, it’s less that they want it to run already compiled EVM bytecode, and more that the platform should be a compile target for Solidity. This is definitely achievable, though some bytecode hacks may not translate. Many of the other EVM-targeted languages have reasonably portable IRs, which can also target a different backend.

Formally Verified VMs

There’s a distinction between formally verifying the VM, and a well specified ISA or IR. We can formally verify brainfuck, but it won’t be very easy to work with or analyze. The big question is “which properties should a blockchain execution engine have?” I agree that Michelson does a good job here, but is also certainly not the only option (DAML, Formality, &c)

Were I designing such a system, these are some of the things I’d focus on (off the top of my head):

Build an IR or higher level ISA, include useful semantics for the domain (storage, deals, ID, auth, &c) and any relevant syscalls.

Emphasize totality. Turing completeness means that code can run forever, which is an explicit anti-goal for public blockchain execution. Ethereum caps the amount of gas per transaction as a form of artificial forced termination. Turing completeness also makes formally verifying a smart contract more difficult. Enforcing total functions for even just most operations (ideally all) means that you can run all kinds of optimizations, strip the runtime cost dynamics from overhead and get much more accurate ahead-of-time gas costs as part of adaptive optimization. This is often achievable without loss of expressiveness in a blockchain context since it's already going to be forced to terminate in a finite number of steps.

Determinism is a given, since this is a public blockchain. Deterministic concurrency would certainly be a nice-to-have.

They’re already called actors; use a message passing model and enforce strict isolation. It's also very natural to most programmers.

Bake-in first class exceptions and well-defined signalling codes (think HTTP status codes or Erlang’s convention atoms).

Get a correct-by-construction spec during the design phase (i.e. no undefined behaviour). Make it easy to write SCs that aren't broken. I personally think that Runtime Verification does beautiful work in this area.

Remember Tesler's Law, and that a dogmatically "simple" ISA means that you're pushing complexity to the developers (see again Turing tarpit).
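Two of the points in this list, forced termination through a gas cap and well-defined signalling codes, can be sketched together. The following is a minimal toy stack machine in Python, written only for illustration: the instruction set, the gas prices, and the `ExitCode` values are all assumptions and do not come from the EVM, Wasm, or any Filecoin specification.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Hypothetical signalling codes, in the spirit of HTTP status codes."""
    OK = 0
    OUT_OF_GAS = 1
    ILLEGAL_INSTRUCTION = 2
    STACK_UNDERFLOW = 3


# Illustrative per-instruction gas prices (assumed values).
GAS_COST = {"push": 1, "add": 1, "jump": 2}


def run(program, gas_limit):
    """Run a list of (op, arg) pairs; return (exit_code, stack, gas_left).

    Every instruction is paid for before it executes, so even a program
    containing an infinite loop halts after a bounded number of steps.
    """
    stack, pc, gas = [], 0, gas_limit
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST.get(op)
        if cost is None:
            return ExitCode.ILLEGAL_INSTRUCTION, stack, gas
        if gas < cost:
            return ExitCode.OUT_OF_GAS, stack, gas
        gas -= cost
        if op == "push":
            stack.append(arg)
            pc += 1
        elif op == "add":
            if len(stack) < 2:
                return ExitCode.STACK_UNDERFLOW, stack, gas
            stack.append(stack.pop() + stack.pop())
            pc += 1
        else:  # "jump": the only source of possible non-termination here
            if not isinstance(arg, int) or not 0 <= arg <= len(program):
                return ExitCode.ILLEGAL_INSTRUCTION, stack, gas
            pc = arg
    return ExitCode.OK, stack, gas
```

With this sketch, `run([("jump", 0)], 9)` stops with `ExitCode.OUT_OF_GAS` instead of looping forever, and every failure mode is reported as a named code rather than an unstructured crash. In a total language the `jump` case would not exist, and the gas needed could be computed before execution rather than discovered during it.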

————

That ended up being longer than I initially planned — I guess I do have opinions 😅 I’m (clearly) happy to share what I’ve learned in this space, and to respond to any of the above 🙏

hannahhoward commented 3 years ago

A good use case to consider in shaping things: https://github.com/filecoin-project/devgrants/pull/253

raulk commented 2 years ago

Hey @expede! Thanks for the thoughtful reflection, your input is very much appreciated! 🙌

I very much agree with what you've identified as the unique value proposition of Filecoin. Concretely the ability to:

ICYMI, check out this presentation on the Dataverse

raulk commented 2 years ago

In the last few weeks, we've had the opportunity to discuss and flesh out potential directions for the Filecoin VM. We also sketched out an implementation plan and roadmap for the work ahead.

I'm working on summarising the outcomes from the discussions. I'll submit a detailed draft of a technical architecture to filecoin-project/fvm-project in the next few days (please shout if you’d like to be tagged for a review!)

In the meantime, I’m outlining some highlights and takeaways on the current thinking to keep the thread going.

Current thinking


cc @expede

raulk commented 2 years ago

With regards to execution roadmap, we're thinking about the following phasing. I'm omitting dates for now, as we're actively prototyping and identifying unknowns that may impact timelines.

Proposed FVM roadmap

This is a tentative roadmap and is subject to heavy change!


Coordination with implementors is needed to align on feasibility and timelines (cc @kaitlin-beegle // Filecoin Foundation).

cc @filecoin-project/lotus-core @filecoin-project/venus @filecoin-project/forest @filecoin-project/cpp-filecoin-team @filecoin-project/actors-committers

expede commented 2 years ago

Hey @raulk, thanks for the CC! I’ve been meaning to tap you and/or @Stebalien for a chat, but you beat me to it! Also thank you for bringing this discussion into the open 🎉

Preamble

I’m outlining some highlights and takeaways on the current thinking to keep the thread going.

At risk of taking up too much space (😅), I’m more than happy to keep the discussion alive! I thought that I was done with the EVM a few years ago, but I'm clearly still pretty passionate about this topic. I've been getting pulled back into EVM discussions in Ethereum again, too... it's the year of the EVM, I guess! "Just when I think I'm out, they pull me back in" 😆

I am, of course, thrilled that this work is moving ahead! Below I’m mostly focusing on my open questions and where I have concerns. I want to highlight that I like most of this plan, but the nature of these discussions leads to focus on where the vision diverges. (I’ve also been very happy with many of the choices in the existing chain)

I feel like I’m missing context on a few of the items in the current plan? If I’m asking questions that have clear reasoning behind them, it may still be useful as feedback for where to expand the text!

Finally, please let me know if at some stage decisions are locked and it would be most productive for me to switch into “how best to execute this plan”.

Vision

check out this presentation on the Dataverse

Agreed, the Dataverse presentation has a great vision, very much aligned with how @bmann and I think about this. I actually think that there’s more untapped potential to build on top of that, but what they presented is a fantastic and very necessary first step 🎉

Operational Semantics

This choice guarantees almost-perfect execution fidelity

Good news! You can make that totally perfect execution parity if you design and verify the formal semantics for the enclosing environment (plus the VM). [This general concept comes up in a few places below]. This is much more difficult to do after-the-fact, so it’s a good time to talk about it!

Wasm vs LLVM

Yup, it makes sense if you want to use a widely used off-the-shelf VM if you’re not going to design a Filecoin-specific VM 👍

Hypervisor

This is the most controversial part for me. This strategy can work, but it’s all complexity trade-offs. I’d be interested in hearing more about the use cases that this solves for. What are you expecting to need to run that Wasm and the EVM can’t?

I understand that this doesn’t mean that there will be more VMs. When I hear “hypervisor”, it suggests that there’s an intention to. If there’s no concrete plans, is this premature generalization? Should this be an implementation detail and not widely advertised?

I absolutely see how a hypervisor strategy sounds appealing: it’s expedient and flexible for future extension. You need something of this shape for Wasm at all, and the EVM for sure. There definitely needs to be a clear interface that your execution engine will interact with to access platform services like accounts, storage, and system actors. It’s not that far a leap from a clean interface to seeing it as an abstraction that many VMs can execute against.

I could be misunderstanding the strategy, but I don’t think that those are the right tradeoffs for a blockchain. They’re a huge target for abuse, and actor bugs often can’t be fixed without a heavy consensus process. Behaviour and security hardening are very important, and extensibility at such a low-level makes that significantly more difficult. Even though the actor execution is itself pure (at least that’s how I’m imagining it), when you want to run an audit, you now need to look at the internals of each actor to ensure that they behave correctly.

Every VM you add introduces its semantics to the environment. Once you’ve introduced a VM, it’s extremely painful to remove it. The EVM itself is a collection of EVMs as it continues to evolve (do you execute under the previous environment or potentially break some subset of actors?). Audits are possible in the environment (you can formally verify anything with enough effort) but are made significantly more complex. Invariants in one VM may not hold in another. Simple off-the-shelf tooling like what you find in the Solidity ecosystem will help you a lot less when you’re calling out to Wasm-based actors, and vice-versa. You need to move up to something like a custom K toolchain that can compose the semantics for your various VMs. These are entirely possible to do, but much more specialized (read “less accessible”).

It’s part of the tradeoff of accepting both bytecodes rather than moving the compatibility higher in the stack (more in the EVM section below).

EVM

EVM-in-Wasm

We're looking at adopting SputnikVM, which has emerged as the de-facto EVM-in-WASM choice in the industry.

Agreed, Sputnik makes sense to me 👍

I’m a bit unclear about what is gained by compiling this to Wasm instead of treating it as a native engine? Not that it’s a problem, but why the emphasis? I’m guessing that you’re not going to be metering the execution of the EVM itself or using the other specialized Wasm-on-Filecoin functionality. Is it to be quirks-compatible across Filecoin implementations (Lotus, Forest, etc)?

There’s nothing lost in specifying Wasm other than some very minor overhead. I’m mostly unsure why the execution environment is called out specially given the hypervisor strategy earlier.

Upstream Strategy

Is the plan to follow Ethereum Mainnet’s changes as their EVM evolves? For a while I’ve thought that it may make sense to start a chain-agnostic EVM standardization group.

Content Addressed Code

WASM bytecode requires no further compilation/transpilation to be executable, and thus it's suitable for secure, content-addressable code

Content addressed code 🚀🚀🚀

Despite my enthusiasm for the concept, this section confused me a bit, given the goal of including multiple VMs. Is the idea that the “primary” environment will be Wasm? If so, then why the hypervisor?

This holds whether or not the VM is implemented in Wasm itself. Since the byte codes are interacting at the same level in the stack, their implementation language doesn’t matter (other than cycle efficiency, which Wasm by itself won’t be the limiting factor). The thing being executed is their respective bytecodes. By analogy: all of the VMs on my laptop run the same x86 instructions underneath, but that doesn’t help me ensure that they’re running correctly.

I could be missing something, though!
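The content-addressing idea quoted above can be illustrated with a small sketch: code is stored and looked up under the hash of its exact bytes, so the address doubles as an integrity check. This is a simplified assumption for illustration only. It uses a plain SHA-256 hex digest as the address, whereas IPFS and Filecoin use CIDs, and the `CodeStore` class is hypothetical.

```python
import hashlib


class CodeStore:
    """Toy content-addressed store: the key is the SHA-256 of the bytecode."""

    def __init__(self):
        self._blobs = {}

    def put(self, bytecode: bytes) -> str:
        """Store bytecode and return its address (hex digest of its bytes)."""
        address = hashlib.sha256(bytecode).hexdigest()
        self._blobs[address] = bytecode
        return address

    def get(self, address: str) -> bytes:
        """Fetch bytecode and verify that it still matches its address."""
        bytecode = self._blobs[address]
        if hashlib.sha256(bytecode).hexdigest() != address:
            raise ValueError("stored bytes do not match their address")
        return bytecode


store = CodeStore()
# The 8-byte Wasm preamble (magic number plus version 1), used as stand-in code.
wasm_module = b"\x00asm\x01\x00\x00\x00"
address = store.put(wasm_module)
```

Storing the same bytes twice yields the same address, and any change to the bytes yields a different one. Note that the scheme is indifferent to what the bytes are: EVM bytecode can be addressed the same way as Wasm, which is consistent with the point above that the property holds regardless of which VM executes the code.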

Compatibility

I’m going to treat the rest of this section as orthogonal to whether or not there’s a hypervisor

EVM compatibility: the proposed plan is to adopt an EVM-in-WASM approach, instead of working off Solidity.

(There’s definite irony in me arguing in favour of what is essentially eWasm, but the context is different from Ethereum Mainnet).

I’m not totally against putting in an EVM, but as mentioned in my earlier post, I think it’s a less straightforward strategy than it appears on the surface. You’re trading off one complexity for another. I think it’s actually less tractable at this layer, but also recognize that there are very real social forces pushing to adopt the EVM directly.

I certainly see why adopting the EVM directly is appealing! You can label Filecoin as having “a real EVM”, and will be certain that it executes consistently with other EVM-based chains — which is a big deal!

However, I think this specific strategy will cause other challenges down the road, and that there are other options that work just as well. I understand that the pull of the EVM is strong in 2021, so the below exploration may not be useful and/or may be retreading ground. I think it’s worth stating here in public for the sake of completeness and so that we can understand the tradeoffs we’re signing up for.

Yul

mitigates risks otherwise present in the Solidity => WASM path.

It’s absolutely one way to make it work consistently 👍 It also introduces other challenges which I don’t see addressed in the list.

The fundamental question is “can we preserve the EVM’s big-step semantics for execution on Filecoin?” I believe the answer to be an unqualified “yes” with or without an actual EVM. It does mean ensuring that your backend respects the correct semantics, but that can be done once.

Have you explored Yul as a possible solution? Solidity and Fe target Yul, which itself already has formal semantics. There’s some effort in this direction (e.g. Solidity’s Wasm backend, SOLL), but admittedly it’s still early.

[Yul IR] Support for EVM 1.0, EVM 1.5 and Ewasm is planned, and it is designed to be a usable common denominator of all three platforms. It can already be used in stand-alone mode and for “inline assembly” inside Solidity and there is an experimental implementation of the Solidity compiler that uses Yul as an intermediate language. Yul is a good target for high-level optimisation stages that can benefit all target platforms equally.

Unlike the case where you have multiple VMs and push the complexity onto the actor authors, you can invest in one good backend, and now all of your Wasm byte code tools work while having guarantees that it is compatible with (e.g.) Sputnik execution. The tradeoff is that as the EVM itself evolves, you need to update this backend.

Outro

Oh no, I wrote the equivalent of a small novella again 🙈 I hope that was helpful! (Always happy to get on a call as well, btw)

expede commented 2 years ago

Tangential, but one more thing: with the rise of the EVM outside of Ethereum, it's getting to be time for an "All EVMs" working group of some kind. There's more to consider than Ethereum Mainnet these days, and it would be good to find a way for everyone to work together.

Shekelme commented 2 years ago

Will miners (storage providers or just any participant with relatively powerful computational resources in hands) be able to be validators in such a network with smart contracts? A stimulated opportunity to be a validator would be great.

rakita commented 2 years ago

I somehow stumbled here, but on the EVM topic, maybe I can interest you in a very fast and flexible EVM implementation written in Rust: https://github.com/bluealloy/revm

Either way, whatever is chosen, you can't make a mistake on the technical decision; it is more a question of whether you want your users to start writing Rust/C++ or Solidity. For example, Polkadot/Solana/Near have added some kind of support for the EVM because there is a big community of devs/code behind it that they want to leverage. That is a perfectly reasonable business decision, and the user usually does not care what is behind it.

raulk commented 2 years ago

@Shekelme becoming a block producer in the Filecoin network requires holding storage power to be listed on the power table, a weighted set from which block producers are elected through Expected Consensus. Two comments on future opportunities:

  1. With the FVM, it could theoretically be possible for storage providers to engage third parties as block producers, by delegating their stake/power to produce blocks on their behalf. There are many considerations to take into account (e.g. rogue third parties, slashing, etc.) and it may ultimately not be a good idea, but it's worth exploring. The actor would also escrow the protocol rewards, so that the incentives agreed between the provider and the delegated producer are apportioned according to the agreement. This is essentially the inverse of a "mining pool", where "miners" come together to aggregate power. In this case "miners" would offload the block production task to third parties that are aligned with incentives through an actor.
  2. ConsensusLab is a group at Protocol Labs that's researching and prototyping novel stake-based hierarchical consensus algorithms that may delink the storage power element from the consensus protocol itself, allowing any party to participate in a shard by staking FIL. Check out this talk from Filecoin Orbit. We think this is the future of Filecoin consensus, but it's still in its early days.

raulk commented 2 years ago

@rakita I like fast and flexible. Thanks for the link, I'll check it out. There's the draft EVM <> FVM mapping spec in case you have some bandwidth to take a look.

Concerning strategy, you've pretty much nailed it. The way we're thinking about supporting EVM bytecode deployment is to enable the existing smart contract devs to port over their knowledge and battle-tested/audited Solidity contracts to Filecoin without having to climb a learning curve.

Personally I regard the FVM native runtime as the primary, optimised development target, but we are shooting for a multi-VM model that enables other dev targets to run seamlessly on the Filecoin network through shimming.

rakita commented 2 years ago

@raulk it seems you covered a lot in that document, it was nice reading it, thank you.

juntao commented 2 years ago

@jbenet @raulk @expede @rakita

Hello,

My name is Michael Yuan and I am the maintainer of WasmEdge. I would like to propose a collaboration between WasmEdge and FVM projects. :) There are several compelling reasons for choosing WasmEdge as the underlying WASM engine for FVM.

WasmEdge is an official (and only) WebAssembly runtime project hosted by the Linux Foundation / CNCF. It is fully standards compliant, is among the fastest WASM engines available on the market, and provides integration SDKs for C, Rust, Go, and JavaScript. It is used in many cloud native frameworks and applications including upstream integration in the Kubernetes ecosystem. You can read its technical highlights here:

https://github.com/WasmEdge/WasmEdge/blob/master/docs/highlights.md

WasmEdge is also the most “blockchain friendly” Wasm engine. Among other things, our community is currently working on porting the Substrate framework to WasmEdge.

WasmEdge is also the leading Ewasm solution for on-chain VM solutions. It passes all the test suites EF created for Ewasm. In fact, WasmEdge-based Ewasm chains are already deployed in Substrate and Oasis ecosystems.

https://github.com/ParaState/substrate-ssvm-node https://github.com/second-state/oasis-ssvm-runtime

Furthermore, the WasmEdge community created the SOLL compiler to compile Solidity source code to Ewasm bytecode. It is a work in progress but already compiles ERC-20 / ERC-721 contracts and aims to support Uniswap v3 by the end of the year.

https://github.com/second-state/SOLL

We look forward to collaborations between our two communities!

aronchick commented 2 years ago

Hi @juntao! Thank you so much for reaching out - we're really looking forward to working together.

Our reference implementation right now uses Wasmtime and is likely to remain with that choice because it’s just easier to debug and bugfix if the entire thing is in Rust. HOWEVER, alternative FVM implementations may arise using other runtimes, and we'd love to support that. Would you be interested in exploring an alternative implementation with WasmEdge - we'd be happy to support you, including, potentially, helping to fund such an effort?

Please don't hesitate to reach out directly - david.aronchick@protocol.ai :)

AthenaPoolOfficial commented 2 years ago

Adding a VM to Filecoin will be very helpful for the Filecoin ecosystem, and it would be even better if we can further optimize the Market.