define rules and cleanup dependency tree to enable sustainable releases in the future

stellar / rs-soroban-env

Rust environment for Soroban contracts.

Apache License 2.0


define rules and cleanup dependency tree to enable sustainable releases in the future #289

Closed MonsieurNicolas closed 1 year ago

MonsieurNicolas commented 2 years ago

To expedite development we've postponed looking at the dependency tree with "sustainability" in mind.

In particular we have not defined goals (potentially conflicting) for the different parts of the code.

Something like:

Code that gets deployed at the core layer (like host functions and native contracts) has the most extreme restrictions and has to go through the CAP process. In particular:
- we have to guarantee backward compatibility at the "bit level" (we cannot change behavior or meta for historical transactions)
- highest bar for security and performance review (this has implications for dependencies, which have to be reviewed every time they get revved)
- "never panic" (especially relevant to "native contract" type code that depends on larger parts of the SDK)
- release cadence is probably at most "quarterly" (with the possibility of an emergency security release, which adds a layer of "branching")

SDK updates that may impact contracts:
- certain changes may be reflected in a "SEP" when defining "APIs" or semantics expected to be maintained across contracts (potentially of different versions)
- dependency monitoring on a case-by-case basis
- shipping cadence ~monthly

Code that gets deployed as part of the SDK to support the "developer experience":
- test, fuzzing, and performance related code, for example, or code to support the IDE
- no impact on contracts themselves
- no particular restriction on shipping (can be nightly)

Those goals need to be reconciled with how the various components are organized, how they depend on each other (some dependency injection/inversion may be needed), and how they are released.
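One way to make the three tiers above concrete is to write them down as data. The following is only a sketch: every type, variant, and field name is hypothetical and nothing here comes from the rs-soroban-env codebase. Where the list does not state a rule for a tier, the sketch records `Unspecified` or `false` rather than guessing.

```rust
// Illustrative only: all names below are hypothetical.

/// The three tiers described in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Layer {
    Core,
    SdkContractFacing,
    SdkDeveloperExperience,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChangeProcess {
    Cap,
    SepForSharedApis,
    Unspecified,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DependencyReview {
    EveryRevision,
    CaseByCase,
    Unspecified,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReleaseCadence {
    Quarterly,
    Monthly,
    Nightly,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Policy {
    change_process: ChangeProcess,
    dependency_review: DependencyReview,
    /// True only where the list states a "never panic" rule.
    never_panic: bool,
    /// True only where the list requires bit-level backward compatibility.
    bit_level_compat: bool,
    cadence: ReleaseCadence,
}

/// Maps each tier to the expectations listed for it.
fn policy_for(layer: Layer) -> Policy {
    match layer {
        Layer::Core => Policy {
            change_process: ChangeProcess::Cap,
            dependency_review: DependencyReview::EveryRevision,
            never_panic: true,
            bit_level_compat: true,
            cadence: ReleaseCadence::Quarterly,
        },
        Layer::SdkContractFacing => Policy {
            change_process: ChangeProcess::SepForSharedApis,
            dependency_review: DependencyReview::CaseByCase,
            never_panic: false,
            bit_level_compat: false,
            cadence: ReleaseCadence::Monthly,
        },
        Layer::SdkDeveloperExperience => Policy {
            change_process: ChangeProcess::Unspecified,
            dependency_review: DependencyReview::Unspecified,
            never_panic: false,
            bit_level_compat: false,
            cadence: ReleaseCadence::Nightly,
        },
    }
}

fn main() {
    for layer in [
        Layer::Core,
        Layer::SdkContractFacing,
        Layer::SdkDeveloperExperience,
    ] {
        println!("{:?}: {:?}", layer, policy_for(layer));
    }
}
```

Keeping the rules in one table like this would let later automation (dependency checks, release tooling) read the same source of truth, assuming each crate can be assigned to exactly one tier.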

MonsieurNicolas commented 2 years ago

Looking at a recent CVE in Go's bignum package, we may also want to ban certain APIs (those that tend to create certain classes of security issues) for code that ends up at the core layer. For example, anything related to marshalling from/to strings, as code in that space tends to do "way too much stuff" and it's probably not practical to code review all that code anyway. We'll need a linter of sorts to help ensure that those crates are "pure" in that respect.
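As a rough illustration of what such a check could look like, here is a deliberately naive, text-based sketch. The deny-list contents and the function name `find_banned_uses` are hypothetical. A real linter would work on resolved paths rather than substrings (Clippy's `disallowed_methods` lint is one existing tool in that space), so this only shows the shape of the rule, not a usable implementation.

```rust
/// Hypothetical deny-list of string-marshalling entry points for core-layer crates.
const BANNED_PATHS: &[&str] = &[
    "serde_json::from_str",
    "serde_json::to_string",
    ".parse::<",
];

/// Returns (1-based line number, banned pattern) for every match in `source`.
/// Text after `//` on a line is ignored. The match is a plain substring test,
/// so it can both miss uses (for example via `use` aliases) and over-report.
fn find_banned_uses(source: &str) -> Vec<(usize, &'static str)> {
    let mut hits = Vec::new();
    for (index, line) in source.lines().enumerate() {
        // Keep only the part of the line before a line comment, if any.
        let code = line.split("//").next().unwrap_or("");
        for banned in BANNED_PATHS.iter().copied() {
            if code.contains(banned) {
                hits.push((index + 1, banned));
            }
        }
    }
    hits
}

fn main() {
    let source = "fn parse_amount(s: &str) -> u64 {\n    s.parse::<u64>().unwrap()\n}\n";
    for (line, banned) in find_banned_uses(source) {
        println!("line {}: uses banned API `{}`", line, banned);
    }
}
```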

leighmcculloch commented 2 years ago

> Code that gets deployed as part of the SDK to support the "developer experience"

The test experience in the SDK is largely supported by the soroban-env-host crate, because the meaningful logic supporting testing is the Host. While there is some functionality to support testing in the SDK, it is so minimal that I don't think it is worth the overhead of engineering a separation for it.

The above mostly discusses the distinctions between the core layer and the SDK. However, there are layers above the SDK that are much easier to separate from the core concerns. Most functionality in the SDK can be broken up into three parts:

- A thin layer on top of unsafe operations. This is very delicate and requires an intimate knowledge of the host. We should be very careful about how we modify it.
- Generation of contract specs and type conversions. We have to be careful about compatibility with this code.
- A thin layer of test utilities built on top of the soroban_env_host::Host. This is tightly coupled. The SDK is a consumer of the Host API in much the same way that Core is. It is extremely important that the SDK's use of the Host is as consistent as possible with Core's, because a contract's ability to test reliably with it depends on this.

Overall, there seems to me to be an inherent tight coupling between the concerns of the SDK and the env crates, and significant coordination between the two, even though solid interfaces exist between them.

However, if we step away from the SDK, there are plenty of other projects that are easily distanced from the guarantees of these components. For example, soroban-rpc, soroban-cli, dapp development, and all of the client libs are very far removed from the guarantees we need to provide in core or the SDK. It's much easier to iterate independently on those layers.

Here's a diagram illustrating existing dependencies and the types of concerns I see showing up in different areas:

[Diagram: soroban-whats-been-built-and-areas (Excalidraw)]

cc @graydon @tsachiherman @ire-and-curses

MonsieurNicolas commented 1 year ago

> The test experience in the SDK is largely supported by the soroban-env-host crate, because the meaningful logic supporting testing is the Host. While there is some functionality to support testing in the SDK, it is so minimal that I don't think it is worth the overhead of engineering a separation for it.

Currently yes (and I hope we can keep it that way). What I envision is that there should be things built on top: that's why I mentioned fuzzers, but I also imagine things like special hooks to facilitate verification.

In any case, what I think we need to do here is actually codify expectations for the different layers that @leighmcculloch identified so that we can properly trim & enforce the type of dependencies we allow in each crate (automation may follow, we first need to define the rules). I think this has some implications on code review & code pinning strategies and potentially on testing strategies too (core level changes are extremely heavy to implement and test and we don't want this outside of core).
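Once rules exist, the enforcement step could be as simple as comparing each crate's declared dependencies against an allowlist for its layer. The sketch below assumes such an allowlist; the function name and the example allowlist are hypothetical, and no actual rule set has been agreed in this thread.

```rust
use std::collections::BTreeSet;

/// Returns the declared dependencies that are not on the allowlist,
/// in the order they were declared.
fn disallowed_dependencies<'a>(declared: &[&'a str], allowed: &[&str]) -> Vec<&'a str> {
    let allowed: BTreeSet<&str> = allowed.iter().copied().collect();
    declared
        .iter()
        .copied()
        .filter(|name| !allowed.contains(*name))
        .collect()
}

fn main() {
    // Hypothetical allowlist for a core-layer crate.
    let allowed = ["static_assertions", "sha2"];
    // Hypothetical declared dependencies for that crate.
    let declared = ["static_assertions", "serde_json"];

    for name in disallowed_dependencies(&declared, &allowed) {
        println!("dependency `{}` is not allowed in this layer", name);
    }
}
```

A check like this only covers direct dependencies; anything pulled in indirectly would need the resolved dependency graph as input.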

leighmcculloch commented 1 year ago

> trim & enforce the type of dependencies we allow in each crate

The real challenge with being selective will not be the direct dependencies we've picked, of which there are a small and reasonable number, but the deep tree of transitive dependencies.

For example, these are the direct dependencies we have today:

- stellar-xdr has base64, serde, serde_with, hex, arbitrary, clap, serde_json, thiserror. About half of these are excluded from the builds that stellar-core runs, as they are only used for the stellar-xdr CLI.
- soroban-env-common has wasmi, serde, static_assertions, ethnum, arbitrary, num-traits, num-derive.
- soroban-env-guest has static_assertions.
- soroban-env-macros has syn, quote, proc-macro2, itertools, serde, serde_json.
- soroban-native-sdk-macros has syn, quote, proc-macro2, itertools.
- soroban-env-host has wasmi, static_assertions, sha2, ed25519-dalek, curve25519-dalek, rand, rand_chacha, hex, num-traits, num-integer, num-derive, log, backtrace, k256, getrandom, sha3, tracy-client, env_logger, itertools, tabwriter, thousands, textplots, wasmprinter, expect-test, more-asserts, linregress.
- soroban-sdk has arbitrary, ed25519-dalek, rand, ctor, hex, proptest, proptest-arbitrary-interop.

In total that's 40 unique dependencies.
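The count can be reproduced by de-duplicating the per-crate lists. The sketch below transcribes the lists from this comment into a constant; the constant and function names are made up for illustration.

```rust
use std::collections::BTreeSet;

/// Direct dependencies per crate, transcribed from the lists above.
const DIRECT_DEPENDENCIES: &[(&str, &[&str])] = &[
    ("stellar-xdr", &["base64", "serde", "serde_with", "hex", "arbitrary", "clap",
        "serde_json", "thiserror"]),
    ("soroban-env-common", &["wasmi", "serde", "static_assertions", "ethnum",
        "arbitrary", "num-traits", "num-derive"]),
    ("soroban-env-guest", &["static_assertions"]),
    ("soroban-env-macros", &["syn", "quote", "proc-macro2", "itertools", "serde",
        "serde_json"]),
    ("soroban-native-sdk-macros", &["syn", "quote", "proc-macro2", "itertools"]),
    ("soroban-env-host", &["wasmi", "static_assertions", "sha2", "ed25519-dalek",
        "curve25519-dalek", "rand", "rand_chacha", "hex", "num-traits", "num-integer",
        "num-derive", "log", "backtrace", "k256", "getrandom", "sha3", "tracy-client",
        "env_logger", "itertools", "tabwriter", "thousands", "textplots",
        "wasmprinter", "expect-test", "more-asserts", "linregress"]),
    ("soroban-sdk", &["arbitrary", "ed25519-dalek", "rand", "ctor", "hex", "proptest",
        "proptest-arbitrary-interop"]),
];

/// Collects every dependency name once, regardless of how many crates use it.
fn unique_dependencies(
    crates: &[(&'static str, &'static [&'static str])],
) -> BTreeSet<&'static str> {
    let mut unique = BTreeSet::new();
    for (_crate_name, deps) in crates {
        for dep in deps.iter() {
            unique.insert(*dep);
        }
    }
    unique
}

fn main() {
    let unique = unique_dependencies(DIRECT_DEPENDENCIES);
    println!("{} unique direct dependencies", unique.len());
}
```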

Contrast that with 172 total dependencies in the rs-soroban-env repo.
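The gap between the two numbers comes from transitive dependencies: each direct dependency brings its own dependencies, and so on. A minimal sketch of computing that closure over a dependency graph follows. The graph here is a toy with made-up crate names, not the real rs-soroban-env tree, and the function names are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every crate reachable from `root`, excluding `root` itself
/// unless the graph contains a cycle back to it.
fn transitive_dependencies<'a>(
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    root: &str,
) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    // Start from the direct dependencies of the root.
    let mut stack: Vec<&'a str> = graph.get(root).cloned().unwrap_or_default();
    while let Some(name) = stack.pop() {
        // `insert` returns false for crates already visited, which also
        // guards against cycles.
        if seen.insert(name) {
            if let Some(children) = graph.get(name) {
                stack.extend(children.iter().copied());
            }
        }
    }
    seen
}

/// Toy graph with hypothetical crate names.
fn toy_graph() -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut graph = BTreeMap::new();
    graph.insert("my-crate", vec!["a", "b"]);
    graph.insert("a", vec!["c"]);
    graph.insert("b", vec!["c", "d"]);
    graph
}

fn main() {
    let graph = toy_graph();
    let direct = graph.get("my-crate").map(|deps| deps.len()).unwrap_or(0);
    let total = transitive_dependencies(&graph, "my-crate").len();
    println!("{} direct, {} total", direct, total);
}
```

In practice the graph would come from Cargo's resolved dependency data rather than being written by hand.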

We might be able to trim some of these. For example, the hex, num-traits, num-integer, num-derive, and itertools libraries could probably be replaced with a small amount of our own logic if necessary. I'm not at all confident about the remainder, though.