Joystream / joystream

Joystream Monorepo
http://www.joystream.org
GNU General Public License v3.0
1.42k stars 115 forks source link

Substrate coding conventions I #404

Closed bedeho closed 3 years ago

bedeho commented 5 years ago

Add your suggestion as a comment!

Background

Our Substrate code base is starting to get more complicated, and it would be a benefit to harmonise the set of major conventions we follow, so as to follow good best practices, and make reviews more efficient. The goal of this issue is to accumulate suggestions over time, as replies, which we can turn into an eventual convention document. This document can further be turned into rules for our CI linter.

Major questions that

Initial suggestions

mnaamani commented 4 years ago

Avoid type aliased u16 as extrinsic parameter type

If at all possible avoid using u16 in the runtime. If you must use it as the type for an extrinsic/dispatchable call parameter, do not type alias it. It seems to confuse polkadot-js/api and results in an invalidly signed transaction when decoded by a substrate node.

bedeho commented 4 years ago

can_ checker for extrinsics/methods with Imbalance arguments and Result return value

When a call is made to a method

fn my_method(_: Imbalance) -> Result<_,_>

then one cannot do the normal my_method(my_imbalance)? in the caller scope because there would have to be some cleanup in the client code to neutralize the imbalance side-effect. One way to resolve this is to introduce a new

fn ensure_can_call_my_method(_: Imbalance) -> Result<X,_>

which simply checks whether the call to my_method work (i.e. return Ok). Notice that the return type X probably should be introduced, in order to make ensure_can_call_my_method reusable by my_method, which requires exposing relevant information read & processed from storage.

bedeho commented 4 years ago

Aggressively derive standard traits for public types

If types are needed in some client code that lives in another crate, the Rust coherence rule will make life hard for this downstream developer. So just derive as more than what immediately comes to mind, e.g. #[derive(Eq, PartialEq, Clone, Debug)]

bedeho commented 4 years ago

RuntimeDebug over Debug trait

[derive(RuntimeDebug)] also derives Debug for you. But for a wasm build this resolves to a Debug implementation that does not print anything, but it makes the compiler happy

bedeho commented 4 years ago

Parametrise with T::Trait

When defining a new type, always parametrize by therruntime module trait Trait, not the individual types which will later be selected, for the following two reasons

  1. The signatures which result become extremly long, e.g.

    fn ensure_curator_application_exists(curator_application_id: &CuratorApplicationId<T>) -> Result<(
        CuratorApplication<T::AccountId, CuratorOpeningId<T>, T::MemberId, T::ApplicationId>, 
        CuratorApplicationId<T>,
        CuratorOpening<T::OpeningId, T::BlockNumber, BalanceOf<T>, CuratorApplicationId<T>>), &'static str> 
  1. Subtle minor mistakes in instantiation result in error messages which are so long and opaque that it dwarfs any possible benefit.
bedeho commented 4 years ago

Event policy

  1. Any code path which mutates state must have an event emitted in it, hence, all extrinsics must have an even triggered.

  2. The playload for an event must be the minimal number of parameters such that anyone who knows the state would be able to compute the new state solely from the looking at the extrinsic (and its payload) and the event payload.

The main reason for the second point is to make it practical for any non-full node, such as indexing services and light clients to keep up with the state transitions of each individual transaction. Talking to the native API of a full node, even an archiving node, only allows you to access the state as it is after each full block.

bedeho commented 4 years ago

Reusable GenesisConfig builders

As we start writing new modules that rely on existing modules that we have written, the new tests may increasingly require the capability to set up initial state conditions for these dependent modules. To avoid having to redo this multiple times, it's better that each module always assumes their genesis configuration building to be a service for external consumption. This builder should not be in the mock module, as not really mocking, and its not even only for tests. Prior we have only had a builder for the entire test externality, but this builder level is not reusable, it must be the genesis configuration.

bedeho commented 4 years ago

Tests must assert events being emitted.

bedeho commented 4 years ago

StorageValues of type Option<_> not configurable

If they are, then then GenesisConfig will have _::default() value, which will almost always have some unintended semantic. Instead, if tests or genesis block require some explicit value, then it must be set manually. Instead, introduce a new extra genesis field which the storage module-level build functino can key off.

https://github.com/Joystream/substrate-runtime-joystream/pull/103/commits/438c031bfd429cc1f01e77ba8bf35c6c319dc138

shamil-gadelshin commented 4 years ago

I have a proposition regarding invariant asserts in substrate modules (on hiring example).

We have multiple entries of assert!(...) in our code. In worst case small error deeply rooted in the code will cause repeating crush of the whole platform (and we are not talking about malicious users yet).

Parity warns us about panics: https://substrate.dev/docs/en/tutorials/tcr/tcr-best-practices

Never panic - Your runtime module functions should not panic at all. Panics can make the chain prone to attacks. If a runtime function panics during execution, all state changes will be reverted. The user will not be charged anything for this execution. This can be repeated and thus can become an attack vector. The user can do a DoS attack on a panicking node because they won't be charged for any runtime execution that panics. The best approach is to detect situations where a panic may occur later and early-exit in a graceful manner to minimize computation and state inconsistencies. If it's impossible to detect a potential panic without first doing substantial computation, then ensure that the transactor first pays some fee (that can possibly be returned if all goes well) before that computation is done.

Consensys suggests the solution: https://consensys.github.io/smart-contract-best-practices/software_engineering/#circuit-breakers-pause-contract-functionality

Circuit Breakers (Pause contract functionality) Circuit breakers stop execution if certain conditions are met, and can be useful when new errors are discovered. For example, most actions may be suspended in a contract if a bug is discovered, and the only action now active is a withdrawal. You can either give certain trusted parties the ability to trigger the circuit breaker or else have programmatic rules that automatically trigger the certain breaker when certain conditions are met.

Example from hiring:

fn get_opt_stake_amount(stake_id: Option<T::StakeId>) -> BalanceOf<T> {
        stake_id.map_or(<BalanceOf<T> as Zero>::zero(), |stake_id| {
            // INVARIANT: stake MUST exist in the staking module
            assert!(<stake::Stakes<T>>::exists(stake_id));

            let stake = <stake::Stakes<T>>::get(stake_id);

            match stake.staking_status {
                // INVARIANT: stake MUST be in the staked state.
                stake::StakingStatus::Staked(staked_state) => staked_state.staked_amount,
                _ => panic!("stake MUST be in the staked state."),
            }
        })
    }

Two invariants in this small function:

If we got hierarchy:

Possible error in Stake logic will crush everything.

What if we limit the possible effect?

I suggest to introduce a concept of broken entity state.

In case of broken stake, stake and application become broken and should be discarded using the ensures. Other entities will remain active. We would have working current opening and other openings as well as remaining platform.

Worst case scenario for hiring - major invariant breaks - whole module becomes inactive as well as Content Directory Working Group module, but other modules continue to work!

We can check the invariant and in broken case save entity_id and the reason.

fn get_opt_stake_amount(stake_id: Option<T::StakeId>) -> Result<..> {
        stake_id.map_or(<BalanceOf<T> as Zero>::zero(), |stake_id| {

            ensure_not_broken!(stake_id);

        // INVARIANT: stake MUST exist in the staking module
            if (!<stake::Stakes<T>>::exists(stake_id)) {
                let reason = "stake {stake_id} does not exit";
                set_entity_as_broken_and_return_error!(stake_id, reason);
            }

            let stake = <stake::Stakes<T>>::get(stake_id);

            match stake.staking_status {
                // INVARIANT: stake MUST be in the staked state.
                stake::StakingStatus::Staked(staked_state) => Ok(staked_state.staked_amount),
                _ => {
                    let reason = "stake {stake_id} status is not Staked";
                    set_entity_as_broken_and_return_error!(stake_id, reason);
                }},
            }
        })
    }

Downsides:

shamil-gadelshin commented 4 years ago

Do we really need to ensure_* macros to be exported?

#[macro_export]
macro_rules! ensure_opt_unstaking_period_is_ok {...}

I didn't find any macro usage of hiring module in Content Director Working Group module.

As well as no stake::ensure_* calls in hiring module regarding to stake module macro exports.

I suggest to use this notation to use macro in lib.rs:

#[macro_use]
mod macroes;

And delete all unnecessary #[macro_export] directives.

It allows to reduce API surface for the module and improve module incapsulation.

Related question: https://stackoverflow.com/questions/26731243/how-do-i-use-a-macro-across-module-files

bedeho commented 4 years ago

I do not recall the use case for that particular macro, but if its not plausible that it be used in external dependency, such as a test or client code, then yeah just make non-public. I think this may have been me being sloppy because I did not take the time to clarify the details of macro visibility, I think it was quite nuanced.

Also in general, try to add entries that are suggestions for general DOs and DONTs, this post borders on using the issue just as a general QA thread. I think that there is no point to have a rule about this visibility issue: only things that are useful to the public should be visible. Whether this particular macro is so is not important.

shamil-gadelshin commented 4 years ago

I place it in the Coding Convention Suggestions because it affects not only hiring module, but other parts of the system as well. Stake module exports macro as well. It looks for me as general enough.

bedeho commented 4 years ago

bitflags! for key type of maps

For some reason, Substrate appears to not support encoding enums as map keys.

Rather than doing some manual fix, try using this macro

https://docs.rs/bitflags/1.2.1/bitflags/

bedeho commented 4 years ago

Do panics & asserts, but only in debug mode

There are a set of asserts that only are live in debug mode, specifically

Having asserts and panics in runtime code, but only active in debug mode, has two benefits

1.Clarity: It makes it easier for the reader to be aware of the relevant invariants at play. 2.Loud failure: it makes it more likely to detect bugs when running integration tests or similar complex regression testing.

While having panics is discouraged in production in the official Substrate docs, really only because it makes the system vulnerable to possible DoS attacks, this mixed approach of debug only asserts seems like a perfectly safe and productive convention.

This idea needs peer-review from Parity devs

bedeho commented 4 years ago

WIP: Struct fields of dynamic size should be lifted into top level storage maps

Note: Writing this down as a WIP convention, not totally sure yet.

Take the example where you have


struct Person {
..
 payload: Vec<u8>,
}

/// Storage
Persons get(persons): map Id => Perspon;

In this case, in particular if payload can get quite big over time, and even more so if payload is not read frequently in any extrinsics, then instead do


struct Person {
..
}

type PersonPayload = Vec<u8>;

/// Storage
Persons get(persons): map Id => Perspon;

PersonPayloadById get(person_payload_by_id) : map Id => PersonPayload

This is a few possible upsides

The main con is that reduces type safety by introducing a new invariatn tht must be maintained, namely that the key set of Persons and PersonPayloadById must be in synch.

siman commented 4 years ago

WIP: Struct fields of dynamic size should be lifted into top level storage maps

The next step would be to move payloads to offchain storage :) In this case, all advantages that @bedeho specified for XPayloadById would be even stronger.

shamil-gadelshin commented 4 years ago

Sounds reasonable.

But what if expected dynamic-size field is insignificant when compared to overall struct size? Should we always use this pattern to provide consistent approach to dynamic-size fields or there will be some empirical rules (limits), eg. if possible dynamic-size field size greater than 20 bytes then use pattern (or greater than size of remaining struct)?

bedeho commented 4 years ago

But what if expected dynamic-size field is insignificant when compared to overall struct size? Should we always use this pattern to provide consistent approach to dynamic-size fields or there will be some empirical rules (limits), eg. if possible dynamic-size field size greater than 20 bytes then use pattern (or greater than size of remaining struct)?

This was my initial thought also, however in most cases, the length of the field is controlled by a separate tunable storage parameter, in which case one cannot really have any guarantees at implementation time. I suppose the use case may have some very strong implications, say for example if it's a human-readable user name, that is unlikely to ever be more than 20 bytes perhaps. So I am not sure, perhaps we should have a convention which is sensitive to this.

The next step would be to move payloads to off-chain storage :) In this case, all advantages that @bedeho specified for XPayloadById would be even stronger.

This def. has very strong merit, it's just the practical complexity of making sure that you have robust off-chain infrastructure which will always have it available. Perhaps we can find some sort of general solution for this at a later time, e.g. using this when it is ready

https://github.com/playproject-io/datdot

Alternatively, we could just use the built-in storage system in Joystream, but I do worry about making that complicated in order to cover many disparate uses cases.

shamil-gadelshin commented 4 years ago

How to test on_finalize() in runtime modules

When we test on_finalize() or other methods in runtime modules sometimes we need to fast forward several blocks.

As Parity suggests we should create and use run_to_block() function:

fn run_to_block(n: u64) {
    while System::block_number() < n {
        ExampleModule::on_finalize(System::block_number());
        System::on_finalize(System::block_number());
        System::set_block_number(System::block_number() + 1);
        System::on_initialize(System::block_number());
        ExampleModule::on_initialize(System::block_number());
    }
}

on_finalize() invocation can be:

fn run_to_block_and_finalize(n: u64) {
    run_to_block(n);
    <Module as OnFinalize<u64>>::on_finalize(n);
}
shamil-gadelshin commented 4 years ago

Runtime module Cargo.toml header

I suggest to create a standard way of composing Cargo.toml header for runtime modules:

[package]
name = 'substrate-NAME-module'
version = '1.0.0'
authors = ['Joystream contributors']
edition = "2018"

Each module has separate versioning. Name should have 'substrate' prefix which is an indicator of reusability as substrate framework module.

shamil-gadelshin commented 4 years ago

Monorepo after Rome

We should start gathering runtime modules in monorepo after the Rome release. Dependencies between crates can be resolved with relative paths.

Example from Parity Substrate runtime.

juniuszhou commented 4 years ago

Define the all data member of struct as public in runtime module

When I implemented the data migration, I found I can't access some data member of struct defined in forum module because they are private. They are not accessible by other module.

juniuszhou commented 4 years ago

Name the runtime module with version

When I implemented the data migration, it is better to give the new forum name with version info. Since version not given in old module, I don't how to name the new one. Upgrade may be frequently happened in the future, so it is easier to track the changes.

bedeho commented 4 years ago

This has gotten too long, will summarize into a document to be introduced in this PR

https://github.com/Joystream/substrate-runtime-joystream/pull/169

, and then we can open a new Issue for new ideas.