Extract over-the-wire types into a separate library

mrmr1993 commented 2 years ago

Currently, the types that we send over the network and between processes are scattered across the codebase, bundled together with code that isn't necessarily useful to consumers of the type. This causes a lot of inter-dependence between libraries, and a lot of code to be linked into smaller / single-purpose executables -- in particular, the Mina signer and SnarkyJS.

We also have poor visibility into the end-to-end structure of the messages sent over the protocol: discovering the actual data requires a lot of manual searching.

To combat these issues, we can factor out the 'over-the-wire' types into a dedicated library (e.g. Mina_wire_types) that contains these types only. To keep this library as minimal as possible, it would be best to avoid adding derivers to these types too, since they can be added via aliases later. For example, mina_wire_types.ml:

module Transaction = struct
  type t = ..
end

transaction.ml:

type t = Mina_wire_types.Transaction.t = .. [@@deriving foo, bar]
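A minimal, self-contained sketch of the aliasing pattern above. The Wire_types module is only a stand-in for Mina_wire_types, the record fields are hypothetical, and the hand-written equal and to_string functions stand in for what [@@deriving ...] would generate, so that the sketch runs without any ppx:

```ocaml
(* Stand-in for mina_wire_types.ml: type definitions only, no code. *)
module Wire_types = struct
  module Transaction = struct
    (* Hypothetical fields, for illustration only. *)
    type t = { fee : int; memo : string }
  end
end

(* Stand-in for transaction.ml: re-export the wire type and attach code. *)
module Transaction = struct
  (* The re-export keeps [t] equal to the wire type while making the
     record fields available under this module. With ppx, the
     [@@deriving ...] attribute would be attached to this declaration. *)
  type t = Wire_types.Transaction.t = { fee : int; memo : string }

  (* Hand-written equivalents of derived functions. *)
  let equal (a : t) (b : t) = a.fee = b.fee && String.equal a.memo b.memo

  let to_string { fee; memo } = Printf.sprintf "{fee=%d; memo=%s}" fee memo
end

(* A value built against the wire library alone... *)
let wire_value : Wire_types.Transaction.t =
  { Wire_types.Transaction.fee = 1; memo = "hi" }

(* ...is accepted by the functions defined next to the alias. *)
let rendered = Transaction.to_string wire_value
```

A consumer that only needs the shape of the type can depend on the stand-in Wire_types module alone, while code that needs the functions uses Transaction; both agree on the type.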

We also have some private/opaque types scattered about the protocol, most notably those from Currency. For these, it might make sense to have some kind of 'revealing' functor, e.g. mina_wire_types.ml:

module type Signature = sig
  module Mk : functor (A : sig type t end) -> sig module type S end
end

module Amount = struct
  type t = Unsigned.UInt64.t
end

module Mk_amount
    (Signature : Signature)
    (F : functor (A : sig type t = Unsigned.UInt64.t end) -> Signature.Mk(A).S)
  : Signature.Mk(Amount).S =
  F(Amount)

mina_wire_types.mli:

module type Signature = sig
  module Mk : functor (A : sig type t end) -> sig module type S end
end

module Amount : sig
  type t = private Unsigned.UInt64.t
end

module Mk_amount
    (Signature : Signature)
    (F : functor (A : sig type t = Unsigned.UInt64.t end) -> Signature.Mk(A).S)
  : Signature.Mk(Amount).S

amount.ml:

module Signature = struct
  module Mk (A : sig type t end) = struct
    module type S = Intf.S with type t = A.t
  end
end
include Mina_wire_types.Mk_amount
    (Signature)
    (functor (A : sig type t = Unsigned.UInt64.t end) -> struct
      type t = A.t [@@deriving foo, bar]
    end)
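The three snippets above can be combined into one runnable sketch. This is an illustration under assumptions, not the final design: Wire_types stands in for Mina_wire_types, int64 replaces Unsigned.UInt64.t so that no external library is needed, and the of_int64 and add functions are hypothetical members of the signature.

```ocaml
(* Stand-in for mina_wire_types.ml and its .mli. *)
module Wire_types : sig
  module type Signature = sig
    module Mk : functor (A : sig type t end) -> sig module type S end
  end

  (* Consumers see a private type: readable, but not constructible. *)
  module Amount : sig
    type t = private int64
  end

  module Mk_amount
      (Signature : Signature)
      (F : functor (A : sig type t = int64 end) -> Signature.Mk(A).S) :
    Signature.Mk(Amount).S
end = struct
  module type Signature = sig
    module Mk : functor (A : sig type t end) -> sig module type S end
  end

  module Amount = struct
    type t = int64
  end

  (* Inside the library the representation is known, so [F] can be
     applied to [Amount]; the result is typed against the private view. *)
  module Mk_amount
      (Signature : Signature)
      (F : functor (A : sig type t = int64 end) -> Signature.Mk(A).S) :
    Signature.Mk(Amount).S =
    F (Amount)
end

(* Stand-in for amount.ml: code written against the concrete type. *)
module Amount = struct
  module Signature = struct
    module Mk (A : sig type t end) = struct
      module type S = sig
        type t = A.t

        val of_int64 : int64 -> t

        val add : t -> t -> t
      end
    end
  end

  include
    Wire_types.Mk_amount
      (Signature)
      (functor
        (A : sig type t = int64 end)
        ->
        struct
          type t = A.t

          let of_int64 (x : int64) : t = x

          let add = Int64.add
        end)
end

(* Values are built through [Amount] but have the wire type. *)
let total : Wire_types.Amount.t =
  Amount.add (Amount.of_int64 2L) (Amount.of_int64 3L)

(* A private type can still be read via coercion. *)
let raw : int64 = (total :> int64)
```

In this sketch the constructor lives in Amount, and the opaque type itself lives in the wire library, so a consumer that links only the wire library can still name and read the type.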

[x] Proof of concept for Currency.Amount.t: https://github.com/MinaProtocol/mina/pull/11256 & https://github.com/MinaProtocol/mina/pull/11326
[x] Apply design to all types up to Mina_transaction

robinbb commented 2 years ago

@Firobe You are the likely assignee of this work, and this work is deemed to be of higher priority than the issues that you are currently working on, FYI. (This is because it is a blocker for other work which must start in ~2 weeks.)

Firobe commented 2 years ago

@mrmr1993 I have a few questions:

if I want to do this, how do I discover what types should end up here? I assume they are a subset of all versioned types. I see some modules have Wire submodules (or used to) but not always.
I understand your examples for private/abstract types but I fail to see the point of having such types in the library in the first place. In your example, no value of type Mina_wire_types.Amount.t can ever be constructed, so values obtained through Mk_amount won't be compatible with functions across the codebase expecting to consume/deconstruct a Mina_wire_types.Amount.t. Maybe it would be clearer if I had an idea of how you expect such private/abstract types would be used after being separated in their own library.
is the idea of the "revealing" functors just to check that the concrete type in amount.ml matches the one in mina_wire_types.ml? (But then, why even have concrete types in mina_wire_types.ml, except for documentation?)

Also a small remark, in your examples as is, functions cannot be derived here

(functor (A : sig type t = Unsigned.UInt64.t end) -> struct
      type t = A.t [@@deriving foo, bar]
    end)

since ppx_deriving is looking for A.pp (for show, for example) and isn't smart enough to follow the type equation to Unsigned.UInt64.pp. But this can probably be worked around via other means, depending on the kind of t.

mrmr1993 commented 2 years ago

if I want to do this, how do I discover what types should end up here?

It would be good to start with the gossip types, which are currently Network_pool.Transaction_pool.Diff_versioned.Stable.V2.t, Network_pool.Snark_pool.Diff_versioned.Stable.V2.t, and Mina_block.External_transition.Raw.Stable.V2.t (per here). Those types include most of the important data and structure in the protocol, but it would also be nice to add RPC types, which top out as the request/response types in Mina_networking (e.g. here).

I understand your examples for private/abstract types but I fail to see the point of having such types in the library in the first place.

The point is, they are already private types. The idea of this library is that it will be a source of truth for types, but containing no code. Ideally, the only changes across the rest of the codebase would be to add = Mina_wire_types.Foo.t to the type declarations along the path.

In your example, no value of type Mina_wire_types.Amount.t can ever be constructed, so values obtained through Mk_amount won't be compatible with functions across the codebase expecting to consume/deconstruct a Mina_wire_types.Amount.t

Those functions should still live in the Currency.Amount submodule. The intention of Mk_amount was to allow that code to be written over UInt32.t/UInt64.t as before, but then using that functor to make the top type equal to Mina_wire_types.Amount.t rather than making them opaque at the Currency.Amount level. In practice then, any opaque type that previously had a constructor will still have one, but the opaque type itself has been lowered to the Mina_wire_types library.

Also a small remark, in your examples as is, functions cannot be derived here

...

since ppx_deriving is looking for A.pp (for show, for example) and isn't smart enough to follow the type equation to Unsigned.UInt64.pp.

Indeed, using Unsigned.UInt64.t (or Unsigned_extended.UInt64.t) in the body of the functor is the right way to give ppx_deriving the correct paths. The code will already be doing this, so this should only really mean wrapping it in a functor to pass to Mk_amount and friends.
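To make the path issue concrete, here is a small sketch without any ppx. The U64 module is a hypothetical stand-in for Unsigned.UInt64 (backed by int64), and the hand-written pp and show only approximate what a show deriver would generate:

```ocaml
(* Hypothetical stand-in for Unsigned.UInt64, including a printer. *)
module U64 = struct
  type t = int64

  let pp fmt (x : t) = Format.fprintf fmt "%Ld" x
end

module Body (A : sig type t = U64.t end) = struct
  (* Writing [type t = A.t [@@deriving show]] would make the deriver emit
     a call to [A.pp], which the argument signature does not provide.
     Naming the concrete module gives it a path where the printer exists;
     the equation in [A]'s signature keeps the two types equal. *)
  type t = U64.t

  (* Roughly what the deriver would produce for [U64.t]. *)
  let pp = U64.pp

  let show (x : t) = Format.asprintf "%a" pp x

  (* Evidence that [t] and [A.t] are the same type. *)
  let of_a (x : A.t) : t = x
end

module M = Body (struct type t = U64.t end)

let shown = M.show (M.of_a 42L)
```

The functor argument is then only needed for the type equation, not for finding derived functions.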

Since Currency.Amount.t is essentially a base type in the Mina hierarchy, it probably makes sense to start with a PR attempting that as an experiment.

Firobe commented 2 years ago

I've settled on a design similar to yours and had no problems unifying abstract and concrete types in experiments on a proof of concept. I'm now working on a PR for Currency.Amount.

Firobe commented 2 years ago

Proof of concept for Currency.Amount is validated and now merged: https://github.com/MinaProtocol/mina/pull/11256

Now working on applying the same design to the rest of the types

Firobe commented 2 years ago

Full list (for compatible) of the types needed

?: children not visited yet
%: already visited children
$: repeat prefix above
=: directly equal to parent

Leaves only depend on native (or external) types
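As a reading aid for the $ notation, here is a small sketch of how an entry could be expanded against its parent. It assumes that the i-th leading $ stands for the i-th component of the parent's path; that reading is inferred from the entries in the list and is not stated in the legend. The expand function is hypothetical.

```ocaml
(* Assumption: the i-th leading "$" in a child entry repeats the i-th
   dot-separated component of the parent's (already expanded) path. *)
let expand ~parent child =
  let rec go parent_parts child_parts =
    match (parent_parts, child_parts) with
    | p :: ps, "$" :: cs -> p :: go ps cs
    | _, _ -> child_parts
  in
  String.concat "."
    (go (String.split_on_char '.' parent) (String.split_on_char '.' child))

let example =
  expand ~parent:"Mina_base.User_command.t" "$.$.Poly.Stable.V1.t"
```

Under that assumption, the second entry of the list would read Mina_base.User_command.Poly.Stable.V1.t.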

Compatible


Network_pool.Transaction_pool.Diff_versioned.V1.t - Mina_base.User_command.t - $.$.Poly.Stable.V1.t - $.Signed_command.Stable.V1.t - $.$.Poly.Stable.V1.t - $.Signed_command_payload.Stable.V1.t - $.$.Poly.Stable.V1.t - $.$.Common.Stable.V1.t - $.$.$.Poly.Stable.V1.t - Currency.Fee.Stable.V1.t - Public_key.Compressed.Stable.V1.t - Non_zero_curve_point.Compressed.Stable.V1.t - $.Compressed_poly.Stable.V1.t - Snark_params.Tick.Field.t = Crypto_params.Tick.Field.t = Pickles.Impls.Step.Internal_basic.Field.t = Zexe_backend.Pasta.Vesta_based_plonk.Field.t = Marlin_plonk_bindings.Pasta_fp.t - $.Token_id.Stable.V1.t - Mina_numbers.Nat.Make64.Stable.V1.t - Mina_numbers.Account_nonce.Stable.V1.t - Mina_numbers.Global_slot.Stable.V1.t - $.Signed_command_memo.Stable.V1.t - $.$.Body.Stable.V1.t - $.Payment_payload.t - Poly.Stable.V1.t - Currency.Amount.Stable.V1.t - % - $.Stake_delegation.t - % - $.New_token_payload.t - % - $.New_account_payload.t - % - $.Minting_payload.t - % - Public_key.Stable.V1.t - Non_zero_curve_point.Stable.V1.t - % - $.Signature.Stable.V1.t - Snark_params.Tick.Inner_curve.t = Pickles.Backend.Tick.Inner_curve.t = Zexe_backend.Pasta.Pallas.t = Curve.Make (Fp) (Fq) (Params) (Pasta_pallas) = Marlin_plonk_bindings.Pasta_pallas.t - % - $.Snapp_command.Stable.V1.t - $.$.Inner.Stable.V1.t - $.Other_fee_payer.Stable.V1.t - $.$.Payload.Stable.V1.t - % - % - $.$.Party.Authorized.Proved.Stable.V1.t - $.$.$.Poly.Stable.V1.t - $.$.$.Predicated.Proved.Stable.V1.t - $.$.$.Body.Stable.V1.t - Update.Stable.V1.t - Pickles.Backend.Tick.Field.Stable.V1.t % - Set_or_keep.Stable.V1.t - Pickles.Side_loaded.Verification_key.Stable.V1.t - $.Backend.Tock.Curve.Affine.t = Marlin_plonk_bindings.Pasta_pallas.t % - $.$.Vk.t - $.Impls.Wrap.Verification_key.t - % - $.$.$.Poly.Stable.V1.t - With_hash.Stable.V1.t - Mina_base.Permissions.Stable.V1.t - $.$.Auth_required.Stable.V1.t - $.$.Poly.Stable.V1.t Poly.Stable.V1.t - Sgn.Stable.V1.t - Signed_poly.Stable.V1.t - $.$.$.$.Poly.Stable.V1.t 
- % - $.Snapp_predicate.Stable.V1.t - Account.Stable.V1.t - Currency.Balance.Stable.V1.t - Numeric.Stable.V1.t - Closed_interval.Stable.V1.t - Or_ignore.Stable.V1.t - $.Receipt.Chain_hash.Stable.V1.t % - Eq_data.Stable.V1.t = Or_ignore.Stable.V1.t - Poly.Stable.V1.t - % - Protocol_state.Stable.V1.t - Frozen_ledger_hash.Stable.V1.t % - Epoch_ledger.Poly.Stable.V1.t - Epoch_seed.Stable.V1.t % - State_hash.Stable.V1.t % - Length.Stable.V1.t - Poly.Stable.V1.t - Other.Stable.V1.t - $.$.Poly.Stable.V1.t - % - $.$.$.$.Poly.Stable.V1.t - $.Control.Stable.V1.t - Pickles.Side_loaded.Proof.Stable.V1.t - Verification_key.Max_width.n (NAT MACHINERY) - $.Proof.t - Proof.With_data.t - Base.Me_only.Dlog_based.t - Tock.Inner_curve.Affine.t (to check again) - Marlin_plonk_bindings.Pasta_fq.t - Challenge.Constant.t - Scalar_challenge.t - Bulletproof_challenge.t - Wrap_bp_vec.t (NAT MACHINERY) - Challenges_vector.t % - Vector.t (NAT MACHINERY) - Dlog_based.Proof_state.Me_only.t - Step_bp_vec.t (NAT MACHINERY) - Base.Me_only.Pairing_based.t - % - $.$.Party.Authorized.Empty.Stable.V1.t - $.$.$.Predicated.Empty.Stable.V1.t - % - % - $.$.Party.Authorized.Signed.Stable.V1.t - $.$.$.Predicated.Signed.Stable.V1.t - % Network_pool.Snark_pool.Diff_versioned.Stable.V1.t - Transaction_snark_work.Statement.Stable.V1.t - Ledger_proof.Stable.V1.t - Transaction_snark.Stable.V1.t - Statement.With_sok.Stable.V1.t - Pending_coinbase_stack_state.Stable.V1.t - Pending_coinbase.Stack_versioned.Stable.V1.t - Coinbase_stack.Stable.V1.t - State_stack.Stable.V1.t - Stack_hash.Stable.V1.t - Poly.Stable.V1.t - Poly.Stable.V1.t - Fee_excess.Stable.V1.t - Poly.Stable.V1.t - % - Sok_message.Digest.Stable.V1.t - Poly.Stable.V1.t - % - Proof.Stable.V1.t - Pickles.Proof.Branching_2.Stable.V1.t ? 
- One_or_two.Stable.V1.t - Priced_proof.Stable.V1.t - Fee_with_prover.Stable.V1.t - % - Core.Time.Stable.With_utc_sexp.V2.t - Transaction_snark_work.Statement.Stable.V1.Table.t - % Mina_block.External_transition.Raw.Stable.V1.t - Protocol_state.Value.Stable.V1.t - Body.Value.Stable.V1.t - Blockchain_state.Value.Stable.V1.t - Staged_ledger_hash.Stable.V1.t - Non_snark.Stable.V1.t - Ledger_hash.Stable.V1.t - Aux_hash.Stable.V1.t - Pending_coinbase_aux.Stable.V1.t - Pending_coinbase.Hash_versioned.Stable.V1.t - Poly.Stable.V1.t - Block_time.Stable.V1.t - Poly.Stable.V1.t - Consensus.Data.Consensus_state.Value.Stable.V1.t - Length.Stable.V1.t - Vrf.Output.Truncated.Stable.V1.t - Global_slot.Stable.V1.t - Mina_numbers.Global_slot.Stable.V1.t - Epoch_data.Staking_value_versioned.Value.Stable.V1.t - Epoch_ledger.Value.Stable.V1.t - Frozen_ledger_hash0.Stable.V1.t - Poly.Stable.V1.t - Lock_checkpoint.Stable.V1.t - Poly.Stable.V1.t - Epoch_data.Next_value_versioned.Value.Stable.V1.t % - Poly.Stable.V1.t - Protocol_constants_checked.Value.Stable.V1.t - Poly.Stable.V1.t - % - Poly.Stable.V1.t - Poly.Stable.V1.t - Staged_ledger_diff.Stable.V1.t - Diff.t - Pre_diff_with_at_most_two_coinbase.t - Transaction_snark_work.t - % - With_status.t - Transaction_status.Stable.V1.t - Auxiliary_data.Stable.V1.t - Balance_data.Stable.V1.t - Failure.Stable.V1.t - Pre_diff_two.t - Coinbase.Fee_transfer.t - At_most_two.t - Transaction_status.Internal_command_balance_data.t - Coinbase_balance_data.Stable.V1.t - Fee_transfer_balance_data.Stable.V1.t - Pre_diff_with_at_most_one_coinbase.t - At_most_one.t - State_body_hash.Stable.V1.t - Protocol_version.Stable.V1.t - Validate_content.t

Develop


Network_pool.Transaction_pool.Diff_versioned.Stable.V2.t ?
Network_pool.Snark_pool.Diff_versioned.Stable.V2.t ?
Mina_block.External_transition.Raw.Stable.V2.t ?

Firobe commented 2 years ago

As discussed with Matthew, I'm restricting the scope of this issue to all types up to those defined in Mina_transaction. I'll open a new issue for the other ones.

Firobe commented 2 years ago

Continuation of this issue for gossip types in https://github.com/MinaProtocol/mina/issues/11771

MinaProtocol / mina

Extract over-the-wire types into a separate library #11101
