openmina / mina-p2p-messages-rs


As a Developer, I need to have a way to encode/decode Mina messages in Rust so I can implement an independent Mina peer #1

Closed akoptelov closed 1 year ago

akoptelov commented 2 years ago

Definition of "done"

A Rust library is developed that contains type definitions for all Mina gossip and RPC messages, with the ability to encode and decode them using the bin_prot schema.

akoptelov commented 2 years ago

There are some crates available that take care of bin prot encoding:

Initially we can use one of these.

akoptelov commented 2 years ago

It is possible to get sexp-based shapes for all versioned types used by Mina, by using the following command:

mina.exe internal dump-type-shapes

Here is the output of that command, and this one is a pretty-printed version of the Block type shape: https://github.com/name-placeholder/mina-p2p-messages-rs/blob/main/block-shape-formatted.sexp

akoptelov commented 2 years ago

ChainSafe uses this JSON file as a schema for the block (or transition) message, but it might be out of date with the actual data structure.

akoptelov commented 2 years ago

I'm in the process of generating code based on the JSON from above, but it looks like I need to use the sexp generated from up-to-date Mina sources instead. @tizoc do you know by any chance how hard it would be to modify that command so the sexp output includes more information, like the type's module name and recursion markers?

akoptelov commented 2 years ago

I think it should be possible to use the current structure of those shapes after all. The idea is this: each type has the "whole" shape, expanded down to primitive types (or ones without a shape), but we need to detect which versioned type is used, e.g. as a record field type. We can do that based on the part of the shape representing that field type, by searching for a type with exactly the same shape.

Consider the following shapes:

src/lib/transaction_snark/transaction_snark.ml:Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t, 01b6150b8dbee028561cd4e372263a3f, (Exp
 (Record
  ((source
    (Exp
     (Record
      ((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
       (state
        (Exp
         (Record
          ((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
           (curr (Exp (Base kimchi_backend_bigint_32_V1 ())))))))))))
   (target
    (Exp
     (Record
      ((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
       (state
        (Exp
         (Record
          ((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
           (curr (Exp (Base kimchi_backend_bigint_32_V1 ()))))))))))))))
...
src/lib/mina_base/pending_coinbase.ml:Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t, 4c1a055e7620944ec41531887dfe7d6f, (Exp
 (Record
  ((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
   (state
    (Exp
     (Record
      ((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
       (curr (Exp (Base kimchi_backend_bigint_32_V1 ()))))))))))

Here we can detect that both fields of Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t have the type Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t.
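The lookup described above can be sketched roughly as follows. This is only an illustration: the `Shape` enum is a simplified stand-in for the sexp shapes (just `Base` and `Record`), and `resolve_fields`, `known_types` and the helper constructors are hypothetical names, not part of the library.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a type shape; the real dump has more node kinds.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Shape {
    Base(String),
    Record(Vec<(String, Shape)>),
}

/// For every field of a record shape, look for a named type whose whole
/// shape is exactly equal to the field's shape.
fn resolve_fields<'a>(
    record: &Shape,
    known: &'a HashMap<Shape, String>,
) -> Vec<(String, Option<&'a str>)> {
    match record {
        Shape::Record(fields) => fields
            .iter()
            .map(|(name, shape)| (name.clone(), known.get(shape).map(String::as_str)))
            .collect(),
        Shape::Base(_) => Vec::new(),
    }
}

fn bigint() -> Shape {
    Shape::Base("kimchi_backend_bigint_32_V1".to_string())
}

/// Shape of Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t.
fn stack_versioned() -> Shape {
    let state = Shape::Record(vec![
        ("init".to_string(), bigint()),
        ("curr".to_string(), bigint()),
    ]);
    Shape::Record(vec![("data".to_string(), bigint()), ("state".to_string(), state)])
}

/// Shape of Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t.
fn pending_coinbase_stack_state() -> Shape {
    Shape::Record(vec![
        ("source".to_string(), stack_versioned()),
        ("target".to_string(), stack_versioned()),
    ])
}

/// Index of already-seen named types, keyed by their full shape.
fn known_types() -> HashMap<Shape, String> {
    let mut known = HashMap::new();
    known.insert(
        stack_versioned(),
        "Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t".to_string(),
    );
    known
}

fn main() {
    let known = known_types();
    for (field, ty) in resolve_fields(&pending_coinbase_stack_state(), &known) {
        println!("{field}: {}", ty.unwrap_or("<no named type found>"));
    }
}
```

One caveat of matching purely by shape: if two different versioned types happen to have identical shapes, the lookup cannot tell them apart, so a real generator would need a tie-breaking rule.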

tizoc commented 2 years ago

If needed some extra processing can be added here to compare the shapes at each level to detect if some child is an already-seen shape https://github.com/MinaProtocol/mina/blob/1765ba6bdfd7c454e5ae836c49979fa076de1bea/src/app/cli/src/cli_entrypoint/mina_cli_entrypoint.ml#L1404

But in the end it is not very different from doing the same detection on the rendered result; there is really no extra information at this point that you don't have in the rendered version.

akoptelov commented 2 years ago

Implementing RPC decoding.

akoptelov commented 2 years ago

RPC get_epoch_ledger is implemented. Working on the other RPCs and a kind of registry for the defined methods.

akoptelov commented 2 years ago

The list of V1 RPCs:

akoptelov commented 2 years ago

RPC doc: https://github.com/name-placeholder/mina-wiki/blob/master/p2p/mina-rpc.md

akoptelov commented 1 year ago

V2 Gossip messages are done. Now working on RPCs for V2.

akoptelov commented 1 year ago

For RPCs, we can't have get_transition_chain_proof for Mina V1 and V2 at the same time: the name and the version are the same, while the bin_prot encoding is different.
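To illustrate the clash, here is a minimal sketch of a registry keyed by (name, version). Everything in it is hypothetical (the `Registry` type, the `Decoder` signature, the two decoder functions, and the version value `1`, which is only a placeholder); it is not the registry used in the library.

```rust
use std::collections::HashMap;

/// Hypothetical decoder signature: raw bin_prot bytes in, a description out.
type Decoder = fn(&[u8]) -> Result<String, String>;

/// Hypothetical registry that identifies an RPC by its name and version only.
#[derive(Default)]
struct Registry {
    methods: HashMap<(&'static str, u32), Decoder>,
}

impl Registry {
    /// Registers a decoder; fails if the (name, version) key is already taken.
    fn register(
        &mut self,
        name: &'static str,
        version: u32,
        decoder: Decoder,
    ) -> Result<(), String> {
        if self.methods.contains_key(&(name, version)) {
            return Err(format!("{name} v{version} is already registered"));
        }
        self.methods.insert((name, version), decoder);
        Ok(())
    }
}

// Placeholder decoders standing in for the two incompatible encodings.
fn decode_v1_network(_bytes: &[u8]) -> Result<String, String> {
    Ok("Mina V1 layout".to_string())
}

fn decode_v2_network(_bytes: &[u8]) -> Result<String, String> {
    Ok("Mina V2 layout".to_string())
}

fn main() {
    let mut registry = Registry::default();
    // Placeholder version number: the point is only that both networks use the same one.
    let version = 1;
    println!("{:?}", registry.register("get_transition_chain_proof", version, decode_v1_network));
    println!("{:?}", registry.register("get_transition_chain_proof", version, decode_v2_network));
}
```

The second registration fails because the key carries nothing that distinguishes the two encodings; some extra discriminator (for example, which network the peer belongs to) would have to be part of the key.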

https://github.com/MinaProtocol/mina/discussions/11860

akoptelov commented 1 year ago

Adding benchmarking for both native and wasm32.

akoptelov commented 1 year ago

Implementing memory consumption benchmarks.

akoptelov commented 1 year ago

Performance Test

Decoding of input messages, incoming RPCs captured during catch-up process in berkeleynet (~80M), repeated 10 times.

Native:

mean: 0.26560021159999997, stdev: 0.0003018581737673776 

Wasm32 (Firefox):

mean: 0.457054, stdev: 0.030821437443052

Wasm32 (Chrome):

mean: 0.4464769999, stdev: 0.055822759584718405

Wasm32 (Node):

mean: 0.4218, stdev: 0.011541959999999999

In summary, there is a slowdown when running as Wasm32, but it is less than 2x.
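The "less than 2x" figure can be checked by dividing each Wasm32 mean by the native mean. A small sketch using the numbers reported above (the function and constant names are made up for this example):

```rust
/// Native mean from the performance test above.
const NATIVE_MEAN: f64 = 0.26560021159999997;

/// Wasm32 means from the performance test above.
fn wasm_means() -> [(&'static str, f64); 3] {
    [
        ("Firefox", 0.457054),
        ("Chrome", 0.4464769999),
        ("Node", 0.4218),
    ]
}

/// Slowdown factor of a Wasm32 run relative to the native run.
fn slowdown(native_mean: f64, wasm_mean: f64) -> f64 {
    wasm_mean / native_mean
}

fn main() {
    for (name, mean) in wasm_means() {
        println!("{name}: x{:.2}", slowdown(NATIVE_MEAN, mean));
    }
}
```

All three ratios come out at roughly 1.6x to 1.7x.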

akoptelov commented 1 year ago

Memory Consumption Test

Decoding of input messages, incoming RPCs captured during catch-up process in berkeleynet (~80M).

Native:

Ratio: 1.5793496061764019
Encoded size (N): 64115
Currently allocated (B): 0
Maximum allocated (B): 101260
Total amount of claimed memory (B): 101260
Total number of allocations: (N): 2376
Reallocations (N): 0

Wasm32 (Firefox):

Decoded bytes: 89779897
Ratio: 1.3257730848142988
Total Allocated (Bytes): 208338833
Currently Allocated (Bytes): 1024
Allocation Peak (Bytes): 119027771
Number of reallocations (N): 24

Still, the in-memory representation is 1.3-1.6 times bigger than the encoded size. That is mostly because of enums with variants of different sizes: every value of an enum (even one holding an empty variant) takes as much memory as its biggest variant requires.

akoptelov commented 1 year ago

BTW, before this fix the memory consumption ratio was ~8 (the runtime representation was >8 times bigger than the encoded size), and the wasm32 tests were 50% slower.

tizoc commented 1 year ago

@akoptelov how did that change help? unless there is something I am missing, that code still allocates the same memory as before, plus space for the pointer to the bytes (which were inlined before), right?

akoptelov commented 1 year ago

@tizoc For simplicity, imagine this:

struct BigInt([u8;32]);

enum Option {
    None,
    Some(BigInt),
}

Each instance of the Option will be 33 bytes (32 bytes of payload plus a 1-byte tag), while in encoded form None will be only 1 byte. So for this case we will have roughly a x32 increase. Now, if it were like this,

struct BigInt(Box<[u8;32]>);

enum Option {
    None,
    Some(BigInt),
}

the Option will be only 8 bytes long (one pointer on a 64-bit target), so it will be an 8x increase for the Option::None decoding.
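The sizes can be checked directly with `std::mem::size_of`. In this sketch the types are renamed (`InlineOption`, `BoxedOption`, etc.) only so that both variants can live in one file:

```rust
use std::mem::size_of;

// Payload stored inline: every enum value reserves room for the 32 bytes.
#[allow(dead_code)]
struct InlineBigInt([u8; 32]);

#[allow(dead_code)]
enum InlineOption {
    None,
    Some(InlineBigInt),
}

// Payload stored on the heap: the enum only holds a pointer.
#[allow(dead_code)]
struct BoxedBigInt(Box<[u8; 32]>);

#[allow(dead_code)]
enum BoxedOption {
    None,
    Some(BoxedBigInt),
}

fn main() {
    // 32 bytes of payload plus a 1-byte tag.
    println!("inline: {} bytes", size_of::<InlineOption>());
    // One pointer; the never-null Box lets the compiler encode None without a separate tag.
    println!("boxed:  {} bytes", size_of::<BoxedOption>());
}
```

On a 64-bit target with current compilers the boxed variant is 8 bytes; on wasm32 it is 4 bytes, since pointers are 32-bit there. The trade-off is one extra heap allocation per `Some` value.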

tizoc commented 1 year ago

@akoptelov ahh I see. Thanks.

akoptelov commented 1 year ago

#18 is to make the run-time size closer to the encoded one.

#19 is to make it simpler.

akoptelov commented 1 year ago

An update: https://github.com/name-placeholder/mina-p2p-messages-rs/issues/19#issuecomment-1274374160

akoptelov commented 1 year ago

V2 wire types are generated without parameters now. Working on boxing enum variants if needed.

akoptelov commented 1 year ago

Here's the diff with boxed types: https://github.com/name-placeholder/mina-p2p-messages-rs/commit/6170d505c56281333b15402af463d31dc23d1113#diff-ef1be191565e6cf1df478c38eb9981438a2747fcf31c3d902a196ddccb13b853

akoptelov commented 1 year ago

Heap allocations while decoding a GetStagedLedgerAuxAndPendingCoinbasesAtHashV2 RPC response.

Before boxing alts:

Ratio: 10.53691387685571
Encoded size (N): 19003707
Currently allocated (B): 199575280
Maximum allocated (B): 199575280
Total amount of claimed memory (B): 200240424
Total number of allocations: (N): 831141
Reallocations (N): 2897

With boxed alts:

Ratio: 2.063837071367181
Encoded size (N): 19003707
Currently allocated (B): 38649787
Maximum allocated (B): 38649787
Total amount of claimed memory (B): 39220555
Total number of allocations: (N): 840462
Reallocations (N): 2895

In a nutshell, the peak (maximum allocated) size went from almost 200M down to 39M, which is only about 2x the encoded size.

With preallocated vectors in binprot-rs:

Ratio: 1.8691436886497987
Encoded size (N): 19003707
Currently allocated (B): 35513747
Maximum allocated (B): 35513747
Total amount of claimed memory (B): 35520659
Total number of allocations: (N): 837591
Reallocations (N): 24
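
The drop in reallocations is what one would expect from preallocation in general: bin_prot writes the length of a list before its elements, so a decoder can reserve the whole buffer up front instead of growing it on demand. The sketch below is a generic illustration of that effect on a plain `Vec`, not the actual binprot-rs change; `count_capacity_changes` is a made-up helper.

```rust
/// Pushes `n` items into `v` and counts how many times its capacity changed,
/// i.e. how many times the buffer had to be allocated or reallocated.
fn count_capacity_changes(mut v: Vec<u64>, n: usize) -> usize {
    let mut changes = 0;
    let mut cap = v.capacity();
    for i in 0..n {
        v.push(i as u64);
        if v.capacity() != cap {
            changes += 1;
            cap = v.capacity();
        }
    }
    changes
}

fn main() {
    let n = 100_000;
    // Growing on demand: the buffer is reallocated repeatedly as it fills up.
    println!("grow on demand: {}", count_capacity_changes(Vec::new(), n));
    // Length known in advance: one allocation up front, none while pushing.
    println!("preallocated:   {}", count_capacity_changes(Vec::with_capacity(n), n));
}
```

A decoder that trusts the length prefix of untrusted input should cap the reserved capacity, otherwise a malicious length can force a huge allocation.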

akoptelov commented 1 year ago

Moved out the security/safety items; they can be tracked elsewhere.

akoptelov commented 1 year ago

A fully functional library for Mina types has been developed.