Closed by akoptelov 1 year ago
There are some crates available that take care of bin_prot encoding: binprot, and a serde-based variant of it.
Initially we can use one of these.
It is possible to get sexp-based shapes for all versioned types used by Mina, by using the following command:
mina.exe internal dump-type-shapes
Here is the output of that command, and this one is a pretty-printed version of the Block type shape.
https://github.com/name-placeholder/mina-p2p-messages-rs/blob/main/block-shape-formatted.sexp
ChainSafe uses this JSON file as a schema for the block (or transition) message, but it might be out of date with the actual data structure.
I'm in the process of generating code based on the JSON above, but it looks like I need to use the sexp generated from up-to-date Mina sources instead. @tizoc do you know by any chance how hard it would be to modify that command so that the sexp output includes more information, like the type's module name and recursion markers?
I think it should be possible to use the current structure of those shapes after all. The idea is this: each type has its "whole" shape, expanded down to primitive types (or ones without a shape), but we need to detect which versioned type is used, e.g. as a record field type. We can do that based on the part of the shape representing that field type, by searching for a type with exactly the same shape.
Consider the following shapes:
src/lib/transaction_snark/transaction_snark.ml:Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t, 01b6150b8dbee028561cd4e372263a3f, (Exp
(Record
((source
(Exp
(Record
((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
(state
(Exp
(Record
((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
(curr (Exp (Base kimchi_backend_bigint_32_V1 ())))))))))))
(target
(Exp
(Record
((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
(state
(Exp
(Record
((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
(curr (Exp (Base kimchi_backend_bigint_32_V1 ()))))))))))))))
...
src/lib/mina_base/pending_coinbase.ml:Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t, 4c1a055e7620944ec41531887dfe7d6f, (Exp
(Record
((data (Exp (Base kimchi_backend_bigint_32_V1 ())))
(state
(Exp
(Record
((init (Exp (Base kimchi_backend_bigint_32_V1 ())))
(curr (Exp (Base kimchi_backend_bigint_32_V1 ()))))))))))
Here we can detect that both fields of Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t have the type Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t.
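The matching described above can be sketched roughly as follows. This is only an illustration under assumptions: the `Shape` enum is a simplified, hypothetical model of the sexp (only `Base` and `Record`, plus a `Named` marker for a detected type), and `resolve`, `resolve_children` and the helper constructors are names made up for this example, not part of any existing tool.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical model of a type shape (the real sexp has more cases).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Shape {
    /// Primitive type, e.g. `kimchi_backend_bigint_32_V1`.
    Base(String),
    /// Record with named fields.
    Record(Vec<(String, Shape)>),
    /// Reference to a known versioned type, produced by `resolve`.
    Named(String),
}

/// Replaces a shape with a `Named` reference if it is identical to the whole
/// shape of a known versioned type, otherwise descends into its children.
pub fn resolve(shape: &Shape, known: &HashMap<Shape, String>) -> Shape {
    match known.get(shape) {
        Some(name) => Shape::Named(name.clone()),
        None => resolve_children(shape, known),
    }
}

/// Resolves the fields of a type without collapsing the type itself.
pub fn resolve_children(shape: &Shape, known: &HashMap<Shape, String>) -> Shape {
    match shape {
        Shape::Record(fields) => Shape::Record(
            fields
                .iter()
                .map(|(name, field)| (name.clone(), resolve(field, known)))
                .collect(),
        ),
        other => other.clone(),
    }
}

fn bigint() -> Shape {
    Shape::Base("kimchi_backend_bigint_32_V1".to_string())
}

/// Shape of `Mina_base__Pending_coinbase.Stack_versioned.Stable.V1.t` from the dump above.
pub fn stack_shape() -> Shape {
    Shape::Record(vec![
        ("data".to_string(), bigint()),
        (
            "state".to_string(),
            Shape::Record(vec![
                ("init".to_string(), bigint()),
                ("curr".to_string(), bigint()),
            ]),
        ),
    ])
}

/// Shape of `Transaction_snark.Pending_coinbase_stack_state.Stable.V1.t` from the dump above.
pub fn pending_coinbase_stack_state_shape() -> Shape {
    Shape::Record(vec![
        ("source".to_string(), stack_shape()),
        ("target".to_string(), stack_shape()),
    ])
}
```

With a map containing only the shape of the stack type, `resolve_children(&pending_coinbase_stack_state_shape(), &known)` yields a record whose `source` and `target` fields are both `Named` references to that type. One limitation of this approach: two different versioned types with identical shapes cannot be told apart, and a plain map keeps only one of them.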
If needed, some extra processing can be added here to compare the shapes at each level and detect whether some child is an already-seen shape: https://github.com/MinaProtocol/mina/blob/1765ba6bdfd7c454e5ae836c49979fa076de1bea/src/app/cli/src/cli_entrypoint/mina_cli_entrypoint.ml#L1404
But in the end it is not very different from doing the same detection on the rendered result. At this point there is really no extra information that you don't already have in the rendered version.
Implementing RPC decoding.
RPC get_epoch_ledger is implemented. Working towards other RPCs and a kind of registry for the defined methods.
The list of V1 RPCs:
get_some_initial_peers
get_staged_ledger_aux_and_pending_coinbases_at_hash
answer_sync_ledger_query
get_transition_chain
get_transition_chain_proof
Get_transition_knowledge
get_ancestry
ban_notify
get_best_tip
get_node_status
get_epoch_ledger
V2 Gossip messages are done. Now working on RPCs for V2.
For RPCs we can't have get_transition_chain_proof for Mina V1 and V2 at the same time. The name and the version are the same, while the bin_prot encoding is different.
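To illustrate why the two variants cannot coexist, here is a minimal sketch of a registry keyed by RPC name and version. Everything in it is an assumption made for the example: the `RpcRegistry` type, the `Decoder` signature, the two placeholder decoders, and the version number used in the usage note are hypothetical and do not reflect the actual library API.

```rust
use std::collections::HashMap;

/// Hypothetical decoder signature: raw payload bytes in, a description out.
pub type Decoder = fn(&[u8]) -> Result<String, String>;

/// Registry of RPC decoders keyed by (method name, method version).
#[derive(Default)]
pub struct RpcRegistry {
    decoders: HashMap<(String, u32), Decoder>,
}

impl RpcRegistry {
    /// Registers a decoder; fails if the (name, version) pair is already taken.
    pub fn register(&mut self, name: &str, version: u32, decoder: Decoder) -> Result<(), String> {
        let key = (name.to_string(), version);
        if self.decoders.contains_key(&key) {
            return Err(format!("{} v{} is already registered", name, version));
        }
        self.decoders.insert(key, decoder);
        Ok(())
    }

    /// Looks up the decoder for (name, version) and applies it to the payload.
    pub fn decode(&self, name: &str, version: u32, payload: &[u8]) -> Result<String, String> {
        match self.decoders.get(&(name.to_string(), version)) {
            Some(decoder) => decoder(payload),
            None => Err(format!("unknown RPC {} v{}", name, version)),
        }
    }
}

/// Placeholder standing in for a decoder of the Mina V1 wire layout.
pub fn decode_mina_v1_layout(payload: &[u8]) -> Result<String, String> {
    Ok(format!("Mina V1 layout, {} bytes", payload.len()))
}

/// Placeholder standing in for a decoder of the Mina V2 wire layout.
pub fn decode_mina_v2_layout(payload: &[u8]) -> Result<String, String> {
    Ok(format!("Mina V2 layout, {} bytes", payload.len()))
}
```

Registering `get_transition_chain_proof` with the same (illustrative) version number for both layouts fails on the second call, because the key alone does not say which encoding to expect. Some extra discriminator, such as a separate registry per network version, would be needed.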
Adding benchmarking, for both native and wasm32.
Implementing memory consumption benchmarks.
Performance test: decoding of input messages (incoming RPCs captured during the catch-up process in berkeleynet, ~80M), repeated 10 times.
Native:
mean: 0.26560021159999997, stdev: 0.0003018581737673776
Wasm32 (Firefox):
mean: 0.457054, stdev: 0.030821437443052
Wasm32 (Chrome):
mean: 0.4464769999, stdev: 0.055822759584718405
Wasm32 (Node):
mean: 0.4218, stdev: 0.011541959999999999
The summary is that there is a slight slowdown when running as Wasm32, but it is less than 2x.
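For reference, the slowdown factors can be computed directly from the means above. The snippet below is just that arithmetic. The constant and function names are made up for this example, and the values are copied from the results as reported.

```rust
/// Mean decoding time for the native build, as reported above.
pub const NATIVE_MEAN: f64 = 0.26560021159999997;

/// Mean decoding times for the Wasm32 runs, as reported above.
pub const WASM_MEANS: [(&str, f64); 3] = [
    ("Firefox", 0.457054),
    ("Chrome", 0.4464769999),
    ("Node", 0.4218),
];

/// Returns the slowdown of each Wasm32 run relative to the native run.
pub fn slowdowns() -> Vec<(&'static str, f64)> {
    WASM_MEANS
        .iter()
        .map(|&(name, mean)| (name, mean / NATIVE_MEAN))
        .collect()
}
```

This gives roughly 1.72 for Firefox, 1.68 for Chrome and 1.59 for Node.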
Decoding of input messages, incoming RPCs captured during catch-up process in berkeleynet (~80M).
Native:
Ratio: 1.5793496061764019
Encoded size (N): 64115
Currently allocated (B): 0
Maximum allocated (B): 101260
Total amount of claimed memory (B): 101260
Total number of allocations: (N): 2376
Reallocations (N): 0
Wasm32 (Firefox):
Decoded bytes: 89779897
Ratio: 1.3257730848142988
Total Allocated (Bytes): 208338833
Currently Allocated (Bytes): 1024
Allocation Peak (Bytes): 119027771
Number of reallocations (N): 24
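Counters like the ones above can be collected with a wrapper around the global allocator. The following is a minimal sketch of that technique using only the standard library. It is not necessarily how the numbers above were produced, and the `Counting` allocator, the `Stats` struct and `snapshot` are names invented for this example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

static CURRENT: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);
static TOTAL: AtomicUsize = AtomicUsize::new(0);
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);
static REALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Allocator that forwards to the system allocator and updates the counters.
pub struct Counting;

/// Records `size` newly claimed bytes and updates the peak.
fn claim(size: usize) {
    TOTAL.fetch_add(size, Relaxed);
    let now = CURRENT.fetch_add(size, Relaxed) + size;
    PEAK.fetch_max(now, Relaxed);
}

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATIONS.fetch_add(1, Relaxed);
            claim(layout.size());
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        CURRENT.fetch_sub(layout.size(), Relaxed);
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let new_ptr = unsafe { System.realloc(ptr, layout, new_size) };
        if !new_ptr.is_null() {
            // Accounted as releasing the old block and claiming the new one.
            REALLOCATIONS.fetch_add(1, Relaxed);
            CURRENT.fetch_sub(layout.size(), Relaxed);
            claim(new_size);
        }
        new_ptr
    }
}

#[global_allocator]
static GLOBAL: Counting = Counting;

/// Point-in-time copy of the counters.
#[derive(Debug, Clone, Copy)]
pub struct Stats {
    pub current: usize,
    pub peak: usize,
    pub total_claimed: usize,
    pub allocations: usize,
    pub reallocations: usize,
}

pub fn snapshot() -> Stats {
    Stats {
        current: CURRENT.load(Relaxed),
        peak: PEAK.load(Relaxed),
        total_claimed: TOTAL.load(Relaxed),
        allocations: ALLOCATIONS.load(Relaxed),
        reallocations: REALLOCATIONS.load(Relaxed),
    }
}
```

Taking a `snapshot()` before and after decoding a message and dividing the claimed bytes by the encoded size gives a ratio comparable to the ones reported here. Note that the counters are process-wide, so unrelated allocations are included as well.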
Still, the in-memory representation is 1.3-1.5 times bigger than the encoded size. That is mostly because of enums with variants of different sizes: every value (even of an empty variant) requires enough memory for the biggest variant.
BTW, before this fix the memory consumption ratio was ~8 (the runtime representation was >8 times bigger than the encoded size), and the wasm32 tests were 50% slower.
@akoptelov how did that change help? unless there is something I am missing, that code still allocates the same memory as before, plus space for the pointer to the bytes (which were inlined before), right?
@tizoc For simplicity, imagine this:
struct BigInt([u8; 32]);
enum Option {
    None,
    Some(BigInt),
}
Each instance of this Option takes 33 bytes (32 bytes for the payload plus a tag), while in encoded form None is only 1 byte. So for this case we get a roughly 32x increase. Now, if it were like this,
struct BigInt(Box<[u8; 32]>);
enum Option {
    None,
    Some(BigInt),
}
the Option is only pointer-sized (8 bytes on a 64-bit target), so decoding Option::None gives an 8x increase instead.
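The sizes can be checked with `std::mem::size_of`. A small sketch follows. The type names are changed here only so that both variants fit in one file, and since the layout of default-repr enums is not formally guaranteed by Rust, the concrete numbers should be read as what current compilers produce.

```rust
use std::mem::size_of;

/// Payload stored inline: every value of the enum reserves room for it.
#[allow(dead_code)]
pub struct InlineBigInt([u8; 32]);

#[allow(dead_code)]
pub enum InlineOption {
    None,
    Some(InlineBigInt),
}

/// Payload stored on the heap: the enum only holds a pointer.
#[allow(dead_code)]
pub struct BoxedBigInt(Box<[u8; 32]>);

#[allow(dead_code)]
pub enum BoxedOption {
    None,
    Some(BoxedBigInt),
}

/// Returns (size of the inline variant, size of the boxed variant) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<InlineOption>(), size_of::<BoxedOption>())
}
```

On a 64-bit target, `sizes()` is expected to return 33 for the inline version and 8 for the boxed one. The boxed enum needs no separate tag because a `Box` is never null, so the compiler can use the null pointer to represent `None`. The cost is an extra heap allocation for every `Some` value.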
@akoptelov ahh I see. Thanks.
V2 wire types are generated without parameters now. Working on boxing enum variants if needed.
Heap allocations while decoding a GetStagedLedgerAuxAndPendingCoinbasesAtHashV2 RPC response.
Before boxing alts:
Ratio: 10.53691387685571
Encoded size (N): 19003707
Currently allocated (B): 199575280
Maximum allocated (B): 199575280
Total amount of claimed memory (B): 200240424
Total number of allocations: (N): 831141
Reallocations (N): 2897
With boxed alts:
Ratio: 2.063837071367181
Encoded size (N): 19003707
Currently allocated (B): 38649787
Maximum allocated (B): 38649787
Total amount of claimed memory (B): 39220555
Total number of allocations: (N): 840462
Reallocations (N): 2895
In a nutshell, the peak (maximum allocated) size went from almost 200M down to about 39M, which is only about 2x the encoded size.
With preallocated vectors in binprot-rs:
Ratio: 1.8691436886497987
Encoded size (N): 19003707
Currently allocated (B): 35513747
Maximum allocated (B): 35513747
Total amount of claimed memory (B): 35520659
Total number of allocations: (N): 837591
Reallocations (N): 24
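The drop in reallocations comes from reserving the vector's capacity once the element count is known. Below is a minimal sketch of the idea. It is not the binprot-rs code, and the wire format used here (a 4-byte little-endian count followed by 4-byte little-endian items) is a hypothetical simplification, since real bin_prot uses a different, variable-length size encoding.

```rust
use std::convert::TryInto;

/// Encodes a list in the hypothetical format: u32 LE count, then u32 LE items.
pub fn encode_list(items: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + items.len() * 4);
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    for item in items {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}

/// Decodes a list and reports how many times the vector's buffer had to grow.
/// Returns `None` on truncated input or an implausible declared length.
pub fn decode_list(input: &[u8], preallocate: bool) -> Option<(Vec<u32>, usize)> {
    let len = u32::from_le_bytes(input.get(..4)?.try_into().ok()?) as usize;
    let body = input.get(4..)?;
    // Never trust the declared length blindly: it cannot exceed what the
    // remaining input can hold, otherwise preallocation could be abused.
    if len > body.len() / 4 {
        return None;
    }
    let mut items = if preallocate {
        Vec::with_capacity(len)
    } else {
        Vec::new()
    };
    let mut growths = 0;
    let mut capacity = items.capacity();
    for chunk in body.chunks_exact(4).take(len) {
        items.push(u32::from_le_bytes(chunk.try_into().ok()?));
        if items.capacity() != capacity {
            growths += 1;
            capacity = items.capacity();
        }
    }
    Some((items, growths))
}
```

Decoding the same 1000-element input with `preallocate` set to `true` never grows the buffer, while the non-preallocating path grows it several times as the vector fills up. The length check matters for untrusted input: without it, a tiny message declaring a huge count would trigger a huge allocation.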
Moved out security/safety items; they can be tracked elsewhere.
The fully functional library for Mina types is developed.
Definition of "done"
A Rust library is developed that contains type definitions for all Mina gossip and RPC messages, with the possibility to encode/decode them using the bin_prot schema.