near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.31k stars 618 forks source link

Research improvements to the fix 3317 #3348

Closed MaksymZavershynskyi closed 4 years ago

MaksymZavershynskyi commented 4 years ago

https://github.com/nearprotocol/nearcore/pull/3317 Separates deserialization of network messages from each other. Previously deserialization of transactions would steal resources from deserialization of other important things. This PR has placed deserialization associated with each peer into a separate actor. However, it is a partial solution. Potentially, we might want to have a separate actor for each pair (peer, message type), or even introduce some notion of prioritization into our current concurrency code. CC @pmnoxx , @bowenwang1996

The scope of this issue is research of potential improvements. After research is done and we have decided on specific solutions, implementation is going to be a separate issue.

mfornet commented 4 years ago

Also regarding #3317 we need a fix for the method is_forward_tx. The way it is implemented right now is too hacky and require manual intervention for some updates to the PeerMessage layout (for example the change introduced in #3088).

It also makes much more complex handling several versions at the same time during the upgradability process, to maintain backward compatibility between nodes with consecutive PROTOCOL_VERSIONS.

bowenwang1996 commented 4 years ago

@mfornet Do you have any suggestions here regarding is_forward_tx?

mfornet commented 4 years ago

@mfornet Do you have any suggestions here regarding is_forward_tx?

As suggested by @birchmd I think we should have this from the borsh side. It is not completely clear, but we can have some object that gives us a lazy view over a raw borsh serialization of an object. It should allow to access (deserialize) some particular fields (maybe deeply inside structs/enums) without needing to deserialize the full object, at most the prefix that contains the target field. If implemented correctly we can have a hint function (sort of unsafe) for those objects that we already know the size, and we can fully skip this fields. Ideally the operations performed by this lazy objects are the same performed today by is_forward_tx so we don't sacrifice too much performance. I've thought this for a while today and the trickiest part are enums (since each variant can be empty/tuple/anonymous struct) and we should have consistent getters for each type.

To give an example about how is_forward_tx would look like with this approach:

pub fn is_forward_tx(buffer: &mut &[u8]) -> Option<bool> {
      let lazy_peer_msg = LazyPeerMessage(buffer);
      lazy_peer_msg.variant_routed() // Option<LazyPeerMessageRoutedVariant>
               .and_then(|routed_variant| routed_variant.inner()) // Option<LazyRoutedMessage>
               .and_then(|routed_message| routed_message.body()) // Option<LazyRoutedMessageBody>
               .and_then(|routed_message_body| routed_variant.value()) // Option<LazyRoutedMessageBodyVariant>
         == Some(LazyRoutedMessageBodyVariant::ForwardTx)
}

This functions in the inside will only move the pointer of buffer forward while they are deserializing all the prefix to reach the destination. If some of they failed, either because we used a variant that is not the one serialized, or because the buffer is corrupt any of this function will return None (It can be Result too). All of this functions can be generated automatically using rust macros and they all can leverage hints whenever necessary to move faster through the data towards the goal.

I think this approach will get some time for full design and implementation, but on the good side it will give us some flexibility to apply heuristics/strategies that inspects messages before deserializing and gather relevant metadata when they are too frequent and too long without introducing any hacks that are hard to maintain.

birchmd commented 4 years ago

Closing this issue since the "research" is considered concluded for now. Another ticket will be created for implementing the lazy deserialization feature in Borsh.