massalabs / massa

The Decentralized and Scaled Blockchain
https://massa.net

Bootstrap mocked-and-sync follow-up #3842

Closed Ben-PH closed 6 months ago

Ben-PH commented 1 year ago

Followup to #3745

Ben-PH commented 1 year ago

@dr-chain I've changed the code to peek the stream so that the fixed-length portion is read first; that portion contains the length value of the variable-length part that follows. I've set up this code:

        // this code is part of the bootstrap process, where the client/server is processing what it expects to be
        // the next message to be received. We use TcpStream. Will link to code-base at bottom
        let peek_len = HASH_SIZE_BYTES + self.size_field_len;
        let mut peek_buf = vec![0; peek_len];
        while self.duplex.peek(&mut peek_buf)? < peek_len {
            // TODO: backoff spin of some sort
        }

...note: at the time of writing the TODO, I hadn't considered that this can't actually be a spin-lock, because peek is a syscall and context-switches anyway. My real concern is the management of partial peeks. In a perfect world, it would either block or get everything needed, no problem. Currently this "shouldn't" be an issue, but bugs, pathological edge cases, and attackers exist...

Would be interested in your input on how to handle this. What would be an appropriate approach in your mind?

https://github.com/massalabs/massa/blob/testnet_22/massa-bootstrap/src/client_binder.rs#L74-L86

https://github.com/massalabs/massa/blob/testnet_22/massa-bootstrap/src/server_binder.rs#L221-L231
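
To make the "backoff spin" idea in the TODO concrete, here is a hypothetical sketch (the helper name, backoff values, and deadline handling are all illustrative, not the code base's API):

```rust
use std::io;
use std::net::TcpStream;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical helper: retry the peek with capped exponential backoff until
// at least `want` bytes are visible or the deadline passes. Since peek does
// not consume bytes, the socket buffer is untouched on failure.
fn peek_at_least(
    stream: &TcpStream,
    buf: &mut [u8],
    want: usize,
    deadline: Duration,
) -> io::Result<usize> {
    let start = Instant::now();
    let mut backoff = Duration::from_millis(1);
    loop {
        // peek() fills buf from the socket buffer without consuming it
        let n = stream.peek(buf)?;
        if n >= want {
            return Ok(n);
        }
        if start.elapsed() >= deadline {
            return Err(io::Error::new(io::ErrorKind::TimedOut, "peek deadline passed"));
        }
        thread::sleep(backoff);
        backoff = (backoff * 2).min(Duration::from_millis(50));
    }
}
```

The sleep keeps the retry loop from hammering the syscall on a slow peer, while the deadline bounds how long a malicious or broken peer can stall the bootstrap.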

damip commented 1 year ago

Repeated peek is not always reliable. Good practice is to read the data normally (not peek). Sync streams like std::TcpStream have timeout capabilities built-in. The use cases of peek are actually very rare

dr-chain commented 1 year ago

@Ben-PH Can you please explain the motivations for using peek instead of read with timeout? I agree with @damip on this but maybe we are missing something?

Ben-PH commented 1 year ago

Can see some discussion on it already here: https://github.com/massalabs/massa/pull/3774

The general direction is that the bytes of the stream are consumed atomically*, such that the behavior is either "the sent message is consumed in its entirety, or not at all". Using peek allows the length of the message to be known before consumption begins.

*that's not actually the case at the moment: read_exact isn't implemented in such a way, it's really just a convenience wrapper around a read-loop.
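
For reference, std's default read_exact is essentially this convenience loop over read (paraphrased from the std source; note there is no atomicity — on error, an unknown number of bytes may already have been consumed):

```rust
use std::io::{self, ErrorKind, Read};

// Paraphrase of std's default read_exact: a plain read loop. On error,
// the number of bytes already consumed from the stream is unspecified.
fn read_exact_loop<R: Read>(reader: &mut R, mut buf: &mut [u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match reader.read(buf) {
            Ok(0) => break, // EOF before the buffer was filled
            Ok(n) => {
                let tmp = buf;
                buf = &mut tmp[n..];
            }
            Err(e) if e.kind() == ErrorKind::Interrupted => {} // retry on EINTR
            Err(e) => return Err(e),
        }
    }
    if !buf.is_empty() {
        Err(io::Error::new(ErrorKind::UnexpectedEof, "failed to fill whole buffer"))
    } else {
        Ok(())
    }
}
```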

dr-chain commented 1 year ago

@damip Your inputs? @Ben-PH Don't we always pass the length first and then the message when reading buffers?

Ben-PH commented 1 year ago

Signature/hash and message length. These two make up the part of the message that has a known length. If we read without consuming (i.e. peek), we can learn the length of the message in its entirety before any consuming read happens.

dr-chain commented 1 year ago

@Ben-PH what is the benefit of it? If I read normally the message length, I already know how long the message is. I think I am missing something here.

Ben-PH commented 1 year ago

First, you need to read the bytes that give you the length of the message. They are not part of the actual message itself.

Before the peek: the number of bytes you need to consume could be anywhere between n and n + max msg bytes inclusive, with n being known at compile time.

After the peek: the number of bytes you need to consume is a known value between n and n + max msg bytes inclusive.
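
To make the before/after concrete, a sketch under assumed sizes (a 32-byte hash and a 4-byte big-endian length field; the real values come from HASH_SIZE_BYTES and size_field_len in the binders):

```rust
// Assumed framing: [hash (32 bytes)][msg len (4 bytes, BE)][msg body].
// Sizes here are illustrative stand-ins for the code base's constants.
const HASH_SIZE_BYTES: usize = 32;
const SIZE_FIELD_LEN: usize = 4;

// After peeking the fixed-size prefix, the exact number of bytes to
// consume for the whole message becomes a known value: n + body length,
// where n = HASH_SIZE_BYTES + SIZE_FIELD_LEN is known at compile time.
fn total_message_len(prefix: &[u8]) -> Option<usize> {
    let fixed = HASH_SIZE_BYTES + SIZE_FIELD_LEN;
    let len_bytes: [u8; SIZE_FIELD_LEN] = prefix.get(HASH_SIZE_BYTES..fixed)?.try_into().ok()?;
    Some(fixed + u32::from_be_bytes(len_bytes) as usize)
}
```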

damip commented 1 year ago

Given our framing, the length prefix is always of fixed size, so we know exactly what to read and we want to consume that prefix to then read the contents of the message. So using read_exact is the right thing to do.

self.duplex.set_read_timeout(duration)?;

let read_len: usize = HASH_SIZE_BYTES + self.size_field_len;

let mut buf = vec![0; read_len];
if let Err(err) = self.duplex.read_exact(&mut buf) {
    // error handling
}

// see how read_exact is implemented here: https://doc.rust-lang.org/stable/src/std/io/mod.rs.html#451-468

// here buf is filled and the first read_len bytes of the socket buffer are consumed

But the timeout is inaccurate. If one of the read operations in the loop of read_exact times out, then read_exact returns an error. But this does not guarantee that the whole read_exact takes the expected time: it can take much more if data arrives slowly but individual read() calls don't timeout.

Here is a custom implementation that is more accurate:

use std::io::{self, ErrorKind, Read};
use std::net::TcpStream;
use std::time::{Duration, Instant};

fn read_exact_timeout(stream: &mut TcpStream, mut buf: &mut [u8], timeout: Duration) -> io::Result<()> {
    let start_time = Instant::now();
    let saved_timeout = stream.read_timeout()?;
    let res = 'rb: {
        while !buf.is_empty() {
            // Duration::ZERO is rejected by set_read_timeout, so treat it as expired too
            let remaining_timeout = match timeout.checked_sub(start_time.elapsed()) {
                Some(remaining) if !remaining.is_zero() => remaining,
                _ => break 'rb Err(io::Error::new(ErrorKind::TimedOut, "exact read timed out")),
            };
            if let Err(err) = stream.set_read_timeout(Some(remaining_timeout)) {
                break 'rb Err(err);
            }
            match stream.read(buf) {
                Ok(0) => break,
                Ok(n) => {
                    let tmp = buf;
                    buf = &mut tmp[n..];
                }
                Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
                Err(e) => break 'rb Err(e),
            }
        }
        if !buf.is_empty() {
            break 'rb Err(io::Error::new(ErrorKind::UnexpectedEof, "failed to fill whole buffer"))
        } else {
            break 'rb Ok(())
        }
    };
    let reset_res = stream.set_read_timeout(saved_timeout);
    res.and(reset_res)
}

Ben-PH commented 1 year ago

I'm aware of the read_exact issue, and it's been on my mind to use a read loop similar to what you put up (my bad, it should've been in the overall tracking issue for a while now).

I posed the question here because it was about the specific TODO that formed part of this follow-up. I think if I get started on the read_exact issue, the ideal solution to this question will reveal itself.

Ben-PH commented 1 year ago

@damip #3885 might be of interest to you.

damip commented 1 year ago

> @damip #3885 might be of interest to you.

I still disagree with the need to use peek, we can discuss it

Ben-PH commented 1 year ago

Reverting the peek isn't problematic.

My thinking is along the lines that it would be good for there to be a tight coupling between the state of the data stream, and the representation of the connection. Without the peek, the data in the stream "changes from under your feet", so to speak. When we were dealing with tokio and cancellations, this could have been a maintenance pain-point. Less so the case now. The read-exact, though, is problematic for reasons you've outlined (and one or two other things).

The approach using peek is an attempt to ensure no changes are made to the data in the stream without knowing the consequences of the change. This gives a structure to the code that allows for us to get ahead of any potential issues.

If we follow the assumption that it's worth developing further, a better approach would probably be to consume the stream into the connection binding struct, and interpret that directly. Come to think of it, that might be good for performance, as we get to re-use memory allocations. Not much of a gain here, but for long-living connections, maybe.

In any case, the broader changes to the bootstrap code-base make the idea of all this less compelling, and I'd be happy to revert the use of peek.

dr-chain commented 1 year ago

@Ben-PH For example, if a given message changes while we read it without peek, processing of the changed buffer may fail in the respective module, which is OK, I think.

As @damip explained to me earlier, the use of peek can be justified in two cases: 1. when data needs some preprocessing, or 2. when we need to estimate something without actually processing it.

I agree with you regarding the broader changes to the bootstrap code-base and I think we can safely revert the use of peek :) Maybe @damip can add some points here if required.

Ben-PH commented 1 year ago

I've reverted the use of peek and cleaned up the read_exact process.

Ben-PH commented 1 year ago

@Eitu33 I've looked into (de)serializer for the send/receive message construction.

A big sticking point is that it's not just serialization happening: the process also involves a state change of the binding, as well as control flow based on the binding state (prev_message).

It could be made to work, but it feels a bit like forcing the serializer as a tool into a use case it's not quite right for.

to illustrate:

pub struct BootstrapServerCompleteMessageSerializer {
    msg_serializer: BootstrapServerMessageSerializer,
    local_keypair: KeyPair,
}
pub struct BootstrapServerCompleteMessageBuilder {
    msg: BootstrapServerMessage,
    prev_msg_hash: Option<massa_bootstrap::Hash>,
}

impl Serializer<BootstrapServerCompleteMessageBuilder>
    for BootstrapServerCompleteMessageSerializer
{
    fn serialize(
        &self,
        value: &BootstrapServerCompleteMessageBuilder,
        buffer: &mut Vec<u8>,
    ) -> Result<(), SerializeError> {
        // ideally this would go directly into the passed-in vec after populating it
        // with the signature and length, but we don't have a way to use the
        // msg-body serializer in such a way
        let mut msg_bytes = Vec::new();
        self.msg_serializer.serialize(&value.msg, &mut msg_bytes)?;
        let msg_len: u32 = msg_bytes.len().try_into().map_err(|e| {
            SerializeError::GeneralError(format!("bootstrap message too large to encode: {}", e))
        })?;

        let sig = if let Some(prev_msg_hash) = &value.prev_msg_hash {
            // there was a previous message: sign(prev_msg_hash + msg)
            let mut signed_data = vec![];
            signed_data.extend(prev_msg_hash.to_bytes());
            signed_data.extend(&msg_bytes);
            self.local_keypair.sign(&Hash::compute_from(&signed_data))?
        } else {
            // there was no previous message: sign(msg)
            self.local_keypair.sign(&Hash::compute_from(&msg_bytes))?
        };
        buffer.extend_from_slice(sig.to_bytes().as_slice());
        buffer.extend_from_slice(&msg_len.to_be_bytes_min(MAX_BOOTSTRAP_MESSAGE_SIZE)?);

        buffer.extend_from_slice(msg_bytes.as_slice());
        Ok(())
    }
}

This is pretty close to a plain "move the code into a dedicated Serializer implementation", but now we need to update the server's previous-message state, and I don't see how that can be done in the serializer itself, or how to pass back data to run the update in a non-hacky way.

...we could implement the serializer on the connection binding itself, which could get things closer, but then we run into the problem of &self vs. &mut self.

I'm sure it could be done, but at this point, the time it would take, and the benefits to be had, don't seem to add up.

I could, of course, be missing something.

Do you have any thoughts on how to proceed?
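
For what it's worth, one shape the "pass back data" option could take is for the serializer to return the hash of what it wrote and let the binder (which already holds &mut self) advance prev_message itself. Everything below is a hypothetical sketch, not the real massa API: Hash, MessageSerializer, and ServerBinder are stand-in types, and compute_from is a placeholder digest.

```rust
// Stand-in for massa_hash::Hash; the digest below is a placeholder, NOT
// the real hash function.
struct Hash([u8; 32]);

impl Hash {
    fn compute_from(data: &[u8]) -> Self {
        let mut out = [0u8; 32];
        for (i, b) in data.iter().enumerate() {
            out[i % 32] ^= b;
        }
        Hash(out)
    }
}

struct MessageSerializer;

impl MessageSerializer {
    // serialize keeps &self, but hands the new prev-message hash back
    fn serialize_with_hash(&self, msg: &[u8], buffer: &mut Vec<u8>) -> Hash {
        buffer.extend_from_slice(msg);
        Hash::compute_from(msg)
    }
}

struct ServerBinder {
    prev_message: Option<Hash>,
    serializer: MessageSerializer,
}

impl ServerBinder {
    // the binder, which has &mut self anyway, runs the state update
    fn send(&mut self, msg: &[u8], buffer: &mut Vec<u8>) {
        let hash = self.serializer.serialize_with_hash(msg, buffer);
        self.prev_message = Some(hash);
    }
}
```

This keeps the Serializer-style trait immutable while the mutation stays where the mutable state already lives.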

Eitu33 commented 1 year ago

@damip @Ben-PH I believe putting the send/receive logic in the ser/deser became mandatory with the arrival of the first versioned objects. For context, we currently don't serialize a size but instead read data based on constants (see here for an example). This was not that big of an issue before, but what was previously a constant will now have a different value depending on the version of the signature we are reading (a signature in this case, but the same would apply to other objects if that ever happened). This leaves us no choice but to handle it through nom deserializers as soon as https://github.com/massalabs/massa/pull/3876 is merged, unless we want the bootstrap system running on constants that don't directly match our objects' serialized size.

There are two solutions here. The first is handling all the logic in a ser/deser system: it might take some time now but will be easier to maintain in the future. The second is a minimal solution where we only use the versioned signature ser/deser (already available) and leave the rest of the logic as it is, given that it's the only thing that will pose an issue after https://github.com/massalabs/massa/pull/3876. This one is mandatory but pretty easy to implement.
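
To illustrate why a constant no longer works, here is a hand-rolled sketch of version-dependent signature reading (the real code uses nom deserializers; the version byte and the lengths below are invented for illustration):

```rust
// Hypothetical: a leading version byte decides how many bytes the
// signature occupies, so the "constant" size is only known after
// reading the version. Lengths here are illustrative.
fn signature_len(version: u8) -> Option<usize> {
    match version {
        0 => Some(64), // e.g. an Ed25519-style signature
        1 => Some(96), // e.g. a hypothetical longer scheme
        _ => None,     // unknown version
    }
}

// Split the input into (version, signature bytes, rest), failing if the
// buffer is too short or the version is unknown.
fn read_versioned_signature(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let (&version, rest) = input.split_first()?;
    let len = signature_len(version)?;
    if rest.len() < len {
        return None;
    }
    Some((version, &rest[..len], &rest[len..]))
}
```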

dr-chain commented 1 year ago

@Eitu33 I want to add my 2 cents. If we are serializing everything in binary (which I believe we are), then we can use the most widely used method, i.e. TLV (see here and here). It is fast and efficient and supports encapsulation too. This way, a single global serializer and deserializer can be used for all massa node operations.

@Ben-PH regarding your query about passing previous states, that can be added based on the TAG value as some specific function responsible for pre/post-processing. Most binary communications, with cryptography or on not-so-powerful systems, use TLV. I do understand we do not encode lengths right now and instead depend on constants, but as we support more and more things, we will end up with a lot of constants.

@Eitu33 The versioning system you explained can also be easily implemented with TLV where your version byte(s) is/are basically a TAG which encodes the type of signature (for example) and its length too. In this case, the L will not be relevant but can be used to check the data veracity.

I also liked @damip's insistence on maintaining the modularity of the code. So, ideally, our global serializer/deserializer should just only do that and all the pre/post-processing required (like passing previous states or updating entries, etc.) should be handled separately.

I am not completely aware about the codebase yet but I just want to bring in some outsider's perspective here :)
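
For readers unfamiliar with the format, a minimal TLV roundtrip (a 1-byte tag and a 4-byte big-endian length are just one common choice of field sizes, not a prescription):

```rust
// Minimal TLV sketch: 1-byte tag, 4-byte big-endian length, then value.
fn tlv_encode(tag: u8, value: &[u8], out: &mut Vec<u8>) {
    out.push(tag);
    out.extend_from_slice(&(value.len() as u32).to_be_bytes());
    out.extend_from_slice(value);
}

// Decode one TLV record, returning (tag, value, remaining input), or
// None if the buffer is truncated.
fn tlv_decode(buf: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let tag = *buf.first()?;
    let len_bytes: [u8; 4] = buf.get(1..5)?.try_into().ok()?;
    let len = u32::from_be_bytes(len_bytes) as usize;
    let value = buf.get(5..5 + len)?;
    Some((tag, value, &buf[5 + len..]))
}
```

Because each record carries its own length, records can be concatenated and nested (encapsulation), and a reader can skip unknown tags without understanding their contents.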

Eitu33 commented 1 year ago

> @Eitu33 The versioning system you explained can also be easily implemented with TLV where your version byte(s) is/are basically a TAG which encodes the type of signature (for example) and its length too. In this case, the L will not be relevant but can be used to check the data veracity.

Yes this is exactly what is currently done in the versioned objects deserializers

AurelienFT commented 6 months ago

The tests of bootstrap have been totally revamped with the new unit test universes. Closing this.