tact1m4n3 / crsf-rs

A work in progress crsf protocol packet parser
MIT License
10 stars 7 forks source link

Discussion on various suggestions (RawPacket, validation, sync byte, Iterator, Payload trait) #27

Open peterkrull opened 3 months ago

peterkrull commented 3 months ago

Just dumping some ideas I have had in the past few days

RawPacket, owned vs ref

I think we can make everyone happy with regards to the RawPacket, by having the parser use a RawPacket as a buffer, which is then returned by reference from push_bytes. This can then be cloned if the user wants an owned instance. For simplicity I would just want it to look like:

pub struct RawPacket {
    buf: [u8; CRSF_MAX_LEN],
    len: usize,
}

I also discovered this pattern match which is pretty handy:

impl RawPacket {
    fn payload(&self) -> Result<&[u8], Error> {
        match (self.is_extended(), self.as_slice()) {
            // Skip the [sync], [len], [type], [src], [dst] and [crc] bytes
            (true, [_, _, _, _, _, payload @ .., _]) => Ok(payload),
            // Skip the [sync], [len], [type] and [crc] bytes
            (false, [_, _, _, payload @ .., _]) => Ok(payload),
            _ => Err(Error::BadBufferLen),
        }
    }
}

Always validating raw packets

I think during push_byte we should verify the sync byte (of course), length, type (*optionally?) and crc. This should ensure we do not return an invald packet. This would have little overhead, since we essentially just moved when the crc check is done. This assumes that even a crsf proxy/router would not want to pass along corrupted data.

(*) Type validation could be optional to allow for routing of unknown/custom packet types? (but then how would it know where to route it? idk..)

And since the RawPacket has already been validated, actually parsing is just a matter of extracting the payload slice (with the snippet above) and calling the appropriate parsing function for the type.

Choosing custom sync byte

It seems that changing to detecting some other (or even multiple others) sync byte(s) is quite niche. My idea is to allow chainging the behavior of the push_bytes method using a config through push_bytes_with_cfg, where push_bytes just calls with a sensible default configuration. This would allow changing the sync byte to some other value (and whether to validate the type byte?). This means that the parser is only ever capable of looking for one sync byte, but nothing is stopping the user from creating two (or more) parsers, each looking for their own sync byte if needed. On the plus side, this also makes it less likely for any single parser to "mistakenly" start parsing early and thus miss an actual sync byte.

Implementing Iterator

I think the nicest interface for extracting either a RawPacket or Packet from some buffer would be using iterators. This will take and consume + process the entire buffer, returning result as many times as needed with an owned RawPacket or Packet.

let mut bytes: [u8; 52] = [ // Two packets received at once
    200, 24, 22, 215, 3, 159, 43, 192, 247, 139, 95, 252, 226, 23, 229, 43, 95, 249, 202, 7, 0, 0, 76, 124, 226, 178, 200, 24, 22, 215, 3, 159, 43, 192, 247, 139, 95, 252, 226, 23, 229, 43, 95, 249, 202, 7, 0, 0, 76, 124, 226, 178,
];

let mut parser = PacketReader::new();

for result in parser.iter_raw_packets(buffer.as_slice()) {
    // Where result: Result<RawPacket, CrsfError>
}

for result in parser.iter_packets(buffer.as_slice()) {
    // Where result: Result<Packet, CrsfError>
}

And push_bytes will still be available for those who want a &RawPacket, or just handle things more manually.

More useful pub trait Payload

I think if we made the Payload trait public, and added to its functionality, it could be alot more useful. It would allow users to obtain a raw packet from a manually constructed packet, or encode into their own buffer much more easily than having to go through the Packet::dump method. This would also avoid using dynamic dispatch.

pub trait Payload
where
    Self: Sized,
{
    const LEN: usize;

    fn packet_type(&self) -> PacketType;

    fn decode(buf: &[u8]) -> Result<Self, Error>;
    // ^-- rename from parse?

    fn encode<'a>(&self, buf: &'a mut [u8]) -> Result<&'a [u8], Error>;
    // ^-- rename from dump?

    fn len(&self) -> usize {
        Self::LEN
    }

    /// Construct a new `RawPacket` from a `Packet`. This adds
    /// the `sync`, `len`, `type` bytes, and calculates and
    /// adds the `crc` byte.
    fn into_raw_packet(&self) -> Result<RawPacket, Error> {
        // This should be fully implemented here in the trait definition
    }
}

We would essentially just have to implement packet_type, decode and encode for some new type, and would then get .into_raw_packet() for free.

tact1m4n3 commented 2 months ago

I think we can make everyone happy with regards to the RawPacket, by having the parser use a RawPacket as a buffer, which is then returned by reference from push_bytes.

:+1:

I also discovered this pattern match which is pretty handy:

impl RawPacket {
    fn payload(&self) -> Result<&[u8], Error> {
        match (self.is_extended(), self.as_slice()) {
            // Skip the [sync], [len], [type], [src], [dst] and [crc] bytes
            (true, [_, _, _, _, _, payload @ .., _]) => Ok(payload),
            // Skip the [sync], [len], [type] and [crc] bytes
            (false, [_, _, _, payload @ .., _]) => Ok(payload),
            _ => Err(Error::BadBufferLen),
        }
    }
}

I also thought of making various methods for the raw packet struct for getting the extended address(if present), type, payload, len etc.

It seems that changing to detecting some other (or even multiple others) sync byte(s) is quite niche. My idea is to allow chainging the behavior of the push_bytes method using a config through push_bytes_with_cfg, where push_bytes just calls with a sensible default configuration. This would allow changing the sync byte to some other value (and whether to validate the type byte?). This means that the parser is only ever capable of looking for one sync byte, but nothing is stopping the user from creating two (or more) parsers, each looking for their own sync byte if needed. On the plus side, this also makes it less likely for any single parser to "mistakenly" start parsing early and thus miss an actual sync byte.

I was thinking about a generic for the sync byte, but your idea seems better, in that it is more flexible. As you said, we can add more options like checking the type etc.

I think the nicest interface for extracting either a RawPacket or Packet from some buffer would be using iterators. This will take and consume + process the entire buffer, returning result as many times as needed with an owned RawPacket or Packet.

:+1:

I think if we made the Payload trait public, and added to its functionality, it could be alot more useful. It would allow users to obtain a raw packet from a manually constructed packet, or encode into their own buffer much more easily than having to go through the Packet::dump method. This would also avoid using dynamic dispatch.

:+1:

tact1m4n3 commented 2 months ago

Let's implement them :))

peterkrull commented 2 months ago

Great! I already have most of these things coded up to make sure it would work, and that the API felt good to use. I just need to clean it up and rebase it, then I will post a PR.

tact1m4n3 commented 2 months ago

Also we can replace Buf with heapless::Vec, as you said previously. Idk if you replaced that already

tact1m4n3 commented 2 months ago

Great! I already have most of these things coded up to make sure it would work, and that the API felt good to use. I just need to clean it up and rebase it, then I will post a PR.

:+1:

peterkrull commented 2 months ago

I have decided heapless::Vec is maybe not the best idea afterall, but I am also not using buf. Just using the RawPacket as the buffer is much simpler. The indexing is simple enough to not require a special type to handle.

anti-social commented 2 months ago

TBH I don't really like the idea of having a push_bytes_with_cfg method which overrides sync byte that was passed into a parser. For me having one place to set sync byte is a more consistent way. Possibly push_bytes could have a mandatory argument so a user always should provide sync bytes that are expected.

anti-social commented 2 months ago

One moment is that following method will make a trait not object safe:

fn decode(buf: &[u8]) -> Result<Self, Error>;

Rust won't allow you to create a &dyn Payload for such a trait.

peterkrull commented 2 months ago

I'm not sure I really understand your concern. Nothing is overriding anything. It is just a matter of using a provided config, or a default config. The way I am thinking of doing it is having a configurable method, which houses the state machine logic:

pub fn push_bytes_with_cfg<'r, 'b>(
    &'r mut self,
    bytes: &'b [u8],
    config: Config,
) -> (Result<&'r RawPacket, CrsfError>, &'b [u8]) {
    // ... state machine logic
}

And a version with a sensible default that just calls the other.

pub fn push_bytes<'r, 'b>(
    &'r mut self,
    bytes: &'b [u8],
) -> (Result<&'r RawPacket, CrsfError>, &'b [u8]) {
    self.push_bytes_with_cfg(
        bytes,
        Config {
            sync: CRSF_SYNC_BYTE,
            type_check: true,
        }, // or just impl Default such that this is just Config::default()
    )
}
peterkrull commented 2 months ago

One moment is that following method will make a trait not object safe:

fn decode(buf: &[u8]) -> Result<Self, Error>;

Rust won't allow you to create a &dyn Payload for such a trait.

I want to forget about making it object safe. This way the trait can contain much more useful behavior. And all it requires is that we discard the as &dyn Payload cast. It will add a few extra lines of code in a few places, but nothing extreme.

peterkrull commented 2 months ago

Another thing about the API. I have been using a custom error which represents "things went well, but parsing is not yet possible" which is returned from the push_bytes method, but maybe this is confusing to some? The alternative would be as has been suggested previously to wrap the result in an option, so to treat this case as "no packet, but no error either => None". That will make the return type quite a bit crazy, but maybe it is still clearer?

pub fn push_bytes<'r, 'b>( &'r mut self, bytes: &'b [u8]) 
-> (Option<Result<&'r RawPacket, CrsfError>>, &'b [u8]) {...}

Edit: I think the Option method makes more sense, even if it is a bit much to look at. I will try and then we can evaluate

anti-social commented 2 months ago

Yep, we definitely should keep Option here

anti-social commented 2 months ago

I only can see one advantage with push_bytes_with_cfg - we can pass references into the method, &[PacketAddress] not making PacketReader generic over any lifetime parameters.

anti-social commented 2 months ago

But a user gets a possibility to call push_bytes_with_cfg with different configs and that doesn't make sense.

peterkrull commented 2 months ago

But a user gets a possibility to call push_bytes_with_cfg with different configs and that doesn't make sense.

That is true, but with the current config it does not really make anything unsound since each check is only relevant for that particular call to push_bytes_with_cfg. We could also just store the config in the reader itself, and provide a method to update it, or get a mutable reference to it?

anti-social commented 2 months ago

I prefer a configuration inside a PacketReader. Why someone may need to update a configuration?

tact1m4n3 commented 2 months ago

Idk but it would make everything more configurable :))

peterkrull commented 2 months ago

I am fine with having it live inside the reader. That would just mean it could work like this:

 let mut reader = PacketReader::new();
 reader.mut_config().sync = PacketAddress::Handset as u8;
peterkrull commented 2 months ago

If we really want to support multiple different sync bytes in the same reader, I take suggestions on how to do so without adding extra lifetimes, weird syntax or exposing the bitflags.

anti-social commented 2 months ago

Bitflags are not exposed publicly. I only don't like that big match when we convert an address to a flag.

anti-social commented 2 months ago

I am fine with having it live inside the reader. That would just mean it could work like this:

 let mut reader = PacketReader::new();
 reader.mut_config().sync = PacketAddress::Handset as u8;

I would suggest to make a configuration via a builder.

peterkrull commented 2 months ago

So like this?

let mut reader = PacketReader::new()
    .set_sync(PacketAddress::Handset as u8);

If we use the bitflag it could be:

let mut reader = PacketReader::new()
    .set_sync(PacketAddress::FlightController, false)
    .set_sync(PacketAddress::Handset, true);

But that is also a bit odd, since then the state of whatever this "sync" is, is quite hidden from the user.

anti-social commented 2 months ago

It's a common pattern of constructing complex objects. Usually it looks like:

let mut reader = PackerReader::builder()
    .addresses(&[PacketAddress::Handset, PacketAddress::Transmitter])
    .check_type(true)
    .build();

In the above example a user is prohibited to modify configuration of PacketReader instance.

Also we can use some configuration that can be passed to one of constructors:

let mut reader = PacketReader::default();

let mut reader = PacketReader::new(
    PacketReaderConfig {
        addresses: &[PacketAddress::Handset, PacketAddress::Transmitter],
        type_check: true,
        ..Default::default()
    }
);
anti-social commented 2 months ago

If we use the bitflag it could be:

let mut reader = PacketReader::new()
.set_sync(PacketAddress::FlightController, false)
.set_sync(PacketAddress::Handset, true);

I don't think a user should know anything about flags and how we store sync bytes internally. We should just pass a slice of addresses.

peterkrull commented 2 months ago

It's a common pattern of constructing complex objects.

Not that the PacketReader can be considered a "complex object" with only 2 configurables, but I get the idea 😅 I'll try to use this pattern