squidpickles / ais

An AIS parser written in Rust
Apache License 2.0
15 stars 9 forks source link

Allow use of structs in other projects #10

Closed keesverruijt closed 1 year ago

keesverruijt commented 1 year ago

For my own project I need to receive and parse AIS messages.

As I want to store AIS static data so that when the program restarts it has a ready database of vessel names etc, I need to serialise this data. Due to how Rust works I needed the types struct to be pub.

In due course I also found a lifetime issue in two files that no longer compile under Rust 1.67 with updated crates.

squidpickles commented 1 year ago

Hi @keesverruijt , thank you for the PR. Glad to see this has been helpful.

No problem making mod type pub, let's do that.

Interesting about the lifetimes. I see that too, updating dependencies, so that looks good.

On the serde support, I'd like to talk more about that one. First point is if we're going to add support for some of the data types, I'd prefer to add it throughout the crate. Second, I'd prefer to add that as another feature, rather than included by default with std. But lastly, I think it might take a fair bit of finessing to make a decent output format. The enum variants and struct names are quite verbose, to make all of this easy to understand when working with it in code, but it would be a pretty messy looking serialized file. With some thought and consistency about how to tag enums, represent structs, and so on, it could be useful to have a data format that's more human readable, but I don't think I'd just want to slap a #[derive(Serialize)]. I'm open to discussion on this, though.

If you want to split the serde stuff out into a separate PR, I'd be happy to merge the module visibility and lifetime change right away.

Thanks again!

keesverruijt commented 1 year ago

Hi Kevin,

Super.

I will split the PR into three so it is visible later on why these changes were made.

I agree on serde, as you say the default serialization is quite verbose for types. In fact, in my code I would like to have access to the raw original shiptype number instead of an enumeration, to pass this on to software downstream and possibly serialise as well. Do you have any idea on how to accomplish that in a Rust-like manner? (I'm on my 3rd week of Rust...)

keesverruijt commented 1 year ago

Before we make the ship type public I think we should agree on what form the various *Type enums should take. I came across this crate: https://docs.rs/int-enum/latest/int_enum/trait.IntEnum.html. Another approach would be:

#[repr(u8)]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
enum ShipType {
     ...
     pub enum ShipType {
     Reserved_1 = 1,
     ...
     WingInGround = 20,
     WingInGroundHazardousCategoryA = 21,
     ...
}

fn to_u8(ship_type: &ShipType) -> u8 {
    *ship_type as u8
}

As you can see I don't see how to handle the Reserved and *Reserved types that represent multiple numeric ship types in a nice way yet. I think this is only a problem for ShipType, the other types have unique useful values.

An alternative would be to make ShipType a struct like

pub struct ShipType 
{
   raw: u8,
}

impl ShipType {
   pub fn to_u8(ship_type: &ShipType) -> u8 {
      ship_type.raw
   }
   pub fn to_str(ship_type: &ShipType) -> &'static str {
       match ship_type.raw {
            20 => "WingInGround"
            ...
       }
}

(and implement the relevant trait impls for Debug, print etc.)

I think we can still change this because ShipType is not yet public?

squidpickles commented 1 year ago

Originally, I didn't write this with the thought of a data round trip, so some of the parsing is lossy. I am rethinking some of that after seeing your suggestions.

In the future, this crate may be useful for producing AIS messages as well as parsing them. At that point, having a canonical numerical representation that matches the AIS spec makes a ton of sense. For now, though, I think it'd be a pretty big leap to change the internal representations of everything.

I'd instead suggest keeping it more in line with the other data types, and only storing the original value where needed for disambiguation. So -

pub enum ShipType {
    Reserved(u8),
    WingInGround,
    WingInGroundHazardousCategoryA,
    WingInGroundHazardousCategoryB,
    WingInGroundHazardousCategoryC,
    WingInGroundHazardousCategoryD,
    WingInGroundReserved(u8),
    ...
    HighSpeedCraftReserved(u8),
    OtherReserved(u8),
    OtherNoAdditionalInformation,
}

and storing the value where needed:

    pub fn parse(data: u8) -> Option<Self> {
        match data {
            0 => None,
            1..=19 => Some(Self::Reserved(data)),
            20 => Some(Self::WingInGround),
            21 => Some(Self::WingInGroundHazardousCategoryA),
            22 => Some(Self::WingInGroundHazardousCategoryB),
            23 => Some(Self::WingInGroundHazardousCategoryC),
            24 => Some(Self::WingInGroundHazardousCategoryD),
            25..=29 => Some(Self::WingInGroundReserved(data)),
            ...
            95..=98 => Some(Self::OtherReserved(data)),
            99 => Some(Self::OtherNoAdditionalInformation),
            100..=u8::MAX => None,
        }
    }

That means ShipType still acts as a normal enum, but ambiguous values can be destructured as needed.

(Note, this keeps >99 as None; could return that to Unknown(u8), but I believe those should be interpreted as undefined)

I put an example of this on a branch; see https://github.com/squidpickles/ais/compare/main...pub-disambiguated-types

keesverruijt commented 1 year ago

I'm happy with that. I don't mind losing the numbers > 99. To add to that we can add to/from u8 and to/from string and serialize as u8? I'm happy to work on my Rust skills and send in a pull request.

squidpickles commented 1 year ago

I'm happy with that. I don't mind losing the numbers > 99. To add to that we can add to/from u8 and to/from string and serialize as u8? I'm happy to work on my Rust skills and send in a pull request.

Great, glad that will work.

For the type conversions, I don't want to commit to having integer representations of everything without thinking it through. I was originally kind of trying to get away from the numbers, and have code work with the symbolic types instead. It's possible this is all a bad idea, and we should really just have small int representations of everything; that would make a serialization format that looks a lot like AIS itself, but expanded into whole bytes per value. That might make sense, but again, I think it would take a little more thought.

Despite that, let's start with ShipType. Your use case would be helped out, I don't have any major objections, and it's already an 8-bit number in AIS. If you implement From<ShipType> for u8, the reverse could still be handled by ShipType::parse(u8). (It looks somewhat like a TryFrom, but of course returns an Option rather than an error.)

Strings, I'd rather not do that for this round. The names in code are mostly not abbreviated, and are readable, but I think a string representation mostly makes sense for presenting to users, and that should be localized and perhaps application specific.

keesverruijt commented 1 year ago

This is now no longer desired, so let's close.