rust-transit / gtfs-structure

Read a GTFS file

Support for optional extra fields #77

Open irexiz opened 3 years ago

irexiz commented 3 years ago

Hi, I've forked the repository as I needed to add support for additional optional fields that are not in the GTFS standard. I've done this because a number of cities provide additional information regarding their public transit, such as "brigade_id" (a brigade refers to the next bus in the current line/route), among other useful information.

Would you be willing to merge such a feature to this crate? It would be something along these lines:

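use std::collections::HashMap;

use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, Default)]
pub struct RawTrip {
    #[serde(rename = "trip_id")]
    pub id: String,
    // snip!
    // Columns not matched by a named field are collected here.
    #[serde(flatten)]
    pub extra: HashMap<String, Value>,
}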

where Value is a catch-all wrapper for deserialization:

#[derive(Serialize, Deserialize, Debug, PartialEq, Clone)]
#[serde(rename_all = "snake_case")]
#[serde(untagged)]
pub enum Value {
    Bool(bool),
    U64(u64),
    I64(i64),
    Float(f64),
    String(String),
}
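
For illustration, here is a minimal std-only sketch of the effect the flattened map is meant to have: known columns go to named fields and everything else lands in `extra`. It does not use serde or the csv crate. The `trip_from_row` helper is hypothetical, and the values are kept as raw strings for brevity where the proposal above would use `Value`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `RawTrip`: one known column plus extras.
#[derive(Debug, Default)]
pub struct RawTrip {
    pub id: String,
    /// Columns outside the known schema, keyed by header name.
    pub extra: HashMap<String, String>,
}

/// Builds a trip from a header row and one record,
/// routing every unknown column into `extra`.
pub fn trip_from_row(headers: &[&str], record: &[&str]) -> RawTrip {
    let mut trip = RawTrip::default();
    for (name, value) in headers.iter().zip(record.iter()) {
        match *name {
            "trip_id" => trip.id = value.to_string(),
            other => {
                trip.extra.insert(other.to_string(), value.to_string());
            }
        }
    }
    trip
}

fn main() {
    let headers = ["trip_id", "brigade_id"];
    let record = ["T1", "7"];
    let trip = trip_from_row(&headers, &record);
    println!("{:?}", trip);
}
```

With this shape, a non-standard column such as brigade_id stays reachable through `trip.extra` without the struct needing a dedicated field for it.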

antoine-de commented 3 years ago

sorry for the delay, I don't know how I missed this issue :confused:

Hmm, I'm not completely convinced: adding this would add overhead for everyone (especially on trips, of which there can be a very large number), and it somewhat defeats the point of having real types for GTFS. However, I understand why you need this, and I'm not sure the performance cost would really bother anyone, so I'd vote to hide this behind a feature flag.
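
A rough sketch of what gating the field could look like, assuming a Cargo feature named `extra_fields` (a hypothetical name, declared as `extra_fields = []` under `[features]` in Cargo.toml). The struct is a simplified stand-in rather than the crate's actual `RawTrip`, and the map holds plain strings to keep the example std-only.

```rust
// The import is gated too, so builds without the feature see no unused import.
#[cfg(feature = "extra_fields")]
use std::collections::HashMap;

/// Simplified stand-in for `RawTrip`.
#[derive(Debug, Default)]
pub struct RawTrip {
    pub id: String,
    /// Only compiled in when the (hypothetical) `extra_fields` feature is on,
    /// so users who do not need it pay no per-trip overhead.
    #[cfg(feature = "extra_fields")]
    pub extra: HashMap<String, String>,
}

/// Reports at runtime whether the crate was built with the feature.
pub fn has_extra_fields() -> bool {
    cfg!(feature = "extra_fields")
}

fn main() {
    // `..Default::default()` fills `extra` when the feature is enabled
    // and is a no-op otherwise, so this compiles either way.
    let trip = RawTrip {
        id: "T1".to_string(),
        ..Default::default()
    };
    println!("{:?} extra_fields={}", trip, has_extra_fields());
}
```

Downstream users would opt in with `features = ["extra_fields"]` on the dependency, and the default build keeps the strictly typed structs unchanged.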

irexiz commented 3 years ago

Hi @antoine-de

I got a bit impatient and ended up writing another library based on this one (I hope you don't mind), as I needed that feature internally.

https://github.com/irexiz/gtfs-parser/tree/master

However, I decided against the idea I mentioned in the issue, due to a quirky behavior in serde-csv (https://github.com/BurntSushi/rust-csv/issues/151) where deserializing a String "010" ends up being interpreted as an integer (10).
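
To make the problem concrete, here is a std-only sketch that mimics type inference from text, trying the variants in the order the `Value` enum declares them. The `infer` function is a hypothetical illustration of the behavior described above, not the csv crate's actual implementation.

```rust
/// Same variants as the `Value` enum from the original proposal, without serde.
#[derive(Debug, PartialEq, Clone)]
pub enum Value {
    Bool(bool),
    U64(u64),
    I64(i64),
    Float(f64),
    String(String),
}

/// Guesses a type from the text alone; the first successful parse wins.
pub fn infer(raw: &str) -> Value {
    if let Ok(b) = raw.parse::<bool>() {
        return Value::Bool(b);
    }
    if let Ok(n) = raw.parse::<u64>() {
        return Value::U64(n);
    }
    if let Ok(n) = raw.parse::<i64>() {
        return Value::I64(n);
    }
    if let Ok(x) = raw.parse::<f64>() {
        return Value::Float(x);
    }
    Value::String(raw.to_string())
}

fn main() {
    // An identifier such as "010" parses as an integer, so the leading zero is lost.
    println!("{:?}", infer("010"));
    println!("{:?}", infer("bus"));
}
```

Because identifiers in GTFS feeds are free-form text, any inference of this kind can silently rewrite an ID like "010", which is why a catch-all `Value` is risky for ID-like columns.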

I've opted for a different solution (see the readme, or the docs if I manage to upload it to crates.io), and I've also handled most of the datasets defined in the GTFS reference.

On top of that, I only read the necessary files instead of everything at once :)

Edit: A colleague of mine is asking whether this should eventually become a pull request to this repository as a new version. What do you think?