serverlesstechnology / cqrs

A lightweight, opinionated CQRS and event sourcing framework.

Alternative pattern for Event upcasting #46

Closed danieleades closed 4 months ago

danieleades commented 1 year ago

I wanted to present an alternative pattern for event upcasting. This relies on serde's ability to deserialize untagged enum representations.

I've used this pattern before for backwards compatibility of configuration files-

The gist is to separate the internal representation of an event from its serialised representation. Its serialised representation is an untagged union of all historical versions of the Event. You then add an infallible conversion from the union to the current version, and let serde do the rest.

use serde::{Deserialize, Serialize};

mod legacy {
    //! Previous versions of the `Event` enum, for backwards compatibility
    use serde::{Deserialize, Serialize};

    #[derive(Serialize, Deserialize)]
    pub enum V1 {}

    #[derive(Serialize, Deserialize)]
    pub enum V2 {}
}

// This is version 3 of the 'event'
#[derive(Serialize, Deserialize)]
#[serde(from = "EventRep")]
pub enum Event {}

#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum EventRep {
    V1(legacy::V1),
    V2(legacy::V2),
    V3(Event)
}

impl From<EventRep> for Event {
    fn from(value: EventRep) -> Self {
        match value {
            EventRep::V1(_) => todo!(),
            EventRep::V2(_) => todo!(),
            EventRep::V3(event) => event,
        }
    }
}

For fallible conversions, you could also use #[serde(try_from = "EventRep")].
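As a rough sketch of what the fallible variant might look like, here is the conversion side only, using nothing but the standard library. The struct shapes, the optional `country` field and the `UpcastError` type are all hypothetical and invented for illustration; the serde derives and the `try_from` attribute from the snippet above are left out so the block stands alone. serde requires the error type of a `try_from` conversion to implement `Display`, which is why that impl is included.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical legacy shape: the country was optional.
pub struct EventV1 {
    pub zip_code: usize,
    pub country: Option<String>,
}

/// Hypothetical current shape: the country is mandatory.
#[derive(Debug, PartialEq)]
pub struct Event {
    pub zip_code: usize,
    pub country: String,
}

/// Stand-in for the serialised representation (serde derives omitted).
pub enum EventRep {
    V1(EventV1),
    V2(Event),
}

/// Hypothetical error returned when a legacy event cannot be upcast.
#[derive(Debug, PartialEq)]
pub enum UpcastError {
    MissingCountry,
}

impl fmt::Display for UpcastError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UpcastError::MissingCountry => write!(f, "legacy event has no country"),
        }
    }
}

impl TryFrom<EventRep> for Event {
    type Error = UpcastError;

    fn try_from(rep: EventRep) -> Result<Self, Self::Error> {
        match rep {
            // Legacy events are only accepted if the country can be recovered.
            EventRep::V1(old) => Ok(Event {
                zip_code: old.zip_code,
                country: old.country.ok_or(UpcastError::MissingCountry)?,
            }),
            // Current events pass through unchanged.
            EventRep::V2(event) => Ok(event),
        }
    }
}
```

With the attribute in place, a failed conversion would surface as an ordinary deserialisation error rather than a panic.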

Implementing upcasting this way simplifies the implementation of the framework, and removes the 'stringly' typed upcasting API in favour of a strongly-typed pattern. The downside is possibly more cognitive load on downstream users to implement this themselves and to get it right.

Obviously this is a breaking change, but i'm interested to get your thoughts.

I'd say it's likely to be possible to simplify some of the boilerplate with a derive macro, if such a thing doesn't already exist in the wild

danieleades commented 1 year ago

this approach would use simple fall-through - serde will attempt to deserialize against each variant in EventRep until one succeeds. That should be ok for most applications, but one obvious optimisation would be to use some kind of semver-aware strategy for deserialisation to select the correct variant to deserialise to. A quick search on crates.io shows a few promising approaches

danieleades commented 1 year ago

Small correction - it's not actually a breaking change. It would however make the update API redundant

danieleades commented 1 year ago

here's a slightly more involved version that does away with the 'fall-through' deserialisation by using an internally tagged enum representation-

use serde::{Deserialize, Serialize};

mod legacy {
    //! Previous versions of the `Event` enum, for backwards compatibility
    use serde::{Deserialize, Serialize};

    #[derive(Serialize, Deserialize)]
    pub struct A {
        pub field1: String,
        pub field2: u32,
    }

    #[derive(Serialize, Deserialize)]
    pub enum V1 {
        A(A),
        B,
    }

    #[derive(Serialize, Deserialize)]
    pub enum V2 {
        A,
        B,
        C,
    }
}

// This is version 3 of the 'event'
#[derive(Clone, Serialize, Deserialize)]
#[serde(from = "EventRep", into = "EventRep")]
pub enum Event {}

#[derive(Serialize, Deserialize)]
#[serde(tag = "version", rename_all = "lowercase")]
enum EventRep {
    V1(legacy::V1),
    V2(legacy::V2),
    V3(Event),
}

impl From<EventRep> for Event {
    fn from(value: EventRep) -> Self {
        match value {
            EventRep::V1(_) => todo!(),
            EventRep::V2(_) => todo!(),
            EventRep::V3(event) => event,
        }
    }
}

impl From<Event> for EventRep {
    fn from(event: Event) -> Self {
        EventRep::V3(event)
    }
}

#[cfg(test)]
mod tests {
    use super::Event;

    #[test]
    fn deserialise_v1() {
        let _event: Event = serde_json::from_str(
            r#"
{
    "version": "v1",
    "A": {
        "field1": "Some String",
        "field2": 12
    }
}"#,
        )
        .unwrap();
    }
}

These patterns have some additional properties which are quite nice-

davegarred commented 1 year ago

So you're thinking of using the compiler to understand when to upcast events? I see a couple of potential issues here:

danieleades commented 1 year ago

Let me first preface this by saying i don't unequivocally think you should move to this approach, but i think it's interesting to explore. I can answer most of your questions, but injecting context during upcasting is definitely a wrinkle. With that out of the way-

So you're thinking of using the compiler to understand when to upcast events? I see a couple of potential issues here:

sort of. The upcasting still happens at runtime, but it's handled transparently by the serde framework in a declarative fashion, instead of in custom imperative code.

  • the upcast may not be something that the compiler can identify

Can you give an example of what you mean here?

  • initial state information many times needs to be added (setting a new 'country' value to 'USA')

i don't think there's anything stopping you from doing that with the approach i've outlined, assuming for a given case it was always 'USA'.

an example from the code-

        let upcast_function = Box::new(|payload: Value| {
            if let Value::Object(mut object_map) = payload {
                object_map.insert("country".to_string(), "USA".into());
                Value::Object(object_map)
            } else {
                panic!("the event payload is not an object")
            }
        });
        let upcaster = SemanticVersionEventUpcaster::new("EventX", "2.3.4", upcast_function);

this would become

struct EventV1 {
    zip_code: usize,
    state: String,
}

struct Event {
    zip_code: usize,
    state: String,
    country: String,
}

impl From<EventV1> for Event {
    fn from(event: EventV1) -> Self {
        Self {
            zip_code: event.zip_code,
            state: event.state,
            // the value the old upcaster injected into the JSON payload
            country: "USA".to_string(),
        }
    }
}

Where it gets tricky is if sometimes it's 'USA' and sometimes you need to inject something else. The current upcasting implementation could track this with internal state, whereas the new one has no internal state. I can't see any examples of this in the code currently by the way, but let's assume for the sake of argument there are cases where you want this.

That too would be solvable with an approach which is similar to what i've implemented in this PR, though with slightly more boilerplate. You could do something like-

struct EventV1 {
    zip_code: usize,
    state: String,
}

struct EventV2 {
    zip_code: usize,
    state: String,
    country: String,
}

/// The serialised representation of the event, supports multiple versions
enum EventRep {
    V1(EventV1),
    V2(EventV2),
}

trait Upcast {
    /// inject arbitrary context during upcasting
    type Context;
    /// the current version of your event
    type Target;

    fn upcast(self, context: &Self::Context) -> Self::Target;
}

impl Upcast for EventRep {
    type Context = String;
    type Target = EventV2;

    fn upcast(self, context: &Self::Context) -> Self::Target {
        match self {
            // older events take the country from the injected context
            Self::V1(event_v1) => EventV2 {
                zip_code: event_v1.zip_code,
                state: event_v1.state,
                country: context.clone(),
            },
            // current events pass through unchanged
            Self::V2(event_v2) => event_v2,
        }
    }
}

I guess this is something of a halfway house between the two approaches. You gain the ability to inject arbitrary context, everything is still strongly-typed, and you still don't need to juggle callbacks.

  • chained upcasters are common (e.g., v1 ==> v3 ==> v8 ==> current)

nothing in the From/TryFrom approach stops you from chaining conversions.

Say you have versions 1, 2, and 3. You can implement conversions from 1 -> 2 and 2 -> 3. You can then add a direct conversion from 1 -> 3 which internally delegates to the other two conversions in turn.

There's a helper macro for this actually - https://github.com/bobozaur/transitive
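To make the chaining concrete, here's a minimal hand-written sketch using only the standard library (no macro). The three struct shapes and the placeholder values "unknown" and "USA" are assumptions picked for illustration, not anything taken from the framework.

```rust
/// Hypothetical version 1: only a zip code.
pub struct EventV1 {
    pub zip_code: usize,
}

/// Hypothetical version 2: adds a state.
pub struct EventV2 {
    pub zip_code: usize,
    pub state: String,
}

/// Hypothetical version 3 (current): adds a country.
#[derive(Debug, PartialEq)]
pub struct Event {
    pub zip_code: usize,
    pub state: String,
    pub country: String,
}

// Step 1 -> 2: fill in the field that version 1 did not have.
impl From<EventV1> for EventV2 {
    fn from(v1: EventV1) -> Self {
        EventV2 {
            zip_code: v1.zip_code,
            state: "unknown".to_string(), // assumed placeholder
        }
    }
}

// Step 2 -> 3: fill in the field that version 2 did not have.
impl From<EventV2> for Event {
    fn from(v2: EventV2) -> Self {
        Event {
            zip_code: v2.zip_code,
            state: v2.state,
            country: "USA".to_string(), // assumed default
        }
    }
}

// Direct 1 -> 3 conversion that delegates to the two steps in turn.
impl From<EventV1> for Event {
    fn from(v1: EventV1) -> Self {
        Event::from(EventV2::from(v1))
    }
}
```

Each new version then only needs one new step conversion plus a delegating impl per older version, and that delegating impl is the part a macro could generate.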

  • could add a performance penalty over simple version comparator (granted, this might be negligible with Rust)

i don't think so, i suspect it might actually be a bit faster, since you're not serialising into an intermediate JSON object. This has the side effect of only employing strongly-typed structures everywhere