arcane-rs / arcane

Way to 0.1 #1

Open tyranron opened 3 years ago

tyranron commented 3 years ago

Project layout

Roadmap

ilslv commented 3 years ago

Dealing with EventVersion

First of all, we should understand that EventVersion only causes trouble when Events are deserialized. And since the client's data can have any format, generalising this behaviour is quite hard.

I propose a fairly barebones solution that keeps the maximum amount of flexibility:

// arcana

trait DeserializeEvent<'de>: Sized {
    fn deserialize_event<D>(
        name: EventName, 
        ver: EventVersion, 
        deserializer: D,
    ) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>;
}

impl<'de, Ev> DeserializeEvent<'de> for Ev
where
    Ev: VersionedEvent + serde::Deserialize<'de>,
{
    // ...
}

struct DeserializeEventSeed<Ev> {
    pub name: EventName,
    pub ver: EventVersion,
    _event: PhantomData<Ev>,
}

impl<'de, Ev> DeserializeSeed<'de> for DeserializeEventSeed<Ev>
where
    Ev: DeserializeEvent<'de>,
{
    type Value = Ev;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        Ev::deserialize_event(self.name, self.ver, deserializer)
    }
}

// client code

#[derive(serde::Deserialize, VersionedEvent)]
#[event(name = "chat", version = 2)]
struct ChatEvent {
    id: String,
}

#[derive(serde::Deserialize, VersionedEvent)]
#[event(name = "file", version = 2)]
struct FileEvent {
    id: String,
}

#[derive(DeserializeEvent, Event)]
enum Event {
    #[event(deserialize(from = v1::ChatEvent))]
    Chat(ChatEvent),
    FileV2(FileEvent),
    FileV1(v1::FileEvent),
}

mod v1 {
    #[derive(serde::Deserialize, VersionedEvent)]
    #[event(name = "chat", version = 1)]
    struct ChatEvent {
        id: u16,
    }   

    impl From<ChatEvent> for super::ChatEvent {
        fn from(ev: ChatEvent) -> Self {
            Self {
                id: ev.id.to_string(),
            }
        }
    }

    #[derive(serde::Deserialize, VersionedEvent)]
    #[event(name = "file", version = 1)]
    struct FileEvent {
        file: Vec<u8>,
    }
}

DeserializeEvent trait

The main reason to introduce this trait is that serde traverses the deserialized data only once, which can introduce unnecessary overhead.

For example

#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
    Request { id: String, method: String, params: Params },
    Response { id: String, result: Value },
}

// JSON representation
// {"type": "Request", "id": "...", "method": "...", "params": {...}}

Until serde encounters the type field, it collects everything else inside a serde_value::Value-like struct. That introduces dynamic allocations for the inner Box<Value>, which we might want to avoid.

DeserializeEvent derive-macro

There are 2 different possibilities for Events with the same EventName but different EventVersions:

  1. There is a From<V1> for V2 implementation. In that case we simply indicate this relation with the #[event(deserialize(from = ...))] attribute:
#[derive(DeserializeEvent, Event)]
enum Event {
    #[event(deserialize(from = v1::ChatEvent))]
    Chat(ChatEvent),
    // ...
}
  2. There is no From<V1> for V2 implementation. In that case we can't really be sure what the user wants to happen: call another EventSourced impl, attempt a fallible conversion into another type, or simply return an error. All of those are viable solutions, so we just deserialize the versions into different enum variants and leave the handling to the user:
#[derive(DeserializeEvent, Event)]
enum Event {
    // ...
    FileV2(FileEvent),
    FileV1(v1::FileEvent)
}
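
A minimal, serde-free sketch of the two cases above. It assumes a hypothetical `decode` function and a toy payload where the raw event body is plain text; the flat names `ChatEventV1` and `FileEventV1` stand in for `v1::ChatEvent` and `v1::FileEvent`. None of this is part of the proposed API, it only illustrates dispatching on `(name, version)` before looking at the payload:

```rust
// Hypothetical stand-ins for the events above. The payload is plain text
// instead of a serde `Deserializer`, to keep the sketch std-only.
#[derive(Debug, PartialEq)]
pub struct ChatEventV1 {
    pub id: u16,
}

#[derive(Debug, PartialEq)]
pub struct ChatEvent {
    pub id: String,
}

#[derive(Debug, PartialEq)]
pub struct FileEventV1 {
    pub file: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub struct FileEvent {
    pub id: String,
}

impl From<ChatEventV1> for ChatEvent {
    fn from(ev: ChatEventV1) -> Self {
        Self { id: ev.id.to_string() }
    }
}

#[derive(Debug, PartialEq)]
pub enum Event {
    Chat(ChatEvent),
    FileV2(FileEvent),
    FileV1(FileEventV1),
}

/// Picks the concrete type from `(name, version)` before parsing the payload.
pub fn decode(name: &str, version: u16, raw: &str) -> Result<Event, String> {
    match (name, version) {
        // Case 1: `From<V1> for V2` exists, so old data is upgraded in place.
        ("chat", 1) => {
            let id: u16 = raw
                .parse()
                .map_err(|e| format!("bad chat v1 id: {}", e))?;
            Ok(Event::Chat(ChatEventV1 { id }.into()))
        }
        ("chat", 2) => Ok(Event::Chat(ChatEvent { id: raw.to_owned() })),
        // Case 2: no conversion exists, so each version keeps its own variant.
        ("file", 1) => Ok(Event::FileV1(FileEventV1 {
            file: raw.as_bytes().to_vec(),
        })),
        ("file", 2) => Ok(Event::FileV2(FileEvent { id: raw.to_owned() })),
        _ => Err(format!("unknown event {} v{}", name, version)),
    }
}
```

With this shape, a version-1 "chat" payload ends up in the same `Event::Chat` variant as a version-2 one, while the two "file" versions stay distinguishable for the caller.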

ack @tyranron

tyranron commented 3 years ago

@ilslv regarding serialization/deserialization, I'd like to temporarily keep that story aside from this project. Ideally, we shouldn't dictate this to library users at all. So implementing serialize/deserialize is totally their responsibility, not ours.

The question which needs discussion and investigation here is not about serialization/deserialization, but rather about Events evolving. Imagine that we have a large project which changes over time. Some new events appear (that's trivial), some disappear, some change in either a breaking or a non-breaking manner. How should we handle all of that?

Ideally, we don't want new events to carry the burden of their predecessors, and we want clear ways to avoid outdated versions and work only with new ones.

For example, we had this initially:

#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    email: String,
}

And later we evolve to something like that:

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    by: UserId,
}

And we have a dilemma here:

Another question to investigate is how to better keep outdated events (modules layout, etc).

ilslv commented 3 years ago

Evolving schema

1. Extending an already existing event

Most of the time in this case we'll just add fields to some event.

There are 3 different ways of dealing with this situation:

  1. Creating From implementation.
     Pros: only 1 EventSourced implementation is needed.
     Cons: future events can end up with a lot of Option fields so that old ones can be transformed into them.

  2. Dealing with them as separate events.
     Pros: stricter event definitions, which makes them harder to misuse.
     Cons: different EventSourced implementations, which makes it harder to understand what's really happening in the system.

  3. Uniting events of different versions in enums.
     This approach is a combination of the previous 2. We let developers decide whether they want to add a new strict variant without any Option fields, or to replace the old variant with a less strict one.
     Pros: only 1 EventSourced implementation is needed, refactoring-friendly.
     Cons: less intuitive, may lead to more boilerplate (should be investigated).

/// Old event
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    email: String,
}

// 1. Creating `From` implementation

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: Option<UserId>,
}

impl From<EmailAddedV1> for EmailAddedV2 {
    // ...
}

impl Sourced<EmailAddedV2> for S {
    // ...
}

// How it may look in the future

#[derive(event::Versioned)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

impl From<EmailAddedV1> for EmailAddedV10 {
    // ...
}

// ...

impl From<EmailAddedV9> for EmailAddedV10 {
    // ...
}

impl Sourced<EmailAddedV10> for S {
    // ...
}

// 2. Dealing with them as separate events

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: UserId,
}

impl Sourced<EmailAddedV1> for S {
    // ...
}

impl Sourced<EmailAddedV2> for S {
    // ...
}

// How it may look in the future

impl Sourced<EmailAddedV4> for S {
    // ...
}

// ...

impl Sourced<EmailAddedV10> for S {
    // ...
}

// 3. Uniting events of different versions in enums

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
    confirmed_by: UserId,
}

enum EmailAdded {
    V1(EmailAddedV1),
    V2(EmailAddedV2),
}

impl Sourced<EmailAdded> for S {
    // ...
}

// How it may look in the future

#[derive(event::Versioned)]
#[event(name = "email.added", version = 9)]
struct EmailAddedLegacy {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

#[derive(event::Versioned)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    much: Much,
    stricter: Stricter,
    definition: Definition,
}

enum EmailAdded {
    Legacy(EmailAddedLegacy), // Converted from versions 1-9
    V10(EmailAddedV10), 
}

impl Sourced<EmailAdded> for S {
    // ...
}
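
To make the single Sourced impl from option 3 a bit more concrete, here is a minimal sketch. The `Sourced::apply` signature, the `User` state and the `source` field (standing in for the stricter V10-only fields) are assumptions made for illustration only:

```rust
// Hypothetical minimal versions of the types above; the `apply` signature
// is an assumption made for this sketch.
pub trait Sourced<Ev> {
    fn apply(&mut self, event: &Ev);
}

pub struct EmailAddedLegacy {
    pub email: String,
    pub confirmed_by: Option<u64>,
}

pub struct EmailAddedV10 {
    pub email: String,
    pub confirmed_by: Option<u64>,
    pub source: String, // stands in for the stricter V10-only fields
}

pub enum EmailAdded {
    Legacy(EmailAddedLegacy), // converted from versions 1-9
    V10(EmailAddedV10),
}

#[derive(Debug, Default, PartialEq)]
pub struct User {
    pub email: Option<String>,
    pub confirmed: bool,
    pub source: Option<String>,
}

// One impl handles every stored version; only the match arms differ.
impl Sourced<EmailAdded> for User {
    fn apply(&mut self, event: &EmailAdded) {
        match event {
            EmailAdded::Legacy(ev) => {
                self.email = Some(ev.email.clone());
                self.confirmed = ev.confirmed_by.is_some();
                self.source = None; // unknown for legacy data
            }
            EmailAdded::V10(ev) => {
                self.email = Some(ev.email.clone());
                self.confirmed = ev.confirmed_by.is_some();
                self.source = Some(ev.source.clone());
            }
        }
    }
}
```

The legacy arm is the only place that has to cope with missing data, so the V10 struct itself can stay strict.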

2. Renaming/removing event's fields

  1. Deserialization-based only
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)] // Version didn't change
struct EmailAddedV2 {
    #[event(alias(value))]
    email: String,
}
  2. Proc-macro + deserialization-based approach
#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    #[event(alias(value, version = 1))]
    email: String,
}

// May be expanded to different structs or remain single with version validation on deserialization

Both 1 and 2 require enforcing our own deserialization on the developer. I don't consider that much of a problem, as serde is the de-facto standard in the Rust ecosystem.

  3. Different version with From impl
#[derive(event::Versioned)]
#[event(name = "email.added", version = 1)]
struct EmailAddedV1 {
    value: String,
}

#[derive(event::Versioned)]
#[event(name = "email.added", version = 2)]
struct EmailAddedV2 {
    email: String,
}

impl From<EmailAddedV1> for EmailAddedV2 {
    // ...
}

/// May be combined with `3. Uniting events of different versions in enums` from previous step

I lean more towards this option, as renaming fields should be quite an infrequent use case.
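
For a pure rename, the elided From body is a one-liner. A minimal sketch, with the derives for event::Versioned left out so it stays std-only:

```rust
#[derive(Debug, PartialEq)]
pub struct EmailAddedV1 {
    pub value: String,
}

#[derive(Debug, PartialEq)]
pub struct EmailAddedV2 {
    pub email: String,
}

impl From<EmailAddedV1> for EmailAddedV2 {
    fn from(ev: EmailAddedV1) -> Self {
        // The field was only renamed, so the value moves over unchanged.
        Self { email: ev.value }
    }
}
```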

3. Ignore entire event

  1. Introduce some middleware for filtering deserialized events
+---------+    +---------+
|         |    |         |
|  Event  +---->  Event  |
| Storage +----> Adapter +-->
|         |    |         |
+---------+    +---------+
  2. Use the deserializer as a filter. This can squeeze out some performance by not deserializing some fields. This option is worth mentioning, but I don't think it is a viable solution, as it makes the code harder to understand and the performance benefit is negligible.
+---------+    +--------------+   +---------+
|         |    |              |   |         |
|  Event  +---->              |   |  Event  |
| Storage +----> Deserializer +---> Adapter +--->
|         |    |              |   |         |
+---------+    +--------------+   +---------+
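
A minimal sketch of the first option (filtering middleware), using a plain iterator in place of whatever the event storage would actually return. `StoredEvent`, `DomainEvent`, `adapt` and `adapt_all` are hypothetical names:

```rust
#[derive(Debug, PartialEq)]
pub enum StoredEvent {
    EmailAdded { email: String },
    // An event the current code no longer cares about.
    LegacyAudit { note: String },
}

#[derive(Debug, PartialEq)]
pub enum DomainEvent {
    EmailAdded { email: String },
}

/// Hypothetical adapter step: `None` means "ignore this event entirely".
pub fn adapt(ev: StoredEvent) -> Option<DomainEvent> {
    match ev {
        StoredEvent::EmailAdded { email } => Some(DomainEvent::EmailAdded { email }),
        StoredEvent::LegacyAudit { .. } => None,
    }
}

/// Sits between the storage and the event-sourced logic; order is preserved.
pub fn adapt_all(
    stored: impl IntoIterator<Item = StoredEvent>,
) -> impl Iterator<Item = DomainEvent> {
    stored.into_iter().filter_map(adapt)
}
```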

4. Split large event

+---------+     +---------+
|         |     |         |
|  Event  |     |  Event  +--n-->
| Storage +--1--> Adapter +--n-->
|         |     |         |
+---------+     +---------+
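
The 1-to-n arrow in the diagram can be sketched with `flat_map` over a plain iterator. The event shapes below are hypothetical, and an iterator stands in for the real event source:

```rust
#[derive(Debug, PartialEq)]
pub struct EmailAddedAndConfirmed {
    pub email: String,
    pub confirmed_by: String,
}

#[derive(Debug, PartialEq)]
pub enum EmailEvent {
    Added { email: String },
    Confirmed { confirmed_by: String },
}

/// Splits one large stored event into the smaller events used today.
pub fn split(ev: EmailAddedAndConfirmed) -> [EmailEvent; 2] {
    [
        EmailEvent::Added { email: ev.email },
        EmailEvent::Confirmed { confirmed_by: ev.confirmed_by },
    ]
}

/// Each input event yields its parts in order, before the next input is read.
pub fn adapt_all(
    stored: impl IntoIterator<Item = EmailAddedAndConfirmed>,
) -> impl Iterator<Item = EmailEvent> {
    stored.into_iter().flat_map(split)
}
```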

5. Transforming events based on some Context

We can't guarantee that it will be possible to deterministically transform an old event into a new one (although it should be the last resort), so the Event Adapter should have some Context to work with. This Context may accumulate some events and transform them, but its logic still has to be as small as possible. But I do think that those transformations must be infallible.
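
A minimal sketch of such a Context-based transformation. The scenario is made up: an old confirmation event doesn't carry the email, the new one requires it, and the Context remembers the last added address. Dropping a confirmation that has no preceding address is just one possible way to stay infallible:

```rust
#[derive(Debug, PartialEq)]
pub enum OldEvent {
    EmailAdded { email: String },
    EmailConfirmed { confirmed_by: String },
}

#[derive(Debug, PartialEq)]
pub enum NewEvent {
    EmailAdded { email: String },
    EmailConfirmed { email: String, confirmed_by: String },
}

/// Hypothetical Context: the smallest state the adapter needs to carry.
#[derive(Default)]
pub struct Context {
    last_added_email: Option<String>,
}

/// Infallible by construction: there is no error path, an event that cannot
/// be upgraded is skipped instead.
pub fn transform(ev: OldEvent, ctx: &mut Context) -> Option<NewEvent> {
    match ev {
        OldEvent::EmailAdded { email } => {
            ctx.last_added_email = Some(email.clone());
            Some(NewEvent::EmailAdded { email })
        }
        OldEvent::EmailConfirmed { confirmed_by } => {
            let email = ctx.last_added_email.clone()?;
            Some(NewEvent::EmailConfirmed { email, confirmed_by })
        }
    }
}
```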

Proposal

For solving the first 2 problems I propose a combination of 1.3 and 2.3. This should cover most use-cases.

// Declarations of events 1-8 with `Deserialize` and `From` impls for `EmailAddedLegacy`

#[derive(event::Versioned, Deserialize)]
#[event(name = "email.added", version = 9)]
struct EmailAddedLegacy {
    email: String,
    confirmed_by: Option<UserId>,
    a: Option<A>,
    lot: Option<Lot>,
    of: Option<Of>,
    optional: Option<Optional>,
    fields: Option<Fields>,
}

#[derive(event::Versioned, Deserialize)]
#[event(name = "email.added", version = 10)]
struct EmailAddedV10 {
    email: String,
    confirmed_by: Option<UserId>,
    much: Much,
    stricter: Stricter,
    definition: Definition,
}

#[derive(Event, Deserialize)]
enum EmailAdded {
    Legacy(EmailAddedLegacy), // Converted from versions 1-9
    V10(EmailAddedV10), 
}

impl Sourced<EmailAdded> for S {
    // ...
}

Regarding problems 3-5, it looks like we should add a new abstraction layer between the event storage and the EventSourced logic. I'll investigate an ergonomic and easy-to-use abstraction for it.

Unresolved questions

Should we consider blue-green deployment where some instances of the same service are producing old events, when other instances already were upgraded?

ack @tyranron

ilslv commented 3 years ago

Discussed:

Should we consider blue-green deployment where some instances of the same service are producing old events, when other instances already were upgraded?

Backwards-compatibility is preserved (when old versions of the events are stored for a short amount of time), while forward-compatibility is not, which results in 500 errors for a short time, until the instance is updated.

Event Adapter

Sounds like the way to go

ilslv commented 3 years ago

First draft of EventAdapter

Base trait

trait EventTransformer<Event> {
    type Context: ?Sized;
    type Error;
    type TransformedEvent;
    type TransformedEventStream<'ctx>: Stream<Item = Result<
        Self::TransformedEvent,
        Self::Error,
    >> + 'ctx;

    fn transform(
        event: Event,
        context: &mut Self::Context,
    ) -> Self::TransformedEventStream<'_>;
}

EventTransformer is implemented for some Adapter struct and is generic over Event, so different Adapters can transform the same Events differently.

Design decisions

  1. &mut Context. As we want to preserve the order of events, we shouldn't process them concurrently. This allows us to guarantee exclusive access to Context. Downside: this design wouldn't allow using something like a buffered adapter.

  2. No &self or &mut self. I don't really see how a reference to Self would be useful, as we can encapsulate all dependencies in &mut Context. But that can be easily added.

Alternatives

Replace &mut Context with &mut self and keep all context inside Self. Downside: dependency injection becomes really hard.

Convenience traits

trait EventTransformStrategy<Event> {
    type Strategy;
}

To avoid implementing everything by hand we would provide some convenience Strategies

impl EventTransformStrategy<SkippedEvent> for Adapter {
    type Strategy = strategy::Skip;
}

strategy::Skip allows to skip entire events.

impl EventTransformStrategy<EmailConfirmed> for Adapter {
    type Strategy = strategy::AsIs;
}

strategy::AsIs just passes event as is.

impl EventTransformStrategy<EmailAdded> for Adapter {
    type Strategy = strategy::Into<EmailAddedOrConfirmed>;
}

strategy::Into uses impl From<EmailAdded> for EmailAddedOrConfirmed to convert events.

impl EventTransformStrategy<EmailAddedAndConfirmed> for Adapter {
    type Strategy = strategy::Split<EmailAddedOrConfirmed, 2>;
}

impl From<EmailAddedAndConfirmed> for [EmailAddedOrConfirmed; 2] {
    fn from(ev: EmailAddedAndConfirmed) -> Self {
        [
            EmailAdded { email: ev.email }.into(),
            EmailConfirmed {
                confirmed_by: ev.confirmed_by,
            }
            .into(),
        ]
    }
}

strategy::Split allows to convert into several events at once.

These are just examples and we can provide many more Strategies to simplify our life.
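
How the associated Strategy type can drive the transformation is sketched below in a std-only form. This is not the draft's implementation: `Vec` stands in for the `Stream`, the output type is an explicit parameter rather than coming from a derive, and `Strategy`, `TransformStrategy`, `IntoEvent` and `transform` are hypothetical names:

```rust
/// What a strategy does with one input event.
pub trait Strategy<Ev, Out> {
    fn apply(ev: Ev) -> Vec<Out>;
}

pub struct Skip;
pub struct IntoEvent;
pub struct Split<const N: usize>;

impl<Ev, Out> Strategy<Ev, Out> for Skip {
    fn apply(_: Ev) -> Vec<Out> {
        Vec::new()
    }
}

impl<Ev, Out: From<Ev>> Strategy<Ev, Out> for IntoEvent {
    fn apply(ev: Ev) -> Vec<Out> {
        vec![Out::from(ev)]
    }
}

impl<Ev, Out, const N: usize> Strategy<Ev, Out> for Split<N>
where
    [Out; N]: From<Ev>,
{
    fn apply(ev: Ev) -> Vec<Out> {
        Vec::from(<[Out; N]>::from(ev))
    }
}

/// Chooses a strategy per (adapter, event) pair.
pub trait TransformStrategy<Ev> {
    type Strategy;
}

/// Dispatches to whatever strategy the adapter `A` picked for `Ev`.
pub fn transform<A, Ev, Out>(ev: Ev) -> Vec<Out>
where
    A: TransformStrategy<Ev>,
    A::Strategy: Strategy<Ev, Out>,
{
    <A::Strategy as Strategy<Ev, Out>>::apply(ev)
}

pub struct Skipped;
pub struct Added(pub String);
pub struct AddedAndConfirmed {
    pub email: String,
    pub by: String,
}

#[derive(Debug, PartialEq)]
pub enum EmailEvent {
    Added(String),
    Confirmed(String),
}

impl From<Added> for EmailEvent {
    fn from(ev: Added) -> Self {
        EmailEvent::Added(ev.0)
    }
}

impl From<AddedAndConfirmed> for [EmailEvent; 2] {
    fn from(ev: AddedAndConfirmed) -> Self {
        [EmailEvent::Added(ev.email), EmailEvent::Confirmed(ev.by)]
    }
}

pub struct Adapter;

impl TransformStrategy<Skipped> for Adapter {
    type Strategy = Skip;
}

impl TransformStrategy<Added> for Adapter {
    type Strategy = IntoEvent;
}

impl TransformStrategy<AddedAndConfirmed> for Adapter {
    type Strategy = Split<2>;
}
```

A call such as `transform::<Adapter, _, EmailEvent>(event)` then resolves the strategy at compile time, so an event without a chosen strategy is a type error rather than a runtime surprise.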

Besides that, we didn't lose the ability to implement EventTransformer manually.

impl EventTransformer<Custom> for Adapter {
    type Context = dyn Any;
    type Error = Infallible;
    type TransformedEvent = EmailAddedOrConfirmed;
    type TransformedEventStream<'ctx> = stream::Empty<Result<EmailAddedOrConfirmed, Infallible>>;

    fn transform(
        _: Custom,
        _: &mut Self::Context,
    ) -> Self::TransformedEventStream<'_> {
        stream::empty()
    }
}

That impl is basically the same as strategy::Skip.

trait EventAdapter<Events> {
    type Context: ?Sized;
    type Error;
    type TransformedEvents;
    type TransformedEventsStream<'ctx>: Stream<Item = Result<Self::TransformedEvents, Self::Error>>
        + 'ctx;

    fn transform_all(
        events: Events,
        context: &mut Self::Context,
    ) -> Self::TransformedEventsStream<'_>;
}

impl<Adapter, Events> EventAdapter<Events> for Adapter
where
    Events: Stream + 'static,
    Adapter: EventTransformer<Events::Item> + 'static,
    Adapter::Context: 'static,
{
    type Context = Adapter::Context;
    type Error = Adapter::Error;
    type TransformedEvents = Adapter::TransformedEvent;
    type TransformedEventsStream<'ctx> = AdapterStream<'ctx, Adapter, Events>;

    fn transform_all(
        events: Events,
        context: &mut Self::Context,
    ) -> Self::TransformedEventsStream<'_> {
        AdapterStream::new(events, context)
    }
}

This trait comes with a blanket impl for any compatible type implementing EventTransformer, and allows transforming a Stream of incoming events and a Context into a transformed Stream. GATs allow doing this without the dynamic allocations otherwise required for type erasure.

Whole implementation flow

// Declare all possible input events

#[derive(Debug)]
struct SkippedEvent;

#[derive(Debug)]
struct EmailAddedAndConfirmed {
    email: String,
    confirmed_by: String,
}

#[derive(Debug)]
struct EmailAdded {
    email: String,
}

#[derive(Debug)]
struct EmailConfirmed {
    confirmed_by: String,
}

// Unite them in an enum, deriving `EventTransformer`

#[derive(Debug, From, EventTransformer)]
#[event(transform(into = EmailAddedOrConfirmed, context = dyn Any))]
enum InputEmailEvents {
    Skipped(SkippedEvent),
    AddedAndConfirmed(EmailAddedAndConfirmed),
    Added(EmailAdded),
    Confirmed(EmailConfirmed),
}

// Declare enum of output events

#[derive(Debug, From)]
enum EmailAddedOrConfirmed {
    Added(EmailAdded),
    Confirmed(EmailConfirmed),
}

// Implement transformations

struct Adapter;

impl EventTransformStrategy<EmailAdded> for Adapter {
    type Strategy = strategy::AsIs;
}

impl EventTransformStrategy<EmailConfirmed> for Adapter {
    type Strategy = strategy::Into<EmailAddedOrConfirmed>;
}

impl EventTransformStrategy<EmailAddedAndConfirmed> for Adapter {
    type Strategy = strategy::Split<EmailAddedOrConfirmed, 2>;
}

impl From<EmailAddedAndConfirmed> for [EmailAddedOrConfirmed; 2] {
    fn from(ev: EmailAddedAndConfirmed) -> Self {
        [
            EmailAdded { email: ev.email }.into(),
            EmailConfirmed {
                confirmed_by: ev.confirmed_by,
            }
            .into(),
        ]
    }
}

impl EventTransformStrategy<SkippedEvent> for Adapter {
    type Strategy = strategy::Skip;
}

// Test Adapter

#[tokio::main]
async fn main() {
    let mut ctx = 1_usize; // Can be any type
    let events = stream::iter::<[InputEmailEvents; 4]>([
        EmailConfirmed {
            confirmed_by: "1".to_string(),
        }
        .into(),
        EmailAdded {
            email: "2".to_string(),
        }
        .into(),
        EmailAddedAndConfirmed {
            email: "3".to_string(),
            confirmed_by: "3".to_string(),
        }
        .into(),
        SkippedEvent.into(),
    ]);

    let collect = Adapter::transform_all(events, &mut ctx)
        .collect::<Vec<_>>()
        .await;

    println!("context: {}\nevents:{:?}", ctx, collect);
    // context: 1,
    // events: [
    //     Ok(Confirmed(EmailConfirmed { confirmed_by: "1" })), 
    //     Ok(Added(EmailAdded { email: "2" })), 
    //     Ok(Added(EmailAdded { email: "3" })), 
    //     Ok(Confirmed(EmailConfirmed { confirmed_by: "3" }))
    // ]
}

Complete example

Downsides

To implement the EventAdapter trait, we use a custom Stream with 1 line of unsafe code, as I couldn't figure out a way to do it safely. The alternative is to use &Context everywhere, which would allow using a buffered adapter and provide a safe impl for the trait.

ack @tyranron

tyranron commented 3 years ago

@ilslv

EventTransformer trait

  1. &mut Context. As we want to preserve the order of events, we shouldn't process them concurrently. This allows us to guarantee exclusive access to Context. Downside: this design wouldn't allow using something like a buffered adapter.

Unsure about &mut. Buffered things still allow preserving order while processing stuff concurrently. Using interior mutability for contexts is a common thing.
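
A minimal sketch of that idea with a shared `&Context`, using scoped threads from std in place of a buffered stream. The `Context` fields and the lowercase "transformation" are made up for illustration:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical context that is mutated through a shared reference.
#[derive(Default)]
pub struct Context {
    pub seen: AtomicUsize,
    pub emails: Mutex<Vec<String>>,
}

pub fn transform(email: String, ctx: &Context) -> String {
    ctx.seen.fetch_add(1, Ordering::Relaxed);
    ctx.emails.lock().unwrap().push(email.clone());
    email.to_lowercase()
}

/// Transforms all events concurrently, yet returns them in input order.
pub fn transform_all(events: Vec<String>, ctx: &Context) -> Vec<String> {
    std::thread::scope(|s| {
        // Spawn every transformation first; all of them share `&Context`.
        let handles: Vec<_> = events
            .into_iter()
            .map(|ev| s.spawn(move || transform(ev, ctx)))
            .collect();
        // Joining in spawn order keeps the output order equal to the input order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Only the output order is guaranteed here; the order in which the transformations touch the Context is not, which is the trade-off compared to `&mut Context`.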

  2. No &self or &mut self. I don't really see how a reference to Self would be useful, as we can encapsulate all dependencies in &mut Context. But that can be easily added.

One major argument for using &self is trait object safety, so someone will be able to use opaque dyn EventTransformers.

Another one is that dependencies vary, and it still may be meaningful to keep some of them in the Adapter rather than in the context.
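
A minimal sketch of what the `&self` variant buys. The trait below is a simplified, hypothetical version of EventTransformer (`Vec` instead of a `Stream`, no Context), just enough to show both points: it can be used as `dyn`, and an adapter can own a dependency of its own:

```rust
/// Hypothetical `&self`-based variant of the trait.
pub trait EventTransformer<Ev> {
    type Out;

    fn transform(&self, event: Ev) -> Vec<Self::Out>;
}

pub struct Lowercase;

/// Keeps one of its dependencies (`prefix`) in the adapter itself.
pub struct Prefix {
    pub prefix: String,
}

impl EventTransformer<String> for Lowercase {
    type Out = String;

    fn transform(&self, event: String) -> Vec<String> {
        vec![event.to_lowercase()]
    }
}

impl EventTransformer<String> for Prefix {
    type Out = String;

    fn transform(&self, event: String) -> Vec<String> {
        vec![format!("{}{}", self.prefix, event)]
    }
}

/// Works with any mix of adapters, chosen at runtime.
pub fn run(
    adapters: &[Box<dyn EventTransformer<String, Out = String>>],
    event: &str,
) -> Vec<String> {
    adapters
        .iter()
        .flat_map(|a| a.transform(event.to_owned()))
        .collect()
}
```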

EventTransformStrategy trait

strategy::Split allows to convert into several events at once.

It seems that HList might have a better fit there rather than an array.


Okay, let's start with implementing that as a "step 1" for the events adapting story. Along with the implementation, please provide a full set of examples covering all possible situations, so we'll see how it plays out and how it will evolve in future.

Additional type-level restrictions to ensure more variants are met we'll do in a "step 2", after merging "step 1".

ilslv commented 3 years ago

@tyranron

Unsure about &mut. Buffered things still allow preserving order while processing stuff concurrently. Using interior mutability for contexts is a common thing.

Agreed, especially since in practice Context will hold some reference to a DbPool, which already provides interior mutability. I've implemented it with &mut to demonstrate the hardest constraints.

One major argument for using &self is trait object safety, so someone will be able to use opaque dyn EventTransformers.

Very good point, entirely missed it

It seems that HList might have a better fit there rather than an array.

It's just a PoC. Of course we should provide some mechanism that allows varying the number of emitted elements based on the content of the input event, which the array approach doesn't.

ilslv commented 2 years ago

@tyranron regarding our discussion of how tracing::Span::current() works and whether it can be used for Context, I've made a PoC recreating its basic capabilities. I guess with a bit more time I can remove the redundant clones.