Avoiding reactor side-effects (and place to ask questions?)

envato / event_sourcery

A library for building event sourced applications in Ruby

MIT License

84 stars, 10 forks

Avoiding reactor side-effects (and place to ask questions?) #183

Open wwahammy opened 7 years ago

wwahammy commented 7 years ago

(PS: Is there a better place to ask questions about event sourcery?)

I'm admittedly new to Event Sourcing and CQRS so if the questions don't make sense I apologize in advance.

I'm trying to wrap my head around how event_sourcery would work in practice. My big question relates to reactor side-effects. If I understand event sourcing properly, it should be possible (and sometimes necessary) to replay events. This could happen when you have to reset an aggregate database for example.

I also understand that reactors are one of the few places you should have a side effect. The question is what happens when you have to recreate your aggregates? Does the side effect re-run? As an example, looking at the todo example app (https://github.com/envato/event_sourcery_todo_app/blob/master/app/reactors/todo_completed_notifier.rb), the reactor will email on todo completion. Will that happen again if I need to regenerate the aggregate? Or am I totally misunderstanding how this all works?

stevehodgkiss commented 7 years ago

Hi @ericschultz

Yes, this is the best place to ask questions about Event Sourcery right now.

Aggregates replay events when they're loaded from the event store/repository to reconstitute current state. This should have no side effects (it would only have them if the apply MyEvent do blocks themselves contained side effects).
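To make the replay idea concrete, here is a minimal sketch in plain Ruby. It does not use the event_sourcery API: the Event struct and the TodoAggregate class are made up for illustration. It only shows that rebuilding state from events is a pure in-memory operation, so loading the same aggregate twice is harmless.

```ruby
# Plain-Ruby sketch; Event and TodoAggregate are hypothetical, not event_sourcery classes.
Event = Struct.new(:type, :body)

class TodoAggregate
  attr_reader :title, :completed

  # Loading an aggregate means replaying its full event history.
  def initialize(events)
    @completed = false
    events.each { |event| apply(event) }
  end

  private

  # Applying an event only mutates in-memory state: no email, no I/O.
  def apply(event)
    case event.type
    when :todo_added     then @title = event.body[:title]
    when :todo_completed then @completed = true
    end
  end
end

history = [
  Event.new(:todo_added, { title: 'Write docs' }),
  Event.new(:todo_completed, {}),
]

first  = TodoAggregate.new(history)
second = TodoAggregate.new(history) # loading again replays the same events
```

Both instances end up in the same state, and nothing outside the object was touched while they were built.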

Projectors shouldn't have side effects. If the projection has been modified, typically a new one is deployed that runs in parallel with the old one until it has caught up and can take over the queries going to the old projection.

Reactors have side effects and can have an internal projection. There are times when you might want to modify that projection in a way that it needs recreating, and there isn't an easy way to do that right now.

> it should be possible (and sometimes necessary) to replay events. This could happen when you have to reset an aggregate database for example.

I'm not sure what you mean by resetting an aggregate database, but replaying events is something that happens in an aggregate each time it's loaded from the repository/event store.

> The question is what happens when you have to recreate your aggregates? Does the side effect re-run? Will that happen again if I need to regenerate the aggregate? Or am I totally misunderstanding how this all works?

There should be no side effects when an aggregate is loaded from the event store. The side effect happens in the reactor. It would only redo the side effect if it replayed the same event, for example if the reactor started again from event sequence ID 0.
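A rough sketch of why the starting position matters, again in plain Ruby rather than the event_sourcery API (NotifierReactor, its last_processed_id bookkeeping and the sent list standing in for real emails are all assumptions for illustration). A reactor that remembers the last sequence ID it handled skips events it has already seen, while one that starts from 0 repeats the side effect.

```ruby
# Plain-Ruby sketch; NotifierReactor is hypothetical, not an event_sourcery class.
Event = Struct.new(:id, :type, :body)

class NotifierReactor
  attr_reader :sent, :last_processed_id

  def initialize(last_processed_id: 0)
    @last_processed_id = last_processed_id
    @sent = [] # stands in for emails actually delivered
  end

  def process(events)
    events.each do |event|
      # Skip anything at or before the remembered position.
      next if event.id <= @last_processed_id

      @sent << event.body[:title] if event.type == :todo_completed
      @last_processed_id = event.id
    end
  end
end

events = [
  Event.new(1, :todo_added, { title: 'Write docs' }),
  Event.new(2, :todo_completed, { title: 'Write docs' }),
]

reactor = NotifierReactor.new
reactor.process(events)
reactor.process(events) # same events again: position is remembered, nothing re-sent

restarted = NotifierReactor.new(last_processed_id: 0)
restarted.process(events) # starting from 0 repeats the side effect
```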

Here's a diagram showing the structure/concepts in event sourcery. Everything outside of the ESPRunner box usually runs in your typical unicorn/puma processes. Projectors & Reactors each run in their own process (which is forked from a master process if ESPRunner is used).

[Diagram: Event Sourcery concepts]

wwahammy commented 7 years ago

Thanks Steve! I appreciate the help and I think I'm understanding things a bit better. This answers a lot of my questions but brings up a few more.

> Reactors have side effects and can have an internal projection. There are times when you might want to modify that projection in a way that it needs recreating, and there isn't an easy way to do that right now.

What are the situations where a reactor would have an internal projection? And how would one avoid that (or is it even possible)?

> The side effect happens in the reactor. It would only redo the side effect if it replayed the same event, for example if the reactor started again from event sequence ID 0.

Is there a way to make sure this doesn't happen? Or what are the nuances that need to be understood about that to prevent the side effects from happening in that case?

Thanks a ton, Event Sourcery looks like a great tool!

grassdog commented 7 years ago

👋 @ericschultz

> What are the situations where a reactor would have an internal projection? And how would one avoid (or is it even possible?)

Sometimes a reactor needs to keep track of state so it knows how to react to an event (we generally only process one event at a time). For example a reactor may have logic that says it needs to see a subscription_requested and subscription_accepted event against a person before sending off a welcome email to that person. In this case it might keep a table where it tracks which of these events it's seen per person.
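The subscription example could be sketched like this in plain Ruby. This is not the event_sourcery reactor DSL; the WelcomeEmailReactor class, the in-memory hash standing in for the tracking table and the emails_sent list are assumptions for illustration only.

```ruby
# Plain-Ruby sketch; the hash plays the role of the reactor's internal projection table.
Event = Struct.new(:type, :person_id)

class WelcomeEmailReactor
  REQUIRED = %i[subscription_requested subscription_accepted].freeze

  attr_reader :emails_sent

  def initialize
    # person_id => list of required event types seen so far
    @seen = Hash.new { |hash, person_id| hash[person_id] = [] }
    @emails_sent = [] # stands in for real email delivery
  end

  # Events arrive one at a time, so state must be carried between calls.
  def process(event)
    return unless REQUIRED.include?(event.type)

    seen = @seen[event.person_id]
    return if seen.include?(event.type) # duplicate, nothing new learned

    seen << event.type
    @emails_sent << event.person_id if (REQUIRED - seen).empty?
  end
end

reactor = WelcomeEmailReactor.new
reactor.process(Event.new(:subscription_requested, 'alice'))
after_first = reactor.emails_sent.dup
reactor.process(Event.new(:subscription_accepted, 'alice'))
reactor.process(Event.new(:subscription_accepted, 'bob')) # bob never requested
```

The email for alice goes out only once both events have been seen; bob gets nothing because only one of the two has arrived.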

As for preventing side effects happening twice, the general approach we recommend is that a reactor emit an event into the store to indicate that it's done the work. That way, if you need to rerun the reactor for some reason, you can check for these events before doing the work. Using the example above, the reactor might emit a welcome_email_sent event after it sends the email. If you were to replay events from the beginning for the reactor, you could make sure it didn't send an email to a person if the welcome_email_sent event was already present in the store.
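A minimal sketch of that approach in plain Ruby, assuming an in-memory array as a stand-in for the event store (the WelcomeEmailReactor class, its replay method and the deliveries list are hypothetical, not event_sourcery API):

```ruby
# Plain-Ruby sketch; the array stands in for an append-only event store.
Event = Struct.new(:type, :person_id)

class WelcomeEmailReactor
  attr_reader :deliveries

  def initialize(store)
    @store = store
    @deliveries = [] # stands in for real email delivery
  end

  # Replay from the beginning of the store.
  def replay
    @store.dup.each { |event| process(event) }
  end

  def process(event)
    return unless event.type == :subscription_accepted
    return if already_sent?(event.person_id)

    @deliveries << event.person_id
    # Record that the work is done so a later replay can skip it.
    @store << Event.new(:welcome_email_sent, event.person_id)
  end

  private

  def already_sent?(person_id)
    @store.any? { |e| e.type == :welcome_email_sent && e.person_id == person_id }
  end
end

store = [Event.new(:subscription_accepted, 'alice')]

first_run = WelcomeEmailReactor.new(store)
first_run.replay

second_run = WelcomeEmailReactor.new(store) # e.g. the reactor was reset
second_run.replay
```

One caveat with this sketch: sending the email and recording the event are two separate steps, so a crash between them could still lead to a repeat on the next run.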

Does that help answer your questions?

wwahammy commented 7 years ago

Thanks @grassdog and @stevehodgkiss. I think I'm wrapping my head around this some. One other topic I'm struggling with is thinking of how to handle relations between aggregates. For example, let's say a single user can have multiple email addresses. In the traditional relational world, that'd be a straightforward one-to-many relation with a user table and an email table with a foreign key to the user table. How is that structured using EventSourcery to make sure that changes to emails and users represent consistent data?

(As an aside, this would be super helpful to illustrate in the todolist demo app)

twe4ked commented 7 years ago

Hey @ericschultz.

Consider the following stream of events:

UserSignedUp.new(aggregate_id: user_1, name: 'Alice')
EmailAdded.new(aggregate_id: email_1, email: 'alice@example.com', user_id: user_1)
EmailAdded.new(aggregate_id: email_2, email: 'alice@other.example.com', user_id: user_1)

If you wanted a projection that you could query for a user and list their emails, you could build the following:

class UserEmailProjection
  include EventSourcery::Postgres::Projector

  projector_name :user_email_projector

  # NOTE: These `table` definition helpers come from event_sourcery-postgres
  table :user_email_projector_users do
    column :user_id, 'UUID', null: false
    column :name, :text
  end

  table :user_email_projector_emails do
    column :user_id, 'UUID', null: false
    column :email, :text
  end

  project UserSignedUp do |event|
    table(:user_email_projector_users).insert(
      user_id: event.aggregate_id,
      name: event.body[:name],
    )
  end

  project EmailAdded do |event|
    table(:user_email_projector_emails).insert(
      user_id: event.body[:user_id],
      email: event.body[:email],
    )
  end
end

You could then query the table:

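class UserEmailsModel
  # Returns the emails belonging to the user with the given name.
  # NOTE: `table` is assumed to return a Sequel dataset for the named table;
  # it is not defined in this class and would need to be provided.
  def self.find_emails(name:)
    table(:user_email_projector_emails)
      .select(:email)
      .join(:user_email_projector_users, user_id: :user_id)
      .where(name: name)
  end
end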

Consider the code semi-pseudo-code. It is completely untested and some of the syntax may be off.

wwahammy commented 7 years ago

@twe4ked ah, that makes sense. How should constraints be enforced here? As an example, let's say a user could have up to 3 emails but no more. Should the constraint be enforced in the projection, by validating it there and sending a revert event if it fails (and not creating a new projection)? Or should this be handled in the aggregate?

Based upon my understanding, the projection seems to make more sense but I wasn't sure.

stevehodgkiss commented 7 years ago

If an invariant must always be true, aggregate boundaries would need to be changed so that a single aggregate has the data required to enforce the rule. In this case the EmailAdded event would need to be emitted on the User aggregate vs a separate Email aggregate. This way it's possible to guarantee that a user can't add more than 3 emails. The aggregate method would look something like this:

def add_email(email)
  if @emails.count >= 3
    raise SomeError
  end
  apply_event(EmailAdded, body: { email: email })
end
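For anyone who wants to run this outside the library, here is a self-contained sketch of the same rule in plain Ruby. The UserAggregate class, the Event struct, the changes list and the TooManyEmails error are stand-ins for illustration and are not the event_sourcery aggregate API.

```ruby
# Plain-Ruby sketch; UserAggregate is hypothetical, not an event_sourcery class.
Event = Struct.new(:type, :body)

class TooManyEmails < StandardError; end

class UserAggregate
  MAX_EMAILS = 3

  attr_reader :emails, :changes

  def initialize(history = [])
    @emails = []
    @changes = [] # new events produced by commands, not yet stored
    history.each { |event| apply(event) }
  end

  # Command: checks the invariant against state rebuilt from this
  # aggregate's own events, then records a new event.
  def add_email(email)
    raise TooManyEmails if @emails.count >= MAX_EMAILS

    event = Event.new(:email_added, { email: email })
    apply(event)
    @changes << event
  end

  private

  def apply(event)
    @emails << event.body[:email] if event.type == :email_added
  end
end

history = %w[a@example.com b@example.com].map do |email|
  Event.new(:email_added, { email: email })
end

user = UserAggregate.new(history)
user.add_email('c@example.com') # third email is allowed

rejected =
  begin
    user.add_email('d@example.com')
    false
  rescue TooManyEmails
    true
  end
```

Because every EmailAdded event lives on the user's own stream, the count the rule depends on is always available when the command runs.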

An alternative is to let it happen and correct it afterwards if required. A projection could be used to validate the rule while handling the command, acknowledging that because the projection is updated asynchronously to the request to add an email, 2 concurrent requests to add an email could result in the rule being violated.

A reactor would be used to correct the race condition after it's happened. A reactor is a type of event stream processor that can keep an internal projection and also emit events back into the stream (an example reactor in the todo app). The reactor would keep track of the number of emails per user and if an EmailAdded event causes the number of emails for a user to go above 3, emit some kind of correction event such as EmailRemoved.
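A sketch of such a correcting reactor in plain Ruby (EmailLimitReactor, the counts hash standing in for its internal projection and the emitted list standing in for events written back to the store are all assumptions, not event_sourcery API):

```ruby
# Plain-Ruby sketch; EmailLimitReactor is hypothetical, not an event_sourcery class.
Event = Struct.new(:type, :user_id, :email)

class EmailLimitReactor
  MAX_EMAILS = 3

  attr_reader :emitted

  def initialize
    @counts = Hash.new(0) # internal projection: emails per user
    @emitted = []         # stands in for events emitted back into the stream
  end

  def process(event)
    case event.type
    when :email_added
      @counts[event.user_id] += 1
      if @counts[event.user_id] > MAX_EMAILS
        emit(Event.new(:email_removed, event.user_id, event.email))
      end
    when :email_removed
      @counts[event.user_id] -= 1
    end
  end

  private

  def emit(event)
    @emitted << event
    # In a real system the reactor would later read this event from the
    # stream; here it is fed straight back to keep the count accurate.
    process(event)
  end
end

reactor = EmailLimitReactor.new
%w[a b c d].each do |name|
  reactor.process(Event.new(:email_added, 'user_1', "#{name}@example.com"))
end
```

The fourth EmailAdded pushes the count over the limit, so the reactor emits a single EmailRemoved for that address and the count drops back to 3.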