red-gate / Tech-Radar

A Tech Radar for Redgate

[Framework] Event handling and threading management #154

Closed chenchen1987 closed 5 years ago

chenchen1987 commented 5 years ago

What change would you like to make to the tech radar?

Add Akka.net

Why do you believe this is valuable to Redgate?

The @red-gate/spiders team is currently exploring options for handling events in a thread-safe way. For context, Data Catalog needs to maintain an up-to-date schema of users' databases by scanning their live databases on a schedule. We decided to achieve this with an event-driven approach. We looked into Akka.net before the Tech Radar was introduced, on the argument that it would help us build towards a stable and scalable product. We may use Akka.net in its simplest form, on the assumption that it offers the following benefits:

Where should this be on the tech radar?

Exploration

If this should be in the Explore ring, who is committed to exploring it?

@red-gate/spiders will explore it within a time box.

Please suggest other libraries/frameworks if you think it might help us!

@fffej we talked about this very briefly a couple of weeks ago. You mentioned that you may have some other suggestions?

fffej commented 5 years ago

I guess my main worry with Akka.Net is that it's a big package with knock-on implications for any future maintenance. I'm skeptical that it can be kept lean and switched out later: it's much more of a framework than a swappable library.

I don't know the details of the problem you are trying to solve but you mentioned scheduled scanning? Might be something you could accomplish with FluentScheduler?

Perhaps you could share more details? (maybe in private if there's anything commercially sensitive - this repo is public!).

idursun commented 5 years ago

I think the Actor Model itself is worth exploring.

Akka.Net is just a mature implementation of the Actor Model in the .Net world. There are less mature alternatives, such as proto.actor.

If the problem at hand is scheduling and thread management, then it definitely feels like overkill, and there are libraries out there that solve these problems elegantly.

Akka is well suited for scenarios where you want to build highly concurrent, distributed, self-healing systems. Those are the areas where it really shines. Thread management and scheduling are usually the problems that need to be solved to be able to build such a system and using a library like Akka makes these concerns irrelevant as they are already solved by the actor model and handled by the implementing library.

Apart from the core library, every added piece of Akka.Net functionality comes in the form of an extension (remoting, clustering, persistence, etc.). The core library is very lightweight. (Orleans feels more like a framework, but Akka.Net is a library.)

As far as Actor Model concepts go (i.e. Actors, Supervisors, Messages), an actor written against one implementing library should not be too hard to port to another.

chenchen1987 commented 5 years ago

Our basic requirements are:

At this stage, I believe the team has little expertise in Akka.net, so we are totally open to alternatives.

nyctef commented 5 years ago

This might be a dumb question, but what requirements do you have that aren't satisfied by basic TPL tasks running in the thread pool?

chenchen1987 commented 5 years ago

> This might be a dumb question, but what requirements do you have that aren't satisfied by basic TPL tasks running in the thread pool?

We want to save ourselves time by not reinventing the wheel. Implementing it from scratch would mean that, as well as focusing on business requirements, we'd also need to think about performance, fault tolerance, etc.

nyctef commented 5 years ago

What's the thing that you would need to reinvent?

chenchen1987 commented 5 years ago

For example, the behaviour of actors and mailboxes. Another thing to take into consideration is the scalability of the implementation. I am far from expert enough to advise you on the exact details, so all my arguments and assumptions are based on blogs, documentation and some very primitive experiments.

nyctef commented 5 years ago

Ah, I see. I guess the main thing to question is whether you need an actor model in the first place. From the (admittedly very little) I've heard about the requirements in this issue, it sounds like significant overkill, but of course you'll be the ultimate judge of what's right for you. I'll be interested to hear how it goes if you do end up exploring actors :)

ChrisLambrou commented 5 years ago

We want to prevent any library, such as akka.net, from bleeding into our codebase. What we actually care about is the following:

  1. Being able to easily implement "workers" or "event handlers": long-running components that can respond to specific types of event representing meaningful occurrences in our problem domain. Imagine multiple implementations of something like interface IEventHandler { Task HandlerEvent(Event @event); }.

  2. Being able to easily post or broadcast events without knowing exactly which handler will deal with them, i.e. anything that needs to post events would take a dependency on something like interface IEventDispatcher { void PostEvent(Event @event); }.

We'd need a component to coordinate all dispatched events and route them to the right event handlers. I hope that would be the sole class, or group of classes, in our codebase that would take a direct dependency on something like akka.net. For example, we could wrap each IEventHandler in an Actor, have an Actor-based implementation of IEventDispatcher, and then glue them together in the right way.
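As a rough illustration of the two interfaces and the coordinating component described above, here is a minimal in-process sketch that does not use Akka at all. It is only one possible shape, not the team's implementation. The `Event` base class, `SchemaScanRequested`, `RecordingHandler`, `InProcessEventDispatcher` and its `Register`/`Start`/`StopAsync` methods are all hypothetical names invented for the example, and it assumes a single queue drained by one background loop (`System.Threading.Channels`) is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical base type: the thread never defines Event.
public abstract class Event { }

public interface IEventHandler
{
    // Method name kept exactly as quoted in the comment above.
    Task HandlerEvent(Event @event);
}

public interface IEventDispatcher
{
    void PostEvent(Event @event);
}

// Assumption: one unbounded queue, one consumer loop, so handlers
// never run concurrently with each other.
public sealed class InProcessEventDispatcher : IEventDispatcher
{
    private readonly Channel<Event> _queue = Channel.CreateUnbounded<Event>();
    private readonly Dictionary<Type, List<IEventHandler>> _handlers =
        new Dictionary<Type, List<IEventHandler>>();
    private Task _pump;

    // Register all handlers before calling Start; registration is not thread-safe.
    public void Register<TEvent>(IEventHandler handler) where TEvent : Event
    {
        if (!_handlers.TryGetValue(typeof(TEvent), out var list))
        {
            list = new List<IEventHandler>();
            _handlers[typeof(TEvent)] = list;
        }
        list.Add(handler);
    }

    public void Start() => _pump = Task.Run(() => PumpAsync());

    // Posting never blocks; events posted after StopAsync are silently dropped.
    public void PostEvent(Event @event) => _queue.Writer.TryWrite(@event);

    // Stops accepting events and waits for the queued ones to be handled.
    public Task StopAsync()
    {
        _queue.Writer.Complete();
        return _pump ?? Task.CompletedTask;
    }

    private async Task PumpAsync()
    {
        while (await _queue.Reader.WaitToReadAsync())
        {
            while (_queue.Reader.TryRead(out var @event))
            {
                // Routing is by exact runtime type; unhandled events are ignored.
                if (!_handlers.TryGetValue(@event.GetType(), out var list)) continue;
                foreach (var handler in list)
                {
                    try { await handler.HandlerEvent(@event); }
                    catch (Exception ex) { Console.Error.WriteLine(ex); } // keep the loop alive
                }
            }
        }
    }
}

// Hypothetical event and handler, used only to exercise the dispatcher.
public sealed class SchemaScanRequested : Event
{
    public string Database { get; set; }
}

public sealed class RecordingHandler : IEventHandler
{
    public List<string> Seen { get; } = new List<string>();

    public Task HandlerEvent(Event @event)
    {
        Seen.Add(((SchemaScanRequested)@event).Database);
        return Task.CompletedTask;
    }
}
```

Code that posts events depends only on `IEventDispatcher`, so an actor-based dispatcher could later replace this class without touching the handlers.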

We could roll our own, but we'd rather avoid that effort by using a library to do most of the heavy lifting for us. It'd be more valuable to put our effort into implementing those event handlers, to do real useful things, rather than working on infrastructure.

fffej commented 5 years ago

Thanks @ChrisLambrou. I'm at risk of over-simplifying here, but, like @nyctef, I have a sense that akka.net sounds like overkill. I'm all for using technology if it helps us go faster, but I also want to make sure I ask the right questions so we don't end up with a technology we don't need.

It sounds like point 1 is a producer/consumer problem? (TPL looks like it could help here.)

The message dispatching to the right handlers sounds like it could be solved any number of ways and I can't immediately see why it would need something like Akka to solve?

nyctef commented 5 years ago

> We'd need a component to coordinate all dispatched events and route them to the right event handlers.

> The message dispatching to the right handlers sounds like it could be solved any number of ways and I can't immediately see why it would need something like Akka to solve?

IIRC we implemented a similar pattern in CompareUI in about a hundred lines of code (although I think we actually copied it from the Oracle tools at the time). It would need some tweaking to handle Tasks properly, but the basic idea is there :)

> We could roll our own, but we'd rather avoid that effort by making use of some library to do most of the heavy lifting for us. It'd be more valuable to put our effort on implementing those event handlers, to do real useful things, rather than working on infrastructure.

For the really simple case, I think building it from scratch might be roughly the same amount of work as writing the code to integrate with and abstract away Akka (or whichever library you choose).

The real value from Akka might come from some of the extensions that @idursun mentioned - if the fact that you're running Akka under the hood means you get a bunch of useful secondary functionality "for free", that might be worth the cost of entry?

ChrisLambrou commented 5 years ago

You're right. We're deliberately not buying in to the full actor model. We've described a set of interfaces that describe our minimal requirements, and we're just looking for help to implement them. If the core of akka.net helps with that, then fine. But we definitely want to avoid a deep dependency on it, or anything else, if we can get away with it.

ChrisHurley commented 5 years ago

Not sure if this is adding much to the discussion, but I can see some value in exploring the area. SQL Clone has long-running processes (using Tasks and our own handling of them, including e.g. this) and has a couple of different event messaging systems (using SignalR this way to our agent and MemBus internally).

cjheppell commented 5 years ago

> Data Catalog needs to be able to maintain an updated schema of users' databases, by scheduled scanning of users' live databases. The decision was made that, this will be achieved in an event driven approach.

I wonder if you might have jumped the gun a bit here. Given that this is a new product we're developing and we're trying to achieve product-market fit, I'd urge caution on striving for a "perfect" solution which allows you to build concurrent and fully distributed systems. That seems like a problem for later ™. Right now, you want to demonstrate the solution you build solves the problem you've identified. After you've done that, you can iterate on your existing solution and improve it.

I'd recommend going for the MVP here. You say you want "scheduled scanning of users' live databases". What's wrong with writing a program which runs on a schedule to scan the database and send it back to whichever endpoint requires that data? IMO, requiring things to run on a schedule absolutely does not imply I need an actor-model implementing framework that helps me to build distributed, concurrent applications.
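To make the MVP suggestion concrete, here is a minimal sketch of "a program which runs on a schedule", using nothing beyond `Task.Delay`. The `ScheduledScanner` class and the `scanAsync` callback are hypothetical names; the callback stands in for whatever scans the database and sends the result to the endpoint.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ScheduledScanner
{
    // scanAsync is a hypothetical callback: "scan the database and send the
    // result to whichever endpoint requires it".
    public static async Task RunAsync(
        Func<CancellationToken, Task> scanAsync,
        TimeSpan interval,
        CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            try
            {
                await scanAsync(cancellationToken);
            }
            catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
            {
                break; // shutdown requested in the middle of a scan
            }
            catch (Exception ex)
            {
                // Assumption: a failed scan is logged and retried at the next interval.
                Console.Error.WriteLine($"Scan failed: {ex.Message}");
            }

            try
            {
                // The interval is measured from the end of one scan to the start of the next.
                await Task.Delay(interval, cancellationToken);
            }
            catch (OperationCanceledException)
            {
                break; // shutdown requested while waiting
            }
        }
    }
}
```

A host process would call `ScheduledScanner.RunAsync(...)` once at startup and cancel the token on shutdown.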

I'm not saying you won't necessarily need it in the future, but based on your description it seems like this is overkill right now.

Also, to piggyback on what @fffej and @nyctef have already said, the TPL can do a lot of the heavy lifting for you here. When we built DPS, we used the TPL Dataflow components to handle our streamed (parallel!) extraction and uploading of a user's database to their Azure data warehouse. This was straightforward to put behind a sensible interface, and quick to implement without having to understand a new paradigm.
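For readers who haven't used TPL Dataflow, a two-stage extract-and-upload pipeline might look roughly like the sketch below. It is not the DPS code. `ExtractUploadPipeline` and the `extract` and `upload` delegates are hypothetical placeholders, and depending on the target framework the `System.Threading.Tasks.Dataflow` NuGet package may need to be referenced.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ExtractUploadPipeline
{
    // extract and upload are hypothetical stages; parallelism must be at least 1.
    public static async Task RunAsync(
        IEnumerable<string> tableNames,
        Func<string, Task<string>> extract,
        Func<string, Task> upload,
        int parallelism)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = parallelism,
            // Bounded buffers give back-pressure, so extraction cannot race far ahead of upload.
            BoundedCapacity = parallelism * 2
        };

        var extractBlock = new TransformBlock<string, string>(extract, options);
        var uploadBlock = new ActionBlock<string>(upload, options);

        // Completion (and faults) of the first stage flow through to the second.
        extractBlock.LinkTo(uploadBlock, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var table in tableNames)
        {
            // SendAsync waits while the bounded input buffer is full.
            await extractBlock.SendAsync(table);
        }

        extractBlock.Complete();
        await uploadBlock.Completion;
    }
}
```

The caller sees only `RunAsync`, so the pipeline sits behind an ordinary method or interface and the Dataflow types do not leak into the rest of the codebase.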

I think there's a time and a place to validate that something like Akka could help streamline our development. But I'm not convinced this is it.

ChrisLambrou commented 5 years ago

Okay, I'm going to close this issue. It's clear that Akka.net doesn't need to go into the tech radar under Explore. I'm not wholly convinced we actually need to use it, and even if we do take a dependency on it, it won't be in a way that means we'd be recommending or advocating its exploration or adoption here. It also wouldn't be an architectural decision for the SQL Data Catalog (i.e. it wouldn't be costly to change our minds and ditch it in favour of something else).

ChrisLambrou commented 5 years ago

> Data Catalog needs to be able to maintain an updated schema of users' databases, by scheduled scanning of users' live databases. The decision was made that, this will be achieved in an event driven approach.

I should perhaps also clarify that our decision to take an event-driven approach was not driven by this particular need for scheduling. There's a bigger picture beyond the scope of this GitHub issue, and the scheduling of work is just something that comes relatively cheaply given an event-based architecture. We were really just contemplating whether Akka.net could be useful in implementing a small component of SQL Data Catalog. For us, the jury is still out, but unless and until we've had a non-trivial play with Akka.net, it's probably not worth advocating that others should also explore its potential.

chenchen1987 commented 5 years ago

Discussed with Jeff. The actor model is not appropriate for our purposes, given the stage we are at with the Data Catalog. Therefore Akka.net will not be explored by the Spiders team.