bastion-rs / bastion

Highly-available Distributed Fault-tolerant Runtime
https://www.bastion-rs.com
Apache License 2.0

Usage of all possible resources for actors #155

Open Relrin opened 4 years ago

Relrin commented 4 years ago

Is your feature request related to a problem? Please describe.

I would like bastion to use all available machine resources for spawning and handling actors by default, instead of requiring a fixed number of actors to be set on the supervisor before startup.

This feature can be considered an improvement for bastion, because I see it as a good potential alternative to the Akka framework from the Scala ecosystem. It would make it possible to handle any number of requests/tasks at runtime and to scale with ease.

Describe the solution you'd like

At the moment bastion (as far as I understand) forces a developer who wants more than one actor to define the number of actors for the supervisor before it starts.

As a potential improvement, it would be great to be able to specify a "spawn strategy" for the defined actors: either based on some limit (the default behaviour) or spawning at runtime as needed.

For example, we can define an enum with the strategies:

enum ActorSpawnStrategy {
    Limit(usize), // Default, used by the supervisor with the `.with_redundancy` calls
    OnDemand,     // Create workers on demand at runtime
}

// Some implementation for ActorSpawnStrategy, the Default trait for the enum, etc.
// ...
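The elided implementation could look roughly like the sketch below. It is only an illustration of the proposal: `ActorSpawnStrategy` and the helper `initial_actors` are hypothetical names that do not exist in bastion, and the default of `Limit(1)` is an assumption that mirrors a single actor per children group.

```rust
/// How a children group decides how many actors to run.
/// Hypothetical type from this proposal; it is not part of bastion's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActorSpawnStrategy {
    /// Keep a fixed number of actors alive.
    Limit(usize),
    /// Create workers on demand at runtime.
    OnDemand,
}

impl Default for ActorSpawnStrategy {
    /// Assumption: one actor by default, like a redundancy of 1.
    fn default() -> Self {
        ActorSpawnStrategy::Limit(1)
    }
}

impl ActorSpawnStrategy {
    /// Number of actors to start eagerly, before any message arrives.
    /// Hypothetical helper, added for illustration only.
    fn initial_actors(&self) -> usize {
        match self {
            ActorSpawnStrategy::Limit(n) => *n,
            ActorSpawnStrategy::OnDemand => 0,
        }
    }
}

fn main() {
    let default = ActorSpawnStrategy::default();
    let on_demand = ActorSpawnStrategy::OnDemand;
    println!("{:?} starts {} actor(s)", default, default.initial_actors());
    println!("{:?} starts {} actor(s)", on_demand, on_demand.initial_actors());
}
```

With this shape, the existing `.with_redundancy(n)` behaviour maps onto `Limit(n)`, so the default would not change for current users.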

After adding support for this feature, we can define the bastion configuration as:

Bastion::children(|children| {
    children.with_spawn_strategy(ActorSpawnStrategy::OnDemand)
}).expect("Couldn't create the children group.");

This way, at the beginning we have only the supervisor. Once a request arrives, the supervisor spawns an appropriate actor at runtime, which does its work and is then gracefully stopped.

Describe alternatives you've considered

With the current state of the code, the only way I see to handle this case is to calculate the number of actors to spawn in code, based on the size of a bastion actor, and pass the result to the constructor, something like this:

use bastion::prelude::*;

fn main() {
    Bastion::init();

    let amount = 1; // calculate the limit here, based on the amount of available RAM

    Bastion::children(|children| {
        children.with_redundancy(amount)
    }).expect("Couldn't create the children group.");

    Bastion::start();
    Bastion::stop();
    Bastion::block_until_stopped();
}

However, this approach is not very flexible, because it always spawns N workers based on the calculated value.
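As a rough illustration of that calculation step, here is a minimal sketch. The standard library has no portable way to query available RAM, so the sketch uses the CPU count from `std::thread::available_parallelism` as a stand-in. The function name `redundancy_from_cpus` and the cap of 64 are made up for the example.

```rust
use std::thread;

/// Picks a worker count from the number of CPUs the process may use,
/// clamped to the range 1..=max_workers. Falls back to 1 if the
/// parallelism query fails. Hypothetical helper, not part of bastion.
fn redundancy_from_cpus(max_workers: usize) -> usize {
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // `max_workers.max(1)` keeps the clamp range valid even for a cap of 0.
    cpus.clamp(1, max_workers.max(1))
}

fn main() {
    let amount = redundancy_from_cpus(64);
    println!("would pass {} to with_redundancy", amount);
}
```

The value is still computed once at startup, so this keeps the limitation described above: the group does not grow or shrink afterwards.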

o0Ignition0o commented 4 years ago

Hey there :)

I really like the approach, and there's a couple of ways we could check for idle children and spawn / kill them accordingly.

I think it would make a lot of sense. OTOH I wonder how long spawning a child would take (depending on the child's lifecycle, it might add some latency and introduce non-determinism in the time it takes to complete a task).

I don't know the bastion internals well enough for now, but it's probably something that can be explored.

vertexclique commented 4 years ago

This is exactly the case (autoscaling workers) mentioned by @rtyler. Please correct me if I am wrong?

The simplest case would be creating a Resizer like https://doc.akka.io/docs/akka/current/routing.html#dynamically-resizable-pool and reusing the existing code to configure the resizer. For an initial set of algorithms for scaling up and down, it would be nice to take a look at the Akka docs.

Having a separate construct that spawns with a redundancy of 1 and encapsulates the resizer algorithms to dynamically spawn and stop actors would be really nice.
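To make the resizer idea concrete, here is a simplified sketch of a resize decision. The field names are loosely modelled on the bounds and rates that Akka's resizable pool exposes, but the logic is a deliberately reduced illustration, not Akka's algorithm and not existing bastion code. `Resizer` and `resize` are hypothetical names.

```rust
/// Bounds and rates for a simplified pool resizer (hypothetical type).
/// `lower_bound` must not exceed `upper_bound`, otherwise `clamp` panics.
struct Resizer {
    lower_bound: usize,
    upper_bound: usize,
    /// Fraction of the current size to add when every worker is busy.
    rampup_rate: f64,
    /// Shrink when the busy fraction drops below this value.
    backoff_threshold: f64,
    /// Fraction of the current size to remove when backing off.
    backoff_rate: f64,
}

impl Resizer {
    /// Returns the pool size to move to, given how many workers are busy.
    fn resize(&self, current: usize, busy: usize) -> usize {
        let current_f = current as f64;
        let proposed = if current == 0 || busy >= current {
            // Under pressure: grow by at least one worker.
            let step = (current_f * self.rampup_rate).ceil() as usize;
            current + step.max(1)
        } else if (busy as f64) / current_f < self.backoff_threshold {
            // Mostly idle: shrink by a fraction of the current size.
            let step = (current_f * self.backoff_rate).floor() as usize;
            current.saturating_sub(step)
        } else {
            current
        };
        proposed.clamp(self.lower_bound, self.upper_bound)
    }
}

fn main() {
    let resizer = Resizer {
        lower_bound: 1,
        upper_bound: 10,
        rampup_rate: 0.25,
        backoff_threshold: 0.25,
        backoff_rate: 0.5,
    };
    for (current, busy) in [(4, 4), (8, 1), (4, 2)] {
        println!("current={} busy={} -> {}", current, busy, resizer.resize(current, busy));
    }
}
```

Starting such a construct at `lower_bound: 1` matches the "spawns with 1 as redundancy" idea, with the resizer deciding when to go beyond that.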

Relrin commented 4 years ago

The simplest case would be creating a Resizer like https://doc.akka.io/docs/akka/current/routing.html#dynamically-resizable-pool and reusing the existing code to configure the resizer. For an initial set of algorithms for scaling up and down, it would be nice to take a look at the Akka docs.

This is also a good strategy for handling incoming traffic. I think it could be considered part of this feature request.

vertexclique commented 4 years ago

@Relrin yes, that's the intention. We will put this on the roadmap.

Relrin commented 4 years ago

I think it would make a lot of sense. OTOH I wonder how long spawning a child would take (depending on the child's lifecycle, it might add some latency and introduce non-determinism in the time it takes to complete a task).

One way to handle this is to define timeout / time-to-live durations, so that workers that take too long are killed. This can happen when actors spend too much time processing something (HTTP requests, long-running tasks), which may be a sign that the code needs to be reorganized (pipeline the processing?).
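A minimal sketch of the time-to-live check, using only the standard library. `WorkerInfo` and `expired` are hypothetical names for illustration. How a worker would actually be stopped is left out, since that depends on bastion internals.

```rust
use std::time::{Duration, Instant};

/// Bookkeeping for one worker. Hypothetical type, not part of bastion.
struct WorkerInfo {
    id: u32,
    started_at: Instant,
}

/// Returns the ids of workers that have been running longer than `ttl`
/// as of `now`. The caller would then stop (or restart) those workers.
fn expired(workers: &[WorkerInfo], ttl: Duration, now: Instant) -> Vec<u32> {
    workers
        .iter()
        .filter(|w| now.saturating_duration_since(w.started_at) > ttl)
        .map(|w| w.id)
        .collect()
}

fn main() {
    let t0 = Instant::now();
    let workers = vec![
        WorkerInfo { id: 1, started_at: t0 },
        WorkerInfo { id: 2, started_at: t0 + Duration::from_secs(8) },
    ];
    // Pretend ten seconds have passed since the first worker started.
    let now = t0 + Duration::from_secs(10);
    let to_stop = expired(&workers, Duration::from_secs(5), now);
    println!("workers over their TTL: {:?}", to_stop);
}
```

Passing `now` in as a parameter keeps the check deterministic, which makes it easy to test without sleeping.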

Relrin commented 4 years ago

I have a couple of questions related to the following piece of code:

  1. Does it mean that Children is actually a kind of sub-supervisor that runs N actors at any moment in time? Correct me if I have misunderstood the idea.
  2. Is it correct that this call will reset the entire group of actors if something goes wrong (e.g. when only one actor of the group fails)?
  3. What about implementing a replacement for it, so that we could have a struct which implements the ActorSpawnStrategy trait:

    trait ActorSpawnStrategy {
        fn spawn(&self);
    }

    pub struct Children {
        bcast: Broadcast,
        init: Init,
        // Drop this field and let the type implementing `ActorSpawnStrategy` hold any data it needs:
        // redundancy: usize,
        // Either a default strategy or a user-defined one.
        spawn_strategy: Box<dyn ActorSpawnStrategy>,
        // ...
        started: bool,
    }

Once this is implemented, we can create a periodic task (std::task vs async_std::task?) that performs some checks and adds/removes actors at runtime.
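The check that such a periodic task would run could be sketched as below. This is a variant of the trait above: instead of `spawn(&self)` it exposes a hypothetical `desired` method, so the decision can be shown without bastion's internals. `Fixed`, `OnDemand`, `Action` and `reconcile` are all names made up for the example, and the loop in `main` stands in for the periodic ticks.

```rust
/// Hypothetical variant of the trait from this thread: it only decides
/// how many actors should be running right now.
trait ActorSpawnStrategy {
    fn desired(&self, running: usize, pending: usize) -> usize;
}

/// Always keeps the same number of actors (like `with_redundancy`).
struct Fixed(usize);

impl ActorSpawnStrategy for Fixed {
    fn desired(&self, _running: usize, _pending: usize) -> usize {
        self.0
    }
}

/// One actor per pending message, up to `max`.
struct OnDemand {
    max: usize,
}

impl ActorSpawnStrategy for OnDemand {
    fn desired(&self, _running: usize, pending: usize) -> usize {
        pending.min(self.max)
    }
}

/// What the periodic task should do on this tick.
#[derive(Debug, PartialEq)]
enum Action {
    Spawn(usize),
    Stop(usize),
    Nothing,
}

/// Compares the strategy's target with the running count.
fn reconcile(strategy: &dyn ActorSpawnStrategy, running: usize, pending: usize) -> Action {
    let desired = strategy.desired(running, pending);
    if desired > running {
        Action::Spawn(desired - running)
    } else if desired < running {
        Action::Stop(running - desired)
    } else {
        Action::Nothing
    }
}

fn main() {
    let strategy: Box<dyn ActorSpawnStrategy> = Box::new(OnDemand { max: 8 });
    let mut running = 0;
    // A real task would wake up periodically; here the queue lengths are hard-coded.
    for pending in [3, 10, 0] {
        let action = reconcile(strategy.as_ref(), running, pending);
        match action {
            Action::Spawn(n) => running += n,
            Action::Stop(n) => running -= n,
            Action::Nothing => {}
        }
        println!("pending={} -> {:?}, running={}", pending, action, running);
    }
}
```

Keeping the decision separate from the spawning itself means the same loop works with any strategy stored in `spawn_strategy: Box<dyn ActorSpawnStrategy>`.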

r3v2d0g commented 4 years ago

Hey @Relrin :wave:!

I'll answer your first two questions since I authored this piece of code :).

  1. You understood correctly. Children is indeed running N = self.redundancy actors (or Child, in this case) at all times.
  2. This is also true and there is (was? ;)) similar code in Supervisor.

Cheers!