jamesmh / coravel

Near-zero config .NET library that makes advanced application features like Task Scheduling, Caching, Queuing, Event Broadcasting, and more a breeze!
https://docs.coravel.net/Installation/
MIT License
3.73k stars, 244 forks

Questions: Multiple instances #101

Open jacodv opened 5 years ago

jacodv commented 5 years ago

Global scheduled tasks

This might be a broader question than just using Coravel as a task scheduler. How would you schedule tasks that need to run once for the whole application when the service is actually running on more than one node in a load-balanced environment?

Possible solution

When the first node gets the Invoke, it sets a record in the database stating that the task was started. Subsequent invocations of the same daily job on other nodes then stop after determining that another process is handling the task.

Potential issue

jamesmh commented 5 years ago

This is a feature I have planned to add - a distributed scheduler.

Another option right now is, if you know how many instances you have (how many nodes), then configure an env var for each one and put an if statement inside your scheduling logic. Only run certain tasks on certain nodes (on one node each).
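
The env var approach can be sketched roughly as below. This is a minimal illustration rather than anything Coravel provides: the `NodeGate` class and the `SCHEDULER_NODE_ID` variable name are hypothetical, and each node is assumed to be deployed with a different value for that variable.

```csharp
using System;

// Hypothetical helper (not part of Coravel): decides whether this node is the
// one designated to run a given task.
public static class NodeGate
{
    // Assumed variable name; set it to a different value on each node.
    public const string NodeIdVariable = "SCHEDULER_NODE_ID";

    // True when currentNode names the designated node. An unset or blank
    // value never matches, so a misconfigured node runs nothing.
    public static bool ShouldRunOn(string designatedNode, string currentNode) =>
        !string.IsNullOrWhiteSpace(currentNode) &&
        string.Equals(designatedNode, currentNode.Trim(), StringComparison.OrdinalIgnoreCase);

    // Convenience overload that reads the current node id from the environment.
    public static bool ShouldRunHere(string designatedNode) =>
        ShouldRunOn(designatedNode, Environment.GetEnvironmentVariable(NodeIdVariable));
}
```

Inside the scheduling logic, a task would then be registered only when something like `NodeGate.ShouldRunHere("node-1")` returns true. Note that if the designated node is down, the task simply does not run anywhere.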

Another, if possible, is to create a background service (console app) for all your schedules.

But yes, this would be a great feature that I've been hoping to build 👍

jacodv commented 5 years ago

@jamesmh: Wow, great to get such a fast response. I did some more research, and it seems that a database or caching mechanism is the best option. I am going to insert a "Started" document into MongoDB with a key that is consistent across all instances. Only one instance will be able to insert the document (and that instance will do the work); the rest will fail with a duplicate key error.

Going by your documentation, I suggest we add a CanExecute(<Action>) that lets the implementation return a boolean indicating whether this instance can start the process (task).

We do many things that your library makes easier, like queues (RabbitMQ), caching (MongoDB), etc. I will keep this project in mind.
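
To make the "first insert wins" idea concrete, here is a small sketch that uses an in-memory dictionary as a stand-in for the MongoDB collection. `StartedTaskStore` and `DailyKey` are hypothetical names, and the key format (task name plus UTC date) is an assumption about what "consistent across all instances" could look like for a daily job. A real setup would rely on the database's unique key instead of `TryAdd`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Globalization;

// In-memory stand-in for a collection with a unique key (hypothetical class).
public sealed class StartedTaskStore
{
    private readonly ConcurrentDictionary<string, string> _started =
        new ConcurrentDictionary<string, string>();

    // Returns true only for the first caller to claim the key, mirroring an
    // insert that succeeds once and then fails with a duplicate key error.
    public bool TryClaim(string key, string owner) => _started.TryAdd(key, owner);

    // Every instance must compute the same key for the same run, so the key
    // is built only from the task name and the UTC date.
    public static string DailyKey(string taskName, DateTime utcNow) =>
        taskName + ":" + utcNow.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}
```

The instance whose `TryClaim` call returns true does the work, and every other instance skips that run.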

jamesmh commented 5 years ago

Nice. I was thinking of taking the approach that Laravel takes (which I've already pre-architected in Coravel's code). Right now, for example, calling PreventOverlapping internally uses a shared mutex (behind an interface) to make these checks.

For a distributed scheduler, I need to create some distributed mutexes (one per concrete DB provider). The difficulty will be in maintaining all the various providers...so my struggle is deciding which of the most commonly used technologies to start with (SQL Server, Postgres, Mongo, Redis?).

My first hunch is that Postgres would be, overall, the best first step. The mechanics would be very close to what you had mentioned - using a table lock pattern where each row represents a lock for a specific task/job.
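
As a rough illustration of the row-per-lock idea, the sketch below models the lock table in memory, with one entry per task holding the lock's expiry time. The `ITaskLock` interface and `InMemoryTaskLock` class are hypothetical names made up for this example and are not Coravel's actual mutex interface. A real provider would do the check-and-write atomically in the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface, named for this sketch only.
public interface ITaskLock
{
    bool TryAcquire(string taskKey, TimeSpan timeout, DateTime utcNow);
    void Release(string taskKey);
}

// In-memory stand-in for a lock table: one row per task, storing its expiry.
public sealed class InMemoryTaskLock : ITaskLock
{
    private readonly object _sync = new object();
    private readonly Dictionary<string, DateTime> _rows = new Dictionary<string, DateTime>();

    public bool TryAcquire(string taskKey, TimeSpan timeout, DateTime utcNow)
    {
        lock (_sync)
        {
            // A row that has not expired means some instance still holds the lock.
            if (_rows.TryGetValue(taskKey, out var expiresAt) && expiresAt > utcNow)
                return false;

            // No row, or an expired one (for example after a crash): take the lock.
            _rows[taskKey] = utcNow + timeout;
            return true;
        }
    }

    public void Release(string taskKey)
    {
        lock (_sync) { _rows.Remove(taskKey); }
    }
}
```

The timeout is what lets another instance take over when the lock holder dies without releasing. Passing the current time in as a parameter is a choice made here to keep the sketch easy to test.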

My second hunch is to start with Redis since it's a pretty "neutral" ground and a fairly easy tech to get up and running from the developer's perspective. If someone needs a distributed scheduler then chances are they already recognize that there's a commitment to a little extra infrastructure that will be needed.

jacodv commented 5 years ago

Sounds like a solid approach! 👍

jacodv commented 5 years ago

@jamesmh:

Managed to fix the config issue I had. Below is the working setup of the scheduled tasks:

/// Setup coravel scheduling
public static class ScheduleSetup
  {
    private static readonly ILog _log = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);

    public static void SetupScheduledTasks(this IServiceCollection services)
    {
      services.AddScheduler();
    }

    public static void ConfigureScheduledTasks(this IApplicationBuilder app, ScheduleSettings settings, IServiceProvider serviceProvider)
    {
      var provider = app.ApplicationServices;

      settings.Schedules.ForEach(schedule =>
      {
        _log.Debug($"{schedule.TaskName} IsEnabled:{schedule.IsEnabled}");
        if (!schedule.IsEnabled)
          return;

        _log.Debug(schedule.Dump());

        var scheduledTask = new ScheduledTask(
          schedule.TaskName, 
          schedule.Type,
          IpHelper.GetLocalIPAddress());

        if(!(IocContainerConsole.Instance.Resolve(Type.GetType(schedule.IiabTaskType)) is IIiabTask iiabTask))
          throw new InvalidOperationException($"Failed to resolve {schedule.IiabTaskType} as IIiabTask");

        provider
          .UseScheduler(scheduler =>
          {
            scheduler
              .ScheduleAsync(
                async () =>
                {
                  _log.Debug($"Starting: {schedule.TaskName}|{schedule.IiabTaskType}|{schedule.Cron}");
                  var scheduleTaskManager = IocContainerConsole.Instance.Resolve<IScheduledTaskManager>();
                  if (scheduleTaskManager == null)
                    throw new InvalidOperationException("Failed to resolve the IScheduledTaskManager");

                  try
                  {
                    await scheduleTaskManager.SaveScheduledTask(scheduledTask);
                    // If success then this task may execute, else another machine is executing the task
                    await iiabTask.Execute();
                  }
                  catch (AggregateException agEx)
                  {
                    if (!agEx.ToDetailedFirstExceptionOfExceptionString().Contains("same key"))
                      throw agEx.ToFirstExceptionOfException();

                    _log.Info($"{scheduledTask.Id} is executing elsewhere");
                  }
                  catch (Exception ex)
                  {
                    if (!ex.Message.Contains("same key"))
                      throw;

                    _log.Info($"{scheduledTask.Id} is executing elsewhere");
                  }

                })
              .Cron(schedule.Cron)
              .When(iiabTask.CanExecute);
          })
          .OnError(taskEx => _log.Error($"Task: {scheduledTask.Id} failed to complete: {taskEx.Message}", taskEx))
          .LogScheduledTaskProgress(serviceProvider.GetService<ILogger<IScheduler>>());
      });
    }
  }


public class ScheduleSettings
  {
    public ScheduleSettings()
    {
      Schedules = new List<Schedule>();
    }
    public List<Schedule> Schedules { get; set; }
  }

  public class Schedule
  {
    public bool IsEnabled { get; set; }
    public string TaskName { get; set; }
    public ScheduleType Type { get; set; }
    public string IiabTaskType { get; set; }
    public string Cron { get; set; }
  }

jamesmh commented 5 years ago

Awesome. Haven't had a chance to check it out. Glad you nailed it down!

caizhiyuan commented 4 years ago

When will this feature be available?

jamesmh commented 4 years ago

Probably not anytime soon TBH 😥 I'd like to but not a priority among life's various challenges right now 😉

Memhave commented 4 years ago

I'm also interested in this. Does anybody know how it handles itself on a cluster like Kubernetes with multiple replicas?

jamesmh commented 4 years ago

Each instance will execute the schedules. So if 5 instances exist (for example), then each job will be executed 5 times in parallel, once per instance.

iwt-iradosevic commented 1 year ago

Any news on this?

sirdawidd commented 1 year ago

It would be nice to have such a feature, especially now that we are using Kubernetes with multiple pods.

jamesmh commented 11 months ago

FYI there's some guidance on this scenario that you can read here -> https://github.com/jamesmh/coravel#does-coravel-support-distributed-locking.

johnwc commented 5 months ago

Any ETA on when this feature can be introduced? It is what is keeping us from moving from Hangfire to Coravel.