dotnet-labs / ServiceWorkerCronJob

Schedule Cron Jobs using HostedService in ASP.NET Core
MIT License
265 stars 71 forks source link

CronJobService service change proposals #17

Open Manthazar opened 6 days ago

Manthazar commented 6 days ago

Hi there,

I cannot open a new PR for security reasons, but I had taken gratefully this code base and combined it with previous lessons learned. You may find them useful.

I have addressed the following points:

  1. The service CronJobService was blocking the hosting pipeline. Based on this MS article, the cron scheduler has been put into an executing task.
  2. The service is using recursive calls to schedule and wait for next. This can result in StackOverflowExcetions especially in quickly timed, long living servers. In such case, the application just shuts down without error (i.e. potentially difficult to find, 'it just stopped') This has been replaced with SemaphoreSlim.
  3. The DoWork method was not handling exceptions. In case an exception occurs, the entire scheduler would fall down. This is typically not what we want as in the error could be easily intermittent (i.e. network failure)
  4. CancellationTokens do have a method ThrowIfCancellationRequested which throws OperationCanceledException representing a stop signal. Added a catch here to not log it as error.
  5. I think, Stop is not correctly implemented because all it does is to stop the timer. However, the Scheduler code is recursively replacing the timer as long as GetNextReoccurrence has a value. Also, the service completes Stop while it is actually still executing user code. Added a CancellationTokenSource to signal (user) code to stop.
  6. Flavoured: Added the logger to the base layer to ensure critical logs appear in the same way; the logs can be searched by a prefix (i.e. one can prepare filters easily).
public abstract class CronJobService {
private readonly SemaphoreSlim schedulerCycleEvent = new SemaphoreSlim(0);
private readonly CronExpression expression;
private readonly TimeZoneInfo timeZoneInfo;
private readonly ILogger logger;

protected readonly string LoggerName;

private CancellationTokenSource? localCancelationSource;
private Task? executingTask;
private Timer? timer;

public CronJob(string cronExpression, TimeZoneInfo timeZoneInfo, ILogger logger)
{
    this.expression = CronExpression.Parse(cronExpression);
    this.timeZoneInfo = timeZoneInfo;
    this.logger = logger;
    this.LoggerName = $"Cron job {GetType().Name} ";
}

public virtual Task StartAsync(CancellationToken cancellationToken)
{
    logger.LogInformation($"{LoggerName}: started with expression {expression}.");

    localCancelationSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
    executingTask = ScheduleJob(localCancelationSource.Token);

    if (executingTask.IsCompleted)
    {
        return executingTask;
    }
    else
    {
        return Task.CompletedTask;
    }
}

protected async Task ScheduleJob(CancellationToken cancellationToken)
{
    try
    {
        while (cancellationToken.IsCancellationRequested == false)
        {
            var next = expression.GetNextOccurrence(DateTimeOffset.Now, timeZoneInfo);

            if (next.HasValue)
            {
                logger.LogInformation($"{LoggerName}: scheduled next run at {next.ToString()}");

                var delay = next.Value - DateTimeOffset.Now;
                if (delay.TotalMilliseconds <= 0)   // non-positive values somewhat indicate that between get next occurence and he we crossed the line.
                {
                    logger.LogWarning($"{LoggerName}: scheduled next run is in the past. Moving to next.");
                    continue;
                }

                timer = new Timer(delay.TotalMilliseconds);
                timer.Elapsed += async (_, _) =>
                {
                    try
                    {
                        timer.Dispose();  // this timer has served well, the next cycle wil use a new one.
                        timer = null;
                        if (cancellationToken.IsCancellationRequested == false)
                        {
                            await DoWork(cancellationToken);
                        }
                    }
                    catch (OperationCanceledException)
                    {
                        logger.LogInformation($"{LoggerName}: job received cancellation signal, stopping.");
                    }
                    catch (Exception e)
                    {
                        logger.LogError(e, $"{LoggerName}: an error happened during execution of the job: {e.Message}");
                    }
                    finally
                    {
                        // We let the outer loop now that the timer has been processed and the next occurrence can be calculated.
                        schedulerCycleEvent.Release();
                    }
                };

                timer.Start();
                await schedulerCycleEvent.WaitAsync(); // Wait nicely for any timer result.
            }
            else
            {
                logger.LogWarning($"{LoggerName}: next runtime returned no value. Something is fishy in the scheduler or the cron expression parser");
            }
        }
    }
    catch (OperationCanceledException)
    {
        logger.LogInformation($"{LoggerName}: job received cancellation signal, stopping.");
    }

    await Task.CompletedTask;
}

public abstract Task DoWork(CancellationToken cancellationToken);

public virtual async Task StopAsync(CancellationToken cancellationToken)
{
    logger.LogInformation($"{LoggerName}: stopping.");
    timer?.Stop();
    timer?.Dispose();

    if (localCancelationToken is not null)
    {
        await localCancelationToken.CancelAsync();
    }

    logger.LogInformation($"{LoggerName}: stopped.");

    await Task.CompletedTask;
}

public virtual void Dispose()
{
    timer?.Dispose();
    executingTask?.Dispose();
    schedulerCycleEvent?.Dispose();
    localCancelationToken?.Dispose();

    GC.SuppressFinalize(this);
}
}
changhuixu commented 5 days ago

Hi @Manthazar I've updated the code according to your suggestions. Could you take a look? Thank you.