ormushq / ormus

Ormus - Customer Data Platform
MIT License
28 stars 19 forks source link

destination: Bug: Task Locking Should Occur Before Status Check to Avoid Race Conditions. #103

Closed Ali-Farhadnia closed 3 months ago

Ali-Farhadnia commented 3 months ago

There is a potential race condition in the HandleTask function where the task status is checked before the task is locked. This can lead to situations where the task status might be altered by another process between the status check and the lock acquisition. Steps to Reproduce

Code Changes Required

Move the task locking section before the status check and ensure proper unlocking. Here is the specific section that needs to be changed:

Original Section:

// Get task status using idempotency in the task service.
if taskStatus, err = s.GetTaskStatusByID(ctx, taskID); err != nil {
    span.AddEvent("error-on-get-task-status", trace.WithAttributes(
        attribute.String("error", err.Error())))

    // todo use richError
    return err
}

// If task status is not executable, we don't need to do anything.
if !taskStatus.CanBeExecuted() {
    slog.Debug(fmt.Sprintf("Task [%s] has %s status and is not executable", taskID, taskStatus.String()))
    span.AddEvent("task-can`t-be-executed")

    return nil
}

unlock, err := s.LockTaskByID(ctx, taskID)
if err != nil {
    span.AddEvent("error-on-lock-task", trace.WithAttributes(
        attribute.String("error", err.Error())))

    return err
}
defer func() {
    err = unlock()
    if err != nil {
        logger.L().Error(fmt.Sprintf("unlock task failed %s", err))
    }
}()

Updated Section:

unlock, err := s.LockTaskByID(ctx, taskID)
if err != nil {
    span.AddEvent("error-on-lock-task", trace.WithAttributes(
        attribute.String("error", err.Error())))

    return err
}
defer func() {
    err = unlock()
    if err != nil {
        logger.L().Error(fmt.Sprintf("unlock task failed %s", err))
    }
}()

// Get task status using idempotency in the task service.
if taskStatus, err = s.GetTaskStatusByID(ctx, taskID); err != nil {
    span.AddEvent("error-on-get-task-status", trace.WithAttributes(
        attribute.String("error", err.Error())))

    // todo use richError
    return err
}

// If task status is not executable, we don't need to do anything.
if !taskStatus.CanBeExecuted() {
    slog.Debug(fmt.Sprintf("Task [%s] has %s status and is not executable", taskID, taskStatus.String()))
    span.AddEvent("task-can`t-be-executed")

    return nil
}

Another Improvment:

Handling panic in the defer function and calling unlock function:

defer func() {
    if r := recover(); r != nil {
        unlockErr := unlock()
        if unlockErr != nil {
            logger.L().Error(fmt.Sprintf("unlock task failed %s", unlockErr))
        }
        panic(r) // Re-throw the panic
    }
    unlockErr := unlock()
    if unlockErr != nil {
        logger.L().Error(fmt.Sprintf("unlock task failed %s", unlockErr))
    }
}()

}()