hibiken / asynq

Simple, reliable, and efficient distributed task queue in Go
MIT License
9.59k stars 693 forks source link

[FEATURE REQUEST] maxArchiveSize and archivedExpirationInDays should be configurable at the server level #441

Open ghstahl opened 2 years ago

ghstahl commented 2 years ago

Configurable MaxArchiveSize and ArchivedExpirationInDays

There are some tasks I don't care if they eventually fail and would prefer, they just get thrown away.

In my application I spin up multiple asynq servers (hosted in a go routines). Some I want to have failed tasks archived, other servers I don't care. It's a house keeping thing to not write stuff to REDIS when nobody cares about them.

I would like to configure maxArchiveSize and archivedExpirationInDays in the config

Something like this

type Config struct {
  MaxArchiveSize  int
  ArchivedExpirationInDays int
}

When I don't care about the tasks, MaxArchiveSize and ArchivedExpirationInDays would be ZERO.

Alternately

Via @hibiken If you are interested in querying all the archived tasks, and delete them programmatically, you can use Inspector.ListArchivedTasks and Inspector.DeleteTask :)

For this I would probably use the asynq scheduled tasks to wipe out those tasks in archives I don't care about.

hibiken commented 2 years ago

@ghstahl Thank you for opening this issue.

I think it makes sense to allow customizing those parameters via Config.

But does it make more sense for your use case to return a sentinel error from the Handler to tell Asynq not to store the failed task in archive? We can similar one called SkipRetry which skips any remaining retries and sends the task directly to archive. Maybe we could introduce another sentinel error to indicate Asynq to drop the task?

ghstahl commented 2 years ago

The handlers in the design have no knowledge of the context in which they are used. For a particular server, returning errors is just fine and it's fine to flow into the archive. For another server, with the handlers behaving the same way, we want the archive excluded.

The issue that popped out to me going down the sentinel error path is that normal errors emitted from the handler are fine as long as we are in the Retry window. For the handler to return the "Don't archive me" error is to know that this is the "last" retry and also the business context of which they were called.

I can see the sentinel error technique being used when handlers are designed to have way more knowledge than I give them and needing to know from asynq that we are at the end.

roccoblues commented 2 years ago

Hi @hibiken,

as mentioned here I started making those parameters configurable.

However I quickly realized that it's not so easy because there can be at least 3 triggers of the archiving:

  1. The asynq.processor in the asynq.Server. This is the easy one, here we have the config and can initialize the rdb.RDB with the correct values.
  2. The asynq.Inspector which can be used in custom code.
  3. The web interface also using asynq.Inspector.

The problem is that I don't see a way to configure all cases together. And if we don't do that the behavior will be totally confusing.

Any ideas?

ghstahl commented 2 years ago

Hi, I needed these 2 specifically to get me unstuck https://github.com/fluffy-bunny/asynq https://github.com/fluffy-bunny/asynq/blob/a5681d48b59b66d2e7043d9d6a60f8c92fd03dbc/internal/rdb/rdb.go#L33

The technique I used was to target these 2 configs only.

type RDBConfig struct {
    MaxArchiveSize           *int
    ArchivedExpirationInDays *int
}

type InspectorConfig struct {
    MaxArchiveSize           *int
    ArchivedExpirationInDays *int
}

If the value is nil use the defaults as before.

Don't archive anything based upon the configs.

func (r *RDB) Archive(ctx context.Context, msg *base.TaskMessage, errMsg string) error {
    if *(r.config.MaxArchiveSize) <= 0 {
        return nil
    }
...
}

I hastily did this, so I didn't think it would be accepted as main as it is. I would appreciate your feedback on the technique.

bojanz commented 1 year ago

Big +1 to this feature request.

I do not want tasks to be archived at all, but right now there is no way to do that.

ghstahl commented 9 months ago

Tangentially related https://github.com/hibiken/asynq/issues/756