woocommerce / action-scheduler

A scalable, traceable job queue for background processing large queues of tasks in WordPress. Specifically designed for distribution in WordPress plugins (and themes) - no server access required.
https://actionscheduler.org
GNU General Public License v3.0

WP CLI worker #746

Closed ancient-spirit closed 12 months ago

ancient-spirit commented 3 years ago

Instead of only using cron to run WP-CLI, I believe adding a worker/watcher/listener capability would be better for many use cases and background processing.

You can take a look at Laravel's queue worker for reference: https://laravel.com/docs/8.x/queues#running-the-queue-worker

WPprodigy commented 3 years ago

This is closer to the spirit of Laravel's queue worker: https://github.com/Automattic/Cron-Control. Turns WP cron into the worker/watcher model, and runs jobs through CLI.

Not exactly the same once AS is involved, but then it's just a matter of starting up AS queues through cron jobs as often as needed.

barryhughes commented 3 years ago

This sounds like a valuable discussion and I definitely agree we could expand the range of options available for processing queues.

> I believe adding a worker/watcher/listener capability would be better for many use cases and background processing.

How are you currently running the queue, and what specific problems are you facing with the current model? Getting to grips with some specifics could help us as we develop our understanding of current challenges. We already know about some of these through existing feedback, and can imagine others, but we'd love to hear about your own production experience.

likeadeckofcards commented 2 years ago

It would be nice to have a command that waits for jobs to hit the action scheduler. Right now I have it configured to run only on the CLI, using the wp action-scheduler run command. However, this task stops as soon as any backlog has been caught up and processed. It would be nice to use a tool like Supervisord to watch the process and keep it running even when the backlog is empty. My team and I have faced issues where tons of processes were being created when trying to keep the queue running while the backlog was empty. (We are still trying to resolve this issue.)

WPprodigy commented 2 years ago

For some more context on the cron-control model of scaling AS:

1) The cron-control plugin moves WP cron events into their own table, and exposes a CLI & REST interface for getting/running cron events.

2) There is a runner, built in Go, that consistently looks for "due" cron events and runs them - allowing for true parallelism if the container/host has the threads for it: https://github.com/Automattic/Cron-Control-runner

3) By default AS has 1 recurring cron job that processes actions, but this has limitations when dealing with a truly large number of actions, as you need queues processing concurrently to avoid falling behind. So with the runner mentioned above, we just need to schedule some extra cron events that will run Action Scheduler queues concurrently, and that is where this mini-plugin comes in: https://github.com/Automattic/vip-go-mu-plugins/blob/79427058e33551f34009ae89b79c68a82a16aee3/cron/action-scheduler-dynamic-queue.php#L9-L21. It checks if things are falling behind, and schedules cron events that will each kick off a concurrent AS queue.


The implementation could be greatly simplified if you can tailor the scale manually rather than having to dynamically scale. As an example:

1) Create a custom WP CLI command (wp custom-as-queue) that basically just runs do_action( 'action_scheduler_run_queue', 'my custom queue' ). This will process actions in a queue for however long action_scheduler_queue_runner_time_limit is set to (let's say 120 seconds), then will kill itself off. You don't want to go too high on the time limit, or memory can start to fill up and slow down the request.

2) Have a script that runs wp custom-as-queue every ~120 seconds.

3) Make something that can trigger a concurrent amount of those scripts, whatever the container/host in question can reasonably handle.
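Step 3 could be sketched roughly as below. This is only an illustration under stated assumptions: wp custom-as-queue is the hypothetical command from step 1, and QUEUE_CMD, WORKERS and run_concurrent_queues are made-up names rather than anything Action Scheduler provides.

```shell
#!/usr/bin/env bash
# Sketch only. `wp custom-as-queue` is the hypothetical command from step 1;
# QUEUE_CMD and WORKERS are illustrative names, not Action Scheduler settings.

QUEUE_CMD=${QUEUE_CMD:-"wp custom-as-queue"}  # processes one queue run, then exits
WORKERS=${WORKERS:-3}                         # how many runs the host can handle at once

# Start $WORKERS copies of the command in parallel and wait for all of them.
# Returns non-zero if any copy exited with an error.
run_concurrent_queues() {
  local i=0 pid failed=0 pids=""
  while [ "$i" -lt "$WORKERS" ]; do
    $QUEUE_CMD &                # each copy is its own short-lived process
    pids="$pids $!"
    i=$((i + 1))
  done
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  [ "$failed" -eq 0 ]
}
```

Calling run_concurrent_queues at the end of the script, and triggering the script from cron or a similar timer about as often as the time limit (step 2), would give a fixed level of concurrency without any dynamic scaling.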

alan-la-chen-478 commented 2 years ago

I'd also like to see this as an option for advanced developers.

Ideally, it would be nice to have a constant that we can define to disable all the wp_remote_post calls at the end of the shutdown action: define('AS_DISABLE_SHUTDOWN_TRIGGERS', true); or something like that.

Also, a new command that doesn't exit when no tasks are left. Instead, it would just sleep and wait for new tasks to come in: wp action-scheduler listen or something like that.

With that, one can set up supervisor or pm2 to keep that process running on the server. Even if the process times out or exits, it will be re-spawned.

With this, it can also prevent errors on some "managed WordPress hosting" providers (whose names shall not be mentioned) that throttle or limit wp-ajax calls to "improve" performance.
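Until something like a listen command exists, similar behaviour might be approximated with a small wrapper around the existing wp action-scheduler run command. The sketch below is an assumption-laden illustration, not a built-in feature: RUN_CMD, IDLE_SLEEP, MAX_CYCLES and listen_loop are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: a foreground loop that a process supervisor can keep alive.
# RUN_CMD, IDLE_SLEEP and MAX_CYCLES are illustrative names, not Action Scheduler options.

RUN_CMD=${RUN_CMD:-"wp action-scheduler run"}  # exits once the backlog is processed
IDLE_SLEEP=${IDLE_SLEEP:-30}                   # seconds to wait before checking again
MAX_CYCLES=${MAX_CYCLES:-0}                    # 0 = run until stopped; >0 for dry runs

listen_loop() {
  local cycles=0
  while [ "$MAX_CYCLES" -eq 0 ] || [ "$cycles" -lt "$MAX_CYCLES" ]; do
    # Every run is a fresh PHP process, so its memory is released on exit.
    $RUN_CMD || echo "runner exited with status $?" >&2
    cycles=$((cycles + 1))
    sleep "$IDLE_SLEEP"
  done
}
```

Calling listen_loop at the end of the script and pointing supervisor or pm2 at that script gives a single long-lived process to watch, while the PHP side still starts fresh on every cycle.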

barryhughes commented 2 years ago

Thanks for all the notes and ideas.

We did do some light experimentation at the end of last year, and looked at a model involving a single queue server running alongside one or more queue clients (each of those being its own process, stood up by a WP CLI command).

The idea was that the server continually monitors the queue and feeds actions to the next available client. In suitable environments, this increases concurrency greatly and (assuming normal queue runners are disabled) eliminates the potential for database deadlocks during the claim process, because it effectively does away with claiming blocks of actions.

However, as the above comments show, there are lots of other ways to tackle this general problem (many of which are simpler and therefore may be more robust).

> Ideally, it would be nice to have a constant that we can define to disable all the wp_remote_post calls at the end of the shutdown action: define('AS_DISABLE_SHUTDOWN_TRIGGERS', true); or something like that.

We do have an existing hook that can be used to prevent this:

add_filter( 'action_scheduler_allow_async_request_runner', '__return_false' );

Does that meet your needs, or are there cases it doesn't cover?

alan-la-chen-478 commented 2 years ago

Somehow I missed that filter. That's good to know. Thanks.

WPprodigy commented 2 years ago

> Also, a new command that doesn't exit when no tasks are left. Instead, it would just sleep and wait for new tasks to come in.

You have to be really cautious of memory leaks throughout the application if you go this route. Things like the local object cache or DB query history will continually fill up over time, eventually leading to an OOM, usually at a bad time.

Running commands on a schedule that kill themselves off after a reasonable time limit helps avoid having to chase down various plugins that might have memory leaks (and there can be a lot of them, since the normal PHP request cycle rarely punishes memory leaks).

barryhughes commented 2 years ago

The above, plus (depending on how things are configured) different types of problems relating to cached values can surface, and can be especially problematic in very long running processes.

lsinger commented 12 months ago

We looked at this again and didn't see an actionable next step or indeed sustained interest / an urgent need for this enhancement. I'll close this issue for now, but if anyone would like to add more context about why this enhancement is important, how it should work, and what it would enable, then feel free to reopen.