joomla-projects / soc21_website-cronjob


Resumable / long running tasks #59

Closed: nikosdion closed this issue 2 years ago

nikosdion commented 2 years ago

Some kinds of tasks might take a long time to execute. For example: sending a newsletter to hundreds or thousands of people; batch processing a large number of uploaded images and videos; synchronising local assets with an external server; taking backups; etc.

These tasks cannot be relied upon to run to completion, even if you set an infinite PHP execution time limit, because a time limit external to PHP (Apache connection timeout, ulimit -t, ...) may intervene and kill the process at an inopportune moment.

The way these tasks have traditionally been implemented when CLI CRON jobs are not available is to halt execution at a predefined point in time and mark the task as incomplete. The corollary is that the task would resume the next time the scheduler checks for tasks to execute, NOT the next time the task would normally be scheduled to execute.
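
To make the "halt at a predefined point in time and mark as incomplete" idea concrete, here is a minimal, language-neutral sketch in Python. It is not Joomla code: `ResumableJob`, `run_step`, the `cursor` field and the upper-casing "work" are all hypothetical names and placeholders I made up for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ResumableJob:
    """Hypothetical job that remembers how far it got between runs."""
    items: list
    cursor: int = 0  # index of the next unprocessed item
    processed: list = field(default_factory=list)

    def run_step(self, budget_seconds, clock=time.monotonic):
        """Work until everything is done or the time budget is spent.

        Returns True when the job is complete, False when it stopped
        early and needs to be resumed by a later scheduler check.
        """
        deadline = clock() + budget_seconds
        while self.cursor < len(self.items):
            if clock() >= deadline:
                return False  # incomplete: progress is kept in self.cursor
            # Placeholder for the real work (send an email, resize an image...)
            self.processed.append(self.items[self.cursor].upper())
            self.cursor += 1
        return True
```

In a real implementation the cursor would have to be persisted somewhere (for example in the database) so that a later page load can pick up where the previous one stopped; the in-memory field above only stands in for that.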

This is a point I raised in https://github.com/joomla-projects/soc21_website-cronjob/issues/39, see items 1 and 3.

Seeing the implementation in the current 4.1-dev branch, it would appear that tasks of this kind are outright impossible unless they are marked as CLI only. However, this makes no sense, since people who can set up CLI CRON jobs can already use the CLI commands we 3PDs offer in our extensions anyway.

The problem I see is that Joomla\Component\Scheduler\Administrator\Task\Task::run() treats any non-OK exit code as an error, incrementing times_failed. Moreover, it always sets next_execution according to the schedule, regardless of the exit status.

For resumable tasks to work you need another status, let's call it Status::MORE_WORK_NEEDED. If the task returns this status its next execution time should be set to new \Joomla\CMS\Date\Date() and times_failed should not be increased. No other changes seem to be necessary.
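
Roughly, the bookkeeping I have in mind looks like the sketch below. It is a language-neutral Python model, not the actual PHP in Task::run(): `TaskRecord`, `record_result`, the `FAILED` member, the numeric status values and the fixed `interval` standing in for the real schedule calculation are all assumptions for illustration. Only `MORE_WORK_NEEDED`, `next_execution` and `times_failed` come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    OK = 0
    FAILED = 1            # stands in for any other non-OK exit code
    MORE_WORK_NEEDED = 2  # the proposed new status; the value is arbitrary


@dataclass
class TaskRecord:
    """Hypothetical stand-in for the stored task row."""
    next_execution: datetime
    times_failed: int = 0


def record_result(task, status, now, interval):
    """Update the task record after a run, given the status it returned."""
    if status is Status.MORE_WORK_NEEDED:
        # Resumable task: due again immediately, and not counted as a failure.
        task.next_execution = now
        return
    if status is not Status.OK:
        task.times_failed += 1
    # Simplification: the real scheduler derives this from the task's schedule.
    task.next_execution = now + interval
```

With this, a task that keeps returning `MORE_WORK_NEEDED` is picked up again by the very next scheduler check, while OK and failed runs keep following the normal schedule.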

This would allow lazy scheduling of long running tasks, which would be beneficial to sites which do not have a real CLI CRON job system but either (a) have a pseudo-CRON which allows them to access a URL periodically (which could be used to trigger Joomla's scheduler every minute!) or (b) receive a steady flow of traffic.

The second sub-case (steady flow of traffic) is dead simple to guarantee these days, unlike what happened back in 2010. You could use a free site monitoring service such as HetrixTools to have a specific URL accessed every minute. Even a less-than-tech-savvy end user can plausibly set this up given simple instructions. We've already tested that theory for years with the slightly more complicated (and paid) WebCRON service in our software; even completely non-technical end users manage to follow these instructions.

Would you like me to do a draft PR so you can see what I have in mind and tell me if I am missing something?

ditsuke commented 2 years ago

@nikosdion

Thank you for opening this issue. A draft PR would be very welcome!

nikosdion commented 2 years ago

I made a full PR :) In the process of writing the draft PR I found out all the issues I hadn't thought of and, well, addressed them.

Now that I have had more time to study the code, please let me tell you that I am VERY impressed by the quality and the thought which went into it. It's the first time in years I am ecstatic about new code being added to Joomla, let alone code added in a minor release. Very well done, sir!

I am closing this issue since there's now a PR.

ditsuke commented 2 years ago

I'm pleased and honored to hear that our work stands out! The PR looks perfect and exactly how I'd imagined this feature should work. Thank you for your efforts and contributions! 🥂