biow0lf / evedev-kb

Automatically exported from code.google.com/p/evedev-kb

Task Queue #169

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
=What is the enhancement you are adding?=
A task queue to replace cronjobs where possible. The task queue will be triggered 
on each page load, with each task kept small enough to avoid slowing the page load. 
The task queue will also be runnable as a cronjob itself.
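
A rough sketch of the idea (Python used for illustration only; EDK itself is PHP, and every name below is invented for the example, not part of EDK's code):

```python
import time

# Hypothetical sketch: a per-page-load task runner with a hard time budget,
# so running the queue never noticeably slows the page.
BUDGET_SECONDS = 1.0

class Task:
    def __init__(self, name, step_fn):
        self.name = name
        self.step = step_fn   # does a small unit of work; returns True when done
        self.done = False

def run_queue(tasks, budget=BUDGET_SECONDS):
    """Run small task steps until the time budget is spent; return unfinished task names."""
    deadline = time.monotonic() + budget
    for task in tasks:
        if task.done:
            continue
        while time.monotonic() < deadline:
            if task.step():
                task.done = True
                break
        if time.monotonic() >= deadline:
            break             # out of budget; remaining work resumes next page load
    return [t.name for t in tasks if not t.done]
```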

Original issue reported on code.google.com by kovellia on 23 Nov 2011 at 12:58

GoogleCodeExporter commented 9 years ago
Kovell,  I have a nice idea for cron scripts, and it is a bit crazy...

First, put all the cron scripts into one PHP file, with the various sections 
toggled on and off via the admin CP (including what times to run). Make the 
index.php file trigger the cron script.

If it's not time for the cron to run, index.php runs as normal; if it is time, 
or past due, it updates.

If you think about it, this should eliminate the need to rely on the system's 
cron function, as everything will be updated when the time is right and 
index.php is used. In this configuration, a cron system is optional, not required.
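
The due-time check could look something like this (Python for illustration; the section names and intervals are made up, and a real mod would read them from the admin CP settings):

```python
import time

# Illustrative sketch: index.php checks whether any cron section is due and
# runs it inline; otherwise the page renders as normal.
SECTIONS = {
    "api_fetch": {"interval": 3600, "last_run": 0, "enabled": True},
    "idfeed":    {"interval": 1800, "last_run": 0, "enabled": False},
}

def due_sections(sections, now=None):
    """Return the names of enabled sections whose interval has elapsed."""
    now = time.time() if now is None else now
    return [name for name, s in sections.items()
            if s["enabled"] and now - s["last_run"] >= s["interval"]]
```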

I am thinking of making a small mod for this...  

Original comment by wyatt@podkill.net on 5 Dec 2011 at 7:16

GoogleCodeExporter commented 9 years ago
The problem is that the cron jobs that matter are large and will very easily 
time out, let alone let the page load in an acceptable time for the user. AJCron 
does pretty much what you suggest already. It also hides the delays by calling 
the cron job via a hidden JavaScript method. However, it's unreliable: it has 
problems with race conditions and can easily fail due to the extra problems 
introduced by the browser.

The aim here is to do close to what you suggest, but split into smaller jobs. 
Anything that goes into the queue is tightly limited on runtime. So, for 
example, a feed might fetch a file from another server in one job, then process 
the kills in the next job; if it runs out of time, it stores its progress and 
requests to be called again in another job. Once done, it marks its job as 
complete and ready to be run again.

Clearing out old cache files can be too slow for one job so could be broken 
into smaller jobs that clear out part of the cache each run.
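
A resumable, chunked job of that kind might be sketched like this (illustrative Python; the state dict stands in for whatever persistent store EDK would actually use, and the deletion is stubbed out):

```python
# Sketch of a resumable job: each call processes at most `chunk` items,
# checkpoints its progress, and reports whether the whole job is finished.
def clear_cache_chunk(files, state, chunk=100):
    """Delete up to `chunk` cache entries per run; resume from a checkpoint."""
    start = state.get("offset", 0)
    batch = files[start:start + chunk]
    for f in batch:
        pass                     # os.remove(f) in a real implementation
    state["offset"] = start + len(batch)
    done = state["offset"] >= len(files)
    if done:
        state["offset"] = 0      # reset so the job can be scheduled again
    return done
```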

The difficulty is all in the short time between page request and annoying the 
user, or even just before the page load times out and the server kills the 
thread.

Original comment by kovellia on 6 Dec 2011 at 4:01

GoogleCodeExporter commented 9 years ago
I understand the time-out issue, but I do have a small workaround for the 
master cron job idea.

Have index.php run the cron via exec().

For the API, have it process the APIs three at a time (variable via the CP?), 
using a timestamp to differentiate between what it needs to do and what it has done.

For IDFeed, have it process them one at a time, using the same timestamp method.
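
The timestamp idea could be sketched like this (illustrative Python; the list of dicts stands in for EDK's API key table, and all field names are invented):

```python
# Pick the N least-recently-fetched APIs so each pass works through the
# backlog without redoing entries that were just updated.
def next_batch(apis, batch_size=3):
    """Return the batch_size entries with the oldest last_fetched timestamp."""
    return sorted(apis, key=lambda a: a["last_fetched"])[:batch_size]
```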

I'm just throwing my ideas into the hat here, and once I understand the mod 
system more, I will try to make a viable example.

Original comment by wyatt@podkill.net on 6 Dec 2011 at 4:29

GoogleCodeExporter commented 9 years ago
exec will probably not work on any installation that you do not have full 
control over ... and I wouldn't let it run on one that I did!

Otherwise, that's the sort of thing I was considering: the minimum work in each 
task. So one API fetch, preferably split up into parts, so you get the files 
first (which could take a minute, with a few seconds for each 100-kill block 
and several blocks fetched), then process them.

IDFeed is easier as you can specify a smaller limit of kills returned.

Also keep in mind that you have to avoid race conditions without losing track 
of progress. So you flag that a task is being worked on, and when it is 
complete. A timeout limit can restart unfinished jobs.
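
The claim/timeout scheme could be sketched like this (illustrative Python; the field names and the 300-second timeout are invented, and a real store would need to make the claim atomic):

```python
import time

# A worker flags a task as in-progress before running it; a stale flag older
# than TIMEOUT is treated as a crashed worker, so the task can be restarted
# without losing it, and without two workers normally running it at once.
TIMEOUT = 300  # seconds before an unfinished claim is considered dead

def try_claim(task, now=None):
    """Claim a task; return True if this worker may run it."""
    now = time.time() if now is None else now
    claimed_at = task.get("claimed_at")
    if claimed_at is not None and now - claimed_at < TIMEOUT:
        return False             # someone else is (probably) still working
    task["claimed_at"] = now     # a real store would do this atomically
    return True

def mark_complete(task):
    task["claimed_at"] = None
    task["complete"] = True
```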

NB. Your simple approach with one 'cron' job called all the time is probably 
better if your servers are powerful and you have complete control. I'm trying 
for something that works more widely and in more restrictive circumstances (and 
probably in circumstances where EDK won't run anyway ...)

Original comment by kovellia on 6 Dec 2011 at 5:04

GoogleCodeExporter commented 9 years ago
Yeah, I see your point, but wouldn't breaking the API and feeds into a 
step-by-step process (similar to the data import in the install script) negate 
the need for an overpowered server?

Think about it: if we were able to have it all run as a background process (one 
that takes everything in small steps) and just use a trigger to start it, the 
options are limitless.

If you consider using exec, you could simply do something like:
exec('php runtime.php');      // returns only the last line of output
system('php runtime.php');    // echoes the output directly as it runs

By default, exec runs the target as the web server user (often www-data), so if 
the script is set to 755 and owned by www-data, it should run without a hitch. 
Also, about the only time exec does not work is when PHP safe_mode is enabled 
or exec is listed in disable_functions.

Original comment by wyatt@podkill.net on 6 Dec 2011 at 5:30

GoogleCodeExporter commented 9 years ago
Almost forgot...

You could also try:
exec('wget --delete-after http://kb.com/index.php?a=cronmod');

Original comment by wyatt@podkill.net on 6 Dec 2011 at 5:34