Closed robocoder closed 14 years ago
Attachment: piwik-dev1 (#1184).patch
See also #587, which could allow triggering these crontab-like tasks from piwik.php requests in case users don't set up automatic crontabs.
If an automatic crontab is set up (which Piwik can detect automatically), then scheduled tasks are not triggered by piwik.php (see #587).
I believe we should update the documentation and have the crontab fire more regularly, i.e. every 15 minutes, in case some plugins need to run tasks more frequently. The standard archiving task would still only trigger after config.ini.php > time_before_today_archive_considered_outdated seconds.
We need to think about the current archive.sh script and how it would be changed to accommodate this new hook (either call this plugin specifically, or change the way archive.sh works so that it calls this plugin, which would trigger archiving?). Note that it might be better to leave archive.sh with the current "looping over websites and periods" approach and archive them separately, because triggering all archives at once would result in memory issues for Piwik installs with hundreds or thousands of websites.
Also, do we need a system to enforce that such a task cannot be run twice at the same time (a software-level or DB-level lock mechanism)?
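One possible software-level answer to the locking question, as a minimal sketch: an advisory file lock taken with PHP's flock(). The function name runExclusively and the lock file path are hypothetical and not part of Piwik. A file lock only protects a single server, so installs spread over several machines would need a DB-level lock instead.

```php
<?php
/**
 * Runs $task only if no other process currently holds the lock file.
 * Returns true if the task ran, false if another run was in progress
 * or the lock file could not be opened.
 */
function runExclusively(string $lockFile, callable $task): bool
{
    $handle = fopen($lockFile, 'c'); // create if missing, do not truncate
    if ($handle === false) {
        return false;
    }
    // LOCK_NB: fail immediately instead of waiting for the other run to finish.
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false;
    }
    try {
        $task();
        return true;
    } finally {
        // Always release the lock, even if the task throws.
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

$lockFile = sys_get_temp_dir() . '/piwik-scheduled-tasks.lock'; // hypothetical path
$ran = runExclusively($lockFile, function () {
    echo "running scheduled tasks\n";
});
```

A cron run and a piwik.php-triggered run that overlap would then simply skip the second attempt instead of executing the same task twice.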
Sending email reports is also a candidate for this hook; see for example the PDF plugin, #71.
Implementation proposal
// pseudo code of function hooking on runTasks
function runOptimizeTables($notification)
{
    // run every Monday at 2AM
    if( TaskScheduler.shouldRunTask( 'my task ID name', 'weekly' ))
    {
        // execute task
    }
}
Note that we don't have minutes, because the smallest possible granularity is the hour (crontabs are set up to run once per hour and probably should never run more often).
The difference between running scheduled tasks via cron and via piwik.php is that with piwik.php they might be triggered more than once per hour (even though, for obvious optimization reasons, not every request to piwik.php will trigger the scheduled tasks; only one random request out of many will).
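The "one random request out of many" idea can be sketched as follows. This is only an illustration: the function name shouldCheckSchedule and the $oneInN parameter are hypothetical, not existing Piwik settings.

```php
<?php
/**
 * Decides whether this tracker request should also check the schedule.
 * $oneInN is a hypothetical tuning knob: on average one request
 * out of $oneInN returns true.
 */
function shouldCheckSchedule(int $oneInN): bool
{
    return mt_rand(1, $oneInN) === 1;
}

// Simulate 10000 tracker requests with a 1-in-100 sampling rate.
$checks = 0;
for ($i = 0; $i < 10000; $i++) {
    if (shouldCheckSchedule(100)) {
        $checks++;
    }
}
echo "schedule checked on $checks of 10000 simulated requests\n";
```

Because the sampling is random, a busy site can still hit the check several times within one hour, which is why the schedule itself has to decide whether a task is actually due.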
A solution to this issue is to plan schedules ahead of time (compute the time at which the task will run next). Then, when the task successfully runs, reschedule it for next time (e.g. next week for a weekly task).
pseudo code
function shouldRunTask( taskID, interval, [ minimumTimestamp ] )
{
    // too early: the requested time of day has not been reached yet
    if(minimumTimestamp > time()) return false;
    schedule = Piwik_GetOption('schedule')
    shouldRunTask = false;
    if(isset(schedule[taskID]))
    {
        // task already scheduled, run only once scheduled_time has been reached
        if(schedule[taskID]['scheduled_time'] <= time())
        {
            shouldRunTask = true;
        }
    }
    else
    {
        // new task, always run once the first time cron is run
        shouldRunTask = true;
    }
    if(shouldRunTask)
    {
        // compute the next time at which the task should run
        nextScheduleTime = time() + (if hourly then 3600 elseif daily then 86400 etc.);
        schedule[taskID]['scheduled_time'] = nextScheduleTime;
        // record updated schedule in DB
        Piwik_SetOption('schedule', schedule);
    }
    return shouldRunTask;
}
minimumTimestamp can be used to define exactly what time of day tasks should run.
For example, if you want to run a daily job at 2AM, you would write in your plugin
if( TaskScheduler.shouldRunTask( 'my task ID name', 'daily', mktime(2,0,0,date('m'),date('d'),date('Y')) ))
What will happen is that the first time the cron triggers after 2AM, this scheduled task will be allowed to run. shouldRunTask will then compute the next time it should run, which is 2AM the next day.
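To make the proposal concrete, here is a minimal runnable PHP sketch of the scheduling check. It is an assumption-laden illustration, not the Piwik implementation: ScheduleStore is a hypothetical in-memory stand-in for Piwik_GetOption / Piwik_SetOption, the current time is passed in as $now so the logic can be exercised without waiting, and the task is rescheduled only when it is actually due.

```php
<?php
// Hypothetical in-memory stand-in for Piwik_GetOption / Piwik_SetOption.
final class ScheduleStore
{
    /** @var array<string, int> task ID => next scheduled Unix timestamp */
    private $schedule = array();

    public function get(string $taskId): ?int
    {
        return $this->schedule[$taskId] ?? null;
    }

    public function set(string $taskId, int $timestamp): void
    {
        $this->schedule[$taskId] = $timestamp;
    }
}

// Interval lengths in seconds for the 'hourly' / 'daily' / 'weekly' strings.
const INTERVALS = array('hourly' => 3600, 'daily' => 86400, 'weekly' => 604800);

/**
 * Returns true when the task is due, and reschedules it only in that case.
 * $now is injected (instead of calling time()) so the function is testable.
 */
function shouldRunTask(ScheduleStore $store, string $taskId, string $interval, int $now, int $minimumTimestamp = 0): bool
{
    if ($minimumTimestamp > $now) {
        return false; // the requested time of day has not been reached yet
    }
    $scheduledTime = $store->get($taskId);
    if ($scheduledTime !== null && $scheduledTime > $now) {
        return false; // already scheduled for a later time
    }
    // New task or due task: allow the run and plan the next one.
    $store->set($taskId, $now + INTERVALS[$interval]);
    return true;
}

$store = new ScheduleStore();
$t0 = 1000000;
var_dump(shouldRunTask($store, 'optimize', 'hourly', $t0));        // bool(true): first call
var_dump(shouldRunTask($store, 'optimize', 'hourly', $t0 + 60));   // bool(false): one minute later
var_dump(shouldRunTask($store, 'optimize', 'hourly', $t0 + 3600)); // bool(true): one hour later
```

With this shape, extra triggers coming from piwik.php within the same hour return false and do not move the scheduled time.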
Edge case: if the cron didn't run before 5AM (for some reason), it will trigger the 2AM task. However, you wouldn't want to schedule tomorrow's task at 5AM but at 2AM. You can use code such as
now = time();
interval = 86400; // for example
nextScheduleTime = now + interval - ((now - minimumTimestamp) % interval);
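As a worked example of this formula, the sketch below plugs in the 2AM / 5AM case. The helper name nextAlignedScheduleTime is hypothetical, and it assumes $now >= $minimumTimestamp (in PHP the % operator returns a negative result for a negative left operand).

```php
<?php
/**
 * Next run time aligned to the intended time of day (hypothetical helper).
 * Same arithmetic as above: subtract the delay accumulated since
 * $minimumTimestamp so that the delay does not carry over to the next run.
 */
function nextAlignedScheduleTime(int $now, int $minimumTimestamp, int $interval): int
{
    return $now + $interval - (($now - $minimumTimestamp) % $interval);
}

// Fixed UTC timestamps so the result does not depend on the server time zone.
$twoAm  = gmmktime(2, 0, 0, 3, 1, 2010); // intended run: 2010-03-01 02:00 UTC
$fiveAm = gmmktime(5, 0, 0, 3, 1, 2010); // cron actually fired at 05:00 UTC

$next = nextAlignedScheduleTime($fiveAm, $twoAm, 86400);
echo gmdate('Y-m-d H:i', $next), "\n"; // 2010-03-02 02:00
```

The three hours of delay (10800 seconds) are subtracted from the naive "now + one day", so the next run lands back on 2AM.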
let me know if this makes sense, cheers
Note: inspired by the WP implementation, see http://phpxref.ftwr.co.uk/wordpress/nav.html?wp-includes/cron.php.html#wp_schedule_event
http://phpxref.ftwr.co.uk/wordpress/nav.html?wp-cron.php.html
While their implementation is overcomplicated, we can do the same thing in a few lines of code :)
I'm ok with the proposal except for one bit.
I would like the implementation to be more object oriented.
There would be a Piwik_ScheduledTask, a Piwik_ScheduledTime.
Instead of having:
function getListHooksRegistered()
{
    return array(
        'TaskScheduler.getScheduledTasks' => 'runOptimizeTables',
    );
}
function runOptimizeTables($notification)
{
    // run every Monday at 2AM
    if( TaskScheduler.shouldRunTask( 'my task ID name', 'weekly' ))
    {
        // execute task
    }
}
it would be
function getListHooksRegistered()
{
    return array(
        'TaskScheduler.getScheduledTasks' => 'getScheduledTasks',
    );
}
function getScheduledTasks($notification)
{
    $scheduledTasks = &$notification->getNotificationObject();
    $tableOptimisationScheduledTime = Piwik_ScheduledTime::factory('weekly');
    $tableOptimisationScheduledTime->setDay('monday');
    $tableOptimisationScheduledTime->setHour(13);
    $tableOptimisationScheduledTime->setMinute(20);
    $scheduledTasks[] = new Piwik_ScheduledTask('runOptimizeTables', $tableOptimisationScheduledTime);
}
function runOptimizeTables()
{
    // execute task
}
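To illustrate how such objects might compute the next run, here is a rough standalone sketch. WeeklyScheduledTime and ScheduledTask are hypothetical, simplified stand-ins for the proposed Piwik_ScheduledTime and Piwik_ScheduledTask; the method getNextRunTime, the use of UTC, and the day-by-day search are assumptions made for this example only.

```php
<?php
// Hypothetical, simplified stand-in for a 'weekly' Piwik_ScheduledTime.
final class WeeklyScheduledTime
{
    private $day = 'monday';
    private $hour = 0;
    private $minute = 0;

    public function setDay(string $day): void { $this->day = strtolower($day); }
    public function setHour(int $hour): void { $this->hour = $hour; }
    public function setMinute(int $minute): void { $this->minute = $minute; }

    /** First occurrence of the configured day and time strictly after $now (UTC). */
    public function getNextRunTime(int $now): int
    {
        $current = (new DateTimeImmutable('@' . $now))->setTimezone(new DateTimeZone('UTC'));
        // Walk forward one day at a time; 0..7 covers "later today" and "same day next week".
        for ($i = 0; $i <= 7; $i++) {
            $candidate = $current->modify("+$i day")->setTime($this->hour, $this->minute, 0);
            if ($candidate > $current && strtolower($candidate->format('l')) === $this->day) {
                return $candidate->getTimestamp();
            }
        }
        throw new LogicException('Unknown day name: ' . $this->day);
    }
}

// Hypothetical stand-in for Piwik_ScheduledTask: a method name plus its schedule.
final class ScheduledTask
{
    public $methodName;
    public $scheduledTime;

    public function __construct(string $methodName, WeeklyScheduledTime $scheduledTime)
    {
        $this->methodName = $methodName;
        $this->scheduledTime = $scheduledTime;
    }
}

$time = new WeeklyScheduledTime();
$time->setDay('monday');
$time->setHour(13);
$time->setMinute(20);
$task = new ScheduledTask('runOptimizeTables', $time);

$now = gmmktime(9, 0, 0, 3, 3, 2010); // Wednesday 2010-03-03 09:00 UTC
echo $task->methodName, ' next runs at ',
    gmdate('D Y-m-d H:i', $task->scheduledTime->getNextRunTime($now)), "\n";
// runOptimizeTables next runs at Mon 2010-03-08 13:20
```

Keeping the "when" in its own object means the scheduler can ask every task for its next run time without knowing whether it is hourly, daily or weekly.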
proposal looks good to me!
I have submitted a patch in which I decided to remove all modulo calculus in favor of easier to read and easier to maintain computations.
(In [2648]) Fixes #1184 Great patch by Julien Moumne to add Scheduled Task API in Piwik
(In [2697]) Refs #5491
(In [2737]) Refs #5491
Is there a possibility to schedule PDF reports without using crontab? For me it would be nice to run reports e.g. when somebody logs in, because I do not have the possibility to create crontabs.
Beatgarantie, scheduled reports should work without crontab in 0.7. Requests to the Tracker will trigger scheduled tasks hourly. See #587 - let me know if it works for you
@matt: OK, I will test.
It would be nice to see the PDF template after switching to another tracked page via the website dropdown.
Use one crontab entry to trigger Piwik archiving, daily report generation, bots, etc.
This plugin:
Updates the UI Settings 'general settings'
This plugin is not #817.