catalyst / moodle-tool_cloudmetrics

Realtime metrics for Moodle
https://moodle.org/plugins/tool_cloudmetrics
GNU General Public License v3.0
0 stars 3 forks source link

Issue #73: Use the starting time of the cron process instead of the task #83

Closed jaypha closed 2 years ago

jaypha commented 2 years ago

Resolves #73

Add mtrace temporarily for debugging.

brendanheywood commented 2 years ago

I think each metric should be passed in the window for when the metric should have been for

Ie more or less round down to the interval it should be rather than exactly what it is. I do not think we should reach into the cron managers internals

jaypha commented 2 years ago

Wouldn't that just artificially force it to match? If the current time is 12:31, how do we know if it is the 12:31 cron run or the 12:30 cron run?

brendanheywood commented 2 years ago

From an implementation angle, we have some metrics which can only ever be 'now' and then we have some other that we can retrospectively work out based on things like logs. From the interface's point of view we don't quite know the difference, and I'm on the fence around whether it should know or not. No matter what when cron runs there will always be some delay, it could be seconds or it could be a minute or more. So I think when the manager asks a metric for a value it should always provide the start and end timestamps for the period it should be measuring, and the metric_item it returns should reflect this. If it is a 'now' metric it can just ignore the time period as it can't use it. But if it is something that say reads the logs then it can use it exactly, and it also means it could be used to backfill data. So there is some overlap with the work @marcghaly is doing.

$metric->get_metric_item($start, $finish);

Both timestamps will always be in the past no matter what. If cron is really laggy then we should detect this and skip some as needed. eg the metric could have internal logic like 'I am being asked for N-10 to N and it is currently N+1 and this is ok, but if it was N+5 we are closer to the next one so. That could be shared in the base class.

But lag of 1 minute when we are measuring a metric every 5 minutes (the real example here) or every hour or every 24 hours is totally fine and in these cases we want to store the nice rounded stamp of when it should have ideally run not the exact timestamp. Especially if we are graphing multiple metrics side by side we want them to all align on the same time period boundaries.

If we are measuring a metric every minute and cron is on average lagging more than that then we have a more important problem to fix. This is more or less this issue where I want to document how to set up a highly accurate crond setup for when this is needed:

https://github.com/catalyst/moodle-tool_cloudmetrics/issues/26

ie move the metric task to it's own dedicated crond entry so it isn't contending with anything else. We could also add a check for this (rainy day task, not important now as will only ship with metrics which default to 5 mins)

https://github.com/catalyst/moodle-tool_cloudmetrics/issues/86

jaypha commented 2 years ago

This PR is now a significant refactoring.

The function get_metric_item() for metric classes is now called generate_metric_items($starttime, $finishtime). There is a new config for storing the latest generate. The collect metrics task has been refactored to use this config value. It should no longer be dependent on the current time being exact.

jaypha commented 2 years ago

I've reseparated the generate functions and added a can_generate_past_metric_items() function to the metric base. I don't want to go too deep into supporting backfilling as it is not really in the scope of this PR. I added a second mtrace for printing info about items, so that if a crash occurs during generate_metric_items, we at least have a clue as to what it was trying to do.