Feature request: cycle offset

onkelandy commented 3 years ago

For some situation it might be useful to offset a cycle - e.g. to avoid triggering loads of telegrams at the same time. Therefore the possibility to offset a cycle of items and logics would be nice.

This could be implemented like this https://github.com/smarthomeNG/smarthome/pull/359/commits/69864328b69291040c319fae0361fc616bfada93 or maybe as an additional parameter of the cycle attribute itself, separated by a comma?

msinn commented 3 years ago

Is this really needed? The scheduler randomizes those cycle tasks. Have a look at the parameter description of scheduler.add:

        :param offset: an optional offset for cycle. If not given, cycle start point will be varied between 10..15 seconds to prevent too many scheduler entries with the same starting times

When an offset is given, the randomization is disabled. That increases the chance of "task storms" when configuration is not done very thoroughly.

onkelandy commented 3 years ago

Not totally sure how that automatic offset works.. in the admin interface I have a "ton" of entries in "schedulers" that trigger at exactly the same time. Looking at the log also shows that those cycle items are triggered at exactly the same time. Which actually is fine I think but then some manual control might also be helpful. Also referring to this post: https://knx-user-forum.de/forum/supportforen/smarthome-py/1303071-stateengine-plugin-support?p=1509498#post1509498

msinn commented 3 years ago

I don't understand completely: DO we have a problem or MIGHT there be a problem?

Morg42 commented 3 years ago

(didn't see msinn's hint towards offset parameter) Isn't this what was asked for?

msinn commented 3 years ago

It‘s in the source code in lib.scheduler. OnkelAndy wants an item attribute to specify that offset for items with a cycle attribute.

Morg42 commented 3 years ago

Check. Item-individual.

So I would expect this to be enough if the scheduler doesn't trigger more than events at the same time, but offsets the initial cycle randomly by itself. Possibly at one second intervals for single items might be enough?

This shouldn't be too hard to implement - I guess a simple conditional delay in lib.schedule inside the while-loop (l. 252) might be enough. Save "trigger time" and make sure the next trigger doesn't fire earlier than 'trigger time + 1s [ + 1s * rnd(1)]'. As I understand the "scheduler loop" to be single threaded in the run-Method of lib.scheduler this wouldn't even need synchronizing with other threads.

The offsetting itself, the base offset time as well as a possible random addition might all be set as parameters, so this could be adjusted per shng-instance, possible even at runtime.

msinn commented 3 years ago

This shouldn't be too hard to implement

There is an implementation for that in the code. It has been there for ages, but it might have an implementation error that needs to be fixed. I am going to look into that.

I am no fan of a human configuration, if it can be done without human intervention. An offset configuration makes things look mote complicated for the user.

And: I am no fan of a separate attribute for something like this (CYCLE_OFFSET), since it is only an additional parameter for an existing attribute (CYCLE), which makes no sense on its own.

msinn commented 3 years ago

As far as my understanding of the scheduler code progressed, the implementation only adds the random offest to the second execution cycle and then the random offset is lost and all following execution cycles happen at the same time...

Morg42 commented 3 years ago

Then we misunderstood.

I understand that Andy wanted to spread item cycle execution. I haven't found the offset code, at least not where I was expecting it, will also look again.

My idea was to implement a "forced" spread where the scheduler forces itself not to trigger two executions immediately one after the other, but waits some time (e.g. 1 s - possibly plus 0-1s random component) before executing the next.

(edit) maybe to stress this further - the existing offset spreads the "planned execution time" of cycle jobs, which is (generally) fine. My idea works on "scheduler loop execution time" and does a further spread, if two schedules jobs happen to be planned to be triggered at the same time, regardless of why this happened - planned without offset, planned at different time and the offset nullified the time difference, whatever. (end edit)

This would be a scheduler attribute, not an item attribute and would apply independent of any item configuration. By this the execution of two items' cycle functions would be time-spread.

The default would be implemented and "just work" without any need to tamper, interfere or even influence. For those who want, there might be the possibility to change the scheduler attribute, maybe by accessing the scheduler object and directly adjusting properties or calling designated methods. This is not "userland experience", but I think this can be discussed here.

Will take another look at the code for the current implementation of "offset", and either try a fix if this promises to solve Andys problem, or implement the mentioned modification, both as basis for further discussion.

msinn commented 3 years ago

My idea was to implement a "forced" spread where the scheduler forces itself not to trigger two executions immediately one after the other, but waits some time (e.g. 1 s - possibly plus 0-1s random component) before executing the next.

That's what is implemented (but maybe with an error) Look at the add methomd.in lib.scheduler starting line 387.

My idea works on "scheduler loop execution time" and does a further spread

But why? When the first execution cycle is spread in the right way, all following cycle executions will be spread exactly in the same way.

The default would be implemented and "just work" without any need to tamper

That was what mknx probably had in mind when he implemented the scheduler and that is why I am no friend of OnkelAndys proposal.

Edit: The randomization is done in line 449/450:

        if cycle is not None and offset is None:  # spread cycle jobs
                offset = random.randint(10, 15)

msinn commented 3 years ago

@onkelandy I think you made this proposal to keep the number of scheduler worker threads low. Am I right? But the number will go only up 1 per minute until a balance is reached and there is not large queue of waiting tasks any more.

Do you still have a problem with the rising number of worker Threads? How many cycle Items do you have that start at the same time and have the same loop time? (Items with different loop times will spread out with the second cycle).

Morg42 commented 3 years ago

My idea was to implement a "forced" spread where the scheduler forces itself not to trigger two executions immediately one after the other, but waits some time (e.g. 1 s - possibly plus 0-1s random component) before executing the next.

That's what is implemented (but maybe with an error) Look at the add methomd.in lib.scheduler starting line 387.

No. The scheduler "spreads" the tasked (first) execution time, so adding two jobs "at the same time" should not end up in both jobs being scheduled for the same time.

But add one job (cycle=30), which is scheduled with an offset of 15. 5 seconds later, add another job (cycle=30), which is scheduled with an offset of 10 (by design or random)

Both jobs end up being scheduled at the same time for ever.

My idea is not based on "schedule time spread", but on "run time spread" or "run time job delay", if you will. No matter why two jobs end up being executed at the same time, they will be delayed by some time.

I'm having some problems with github right now, because my old PR is still open, and the new PR included the old changes.

So, scheduler.py, lines 259 ff. 'new':

                if dt < now:  # run it

# --> new
                    # determine if 'offset' run spread time has elapsed before next run
                    secs_since_last_run = time.time() - self._last_run_time
                    if secs_since_last_run < self._run_offset:
                        time.sleep(self._run_offset - secs_since_last_run)
# <-- end new

                    self._runc.acquire()
                    self._runq.insert(prio, (name, obj, by, source, dest, value))
                    self._runc.notify()
                    self._runc.release()

# --> added line:
                    self._last_run_time = time.time()

But why? When the first execution cycle is spread in the right way, all following cycle executions will be spread exactly in the same way.

No, some other totally unrelated job might end up being scheduled at the same time

msinn commented 3 years ago

Both jobs end up being scheduled at the same time for ever.

That should no be a problem, because is a low number and that is handled by the number of active workers and the queueing. The poetntial problem arises on startup, when all items will start cycling at the same time.

And think: How many cycle Items do you need to get into problems. Hundreds?

msinn commented 3 years ago

First: I want to know if there really is a problem and how it manifests itself. As much as I know, nobody beside OnkelAndy hat a problem.

Second: Modifications of the scheduler have lead to all sorts of (hidden) problems. So it is not happening for the Release 1.8. (or we have to postpone the release for at least a month)

Morg42 commented 3 years ago

I don't think this can be a 1.8 candidate anyway. Tampering with something as central as the scheduler will definitely need intensive testing, maybe in develop for at least one full release cycle.

I don't know about potential problems, maybe my installation is too "small" to exhibit problems. And my cycle jobs are mostly plugin-related, so on one cycle trigger it takes ~10 seconds for everything to happen, so this is not much concurrency.

My idea was based on the discussed 'problem' of not running lots of cycles at the same time. I'm not taking sides as to what is necessary and if the discussed problem manifests itself widely.

I'm not sure what Andys effective problem really is - the original text mentions 'telegrams', so this might be some output of a plugin? The referenced forum post shows an item configuration with "delays". As cycle runs are handled by individual threads, this might even be solved by sleep-ing the delay time inside the cycle thread.

I agree with @msinn that this doesn't seem to be a problem which should be solved on item level.

msinn commented 3 years ago

this might even be solved by sleep-ing the delay time inside the cycle thread.

No, that would extend the time the thread is occupied and not returned to the pool of idle worker threads. That would create a problem.

onkelandy commented 3 years ago

Actually with telegrams I was talking about the knx plugin. Ivan had the problem with a lot of devices being triggered at the same time and increasing the knx bus load significantly: https://knx-user-forum.de/forum/supportforen/smarthome-py/1303071-stateengine-plugin-support?p=1505485#post1505485

After upgrading from a Pi 3 to 4 I can live with the current situation for sure right now as there is no big problem. With slow hardware it's different.

if the previously mentioned randomization works I guess we are fine..

bmxp commented 3 years ago

It can be a solution to use a crontab instead of cycle. Since the whole crontab thing is reworked in develop it is possible to step down into seconds for crontab time scheduling. So if there is a need for finetuning it can be done there. This however implies that plugins provide a setting for a crontab as well as a cycle. It's up to the user what to then select either cycle or crontab. Currently I am working on the DLMS plugin to get a more precise time of query. This works now for a couple of weeks and if I find some time I will push an update of this plugin.

onkelandy commented 3 years ago

Sounds good. I actually just need to trigger an item a specific times/cycles - so I could test the crontab with the latest develop, right? In which file does the magic happen? The crontab should look like this to have an item updated every minute with 1 second delay, right? crontab: 1

bmxp commented 3 years ago

New docu is at https://www.smarthomeng.de/dev/user/referenz/items/standard_attribute/crontab.html If seconds are needed you'll have to give 6 parameters like crontab: <Sekunde> <Minute> <Stunde> <Tag> <Monat> <Wochentag>

onkelandy commented 2 years ago

crontab feature is tested and appropriate for the situation described above. Issue solved. Thanks!

smarthomeNG / smarthome

Feature request: cycle offset #370