ioBroker / ioBroker.javascript

Script engine for JavaScript and Blockly
MIT License
322 stars 119 forks source link

Cron triggered twice within a minute (:00 and :59) #1196

Closed ScumbagSteve closed 1 year ago

ScumbagSteve commented 1 year ago

I set up a Blockly to start a script each day midnight at 00:05 o'clock. The script then calculates how often my heater turned on within a day and saves this value in a object. The "new old" value is saved in another object, as a reference for the next run one day later.

I noticed I often have two values per day: One with the actual difference from the day, and then another value 59 seconds later with 0 difference (which is logic, because in these 59 seconds the heater doesn't turn on). grafik

Setup Popup: grafik

Script: grafik

JavaScript: grafik

I can see the message twice in the log (00:05 and 00:06 o'clock) so I can be sure it's the same script that is triggering this. Any idea why this could be triggered twice?

Current stable JavaScript installed.

lxffm commented 1 year ago

Same for me while calculating my daily power consumption. First I thought it would be related to my cron exactly at 0:00. But the error moved with me when I changed it to 0:03. So I suffer from exactly the same.

I had also a look at the corresponding Javascript and that is also unsuspicious ... look here

// Daily Schedule
schedule('{"time":{"exactTime":true,"start":"00:00"},"period":{"days":1}}', async function () {
  console.log(('Starting calculating last days power consumption!'));
  setState("0_userdata.0.Energy_Data.power_total_yesterday"/*power_total_yesterday*/, getState("0_userdata.0.Energy_Data.power_total_today").val);
  console.log((['Write power_total_today: ',getState("0_userdata.0.Energy_Data.power_total_today").val,' to power_total_yesterday: ',getState("0_userdata.0.Energy_Data.power_total_yesterday").val].join('')));
  setState("0_userdata.0.Energy_Data.power_total_today"/*power_total_today*/, getState("smartmeter.0.1-0:1_8_0__255.value").val);
  console.log(('Write current total power count to power_total_today: ' + String(getState("0_userdata.0.Energy_Data.power_total_today").val)));
  setState("0_userdata.0.Energy_Data.power_consumption_last_day"/*power_consumption_last_day*/, (parseFloat(getState("0_userdata.0.Energy_Data.power_total_today").val) - getState("0_userdata.0.Energy_Data.power_total_yesterday").val));
  console.log((['Calculate and write power_consumption_last_day: ',getState("0_userdata.0.Energy_Data.power_consumption_last_day").val,' by power_total_today ',getState("0_userdata.0.Energy_Data.power_total_today").val,' - ','power_total_yesterday ',getState("0_userdata.0.Energy_Data.power_total_yesterday").val].join('')));
});

What is actually weird, it is not consistently every time. There is no pattern I could scout, but sometimes it happens three times in a row, then it won't for one or two days and that again in another cadence.

ScumbagSteve commented 1 year ago

What is actually weird, it is not consistently every time. There is no pattern I could scout, but sometimes it happens three times in a row, then it won't for one or two days and that again in another cadence.

For me the last errors appeared on:

  1. 24.11.22
  2. 25.11.22
  3. 03.12.22
  4. 07.12.22
  5. 08.12.22

Can you check if you had issues on the same days? Would be interesting to see if the dates correlate, or if it's different for each person.

Do you use influxdb? I don't know if it's possible, but lately I saw influxdb as source in my history, which is weird because I only use it for saving the values. I don't pull anything from there. Maybe it has something to do with the acknowledgement of the value, when it's written in the database. grafik

Apollon77 commented 1 year ago

Influxdb as "source" in the data logging is because you have defined something loke "log the same value after timeX again if unchanged" or such ... then this is to identify such "re-loggings"

lxffm commented 1 year ago

@ScumbagSteve for me it's history adapter who is storing the values, thus I assume it's not influxdb issue.

For matching our dates, I have deleted the wrong values from history.db at your given dates already. But I can show you the last couple of days, where it happened fairly randomly again. Look at the screenshot.

Screenshot 2022-12-21 at 12 05 35
Apollon77 commented 1 year ago

Can you please check if at tht times where that happened there is something in /var/log/syslog or /var/log/messages or /var/log/daemon.log about "ntp syncing the time" ... I just want to make sure that the reason is not the clock of your system which is re-set or such?

lxffm commented 1 year ago

@Apollon77 I had a look into the mentioned logs and couldn't find anything suspicious, at least from my layman point of view. And there was no NTP related message at all.

Can't either imagine that this is due to NTP sync, as I was changing the cron to 0:05 as well and the script was retriggered at 0:06 in this case. So it's always triggered one minute later, no matter what the actual schedule is.

Apollon77 commented 1 year ago

Puuhh ... that will be an interetsing one :-) (and i fact the ntp stuff could have happened then in your case on thw "whole day".

The scheduler in Javascript works basically the following way: When it triggered it calculates the "time till next trigger" and set a timer for that.

Versionvise I assume you have 6.1.4 installed?

ScumbagSteve commented 1 year ago

@lxffm

For matching our dates, I have deleted the wrong values from history.db at your given dates already. But I can show you the last couple of days, where it happened fairly randomly again. Look at the screenshot.

Screenshot 2022-12-21 at 12 05 35

It doesn't match 100% of yours but it's about the same. You had: 14.12., 15.12., 18.12., 21.12. I have 14.12., 15.12., 18.12., 19.12. After that date I disabled the script and replaced it by the statistics adapter.

@Apollon77

Versionvise I assume you have 6.1.4 installed?

Yes correct. 6.1.4.

lxffm commented 1 year ago

@ScumbagSteve That tip about the statistics adapter was awesome. I'm evaluating it now too as it matches my purposes.

Anyways, the adapter issue should be fixed as it can harm other use cases/scripts as well. Thus, I'm letting my scripts run for the time being, in order to verify fixes if needed.

mcm1957 commented 1 year ago

@Apollon77 Don't know if I looked at the corret place. But I think the the following sequence could be trigegred twice if the minute changes between first and second call to Date(): (Lines 93 and 98)

https://github.com/ioBroker/ioBroker.javascript/blob/a83f0a2db4cabf75aa48622cb68815b794e6d8ff/lib/scheduler.js#L93

const d = new this.Date();
d.setMilliseconds(0);
d.setSeconds(0);
d.setMinutes(d.getMinutes() + 1);
this.timer = setTimeout(() =>
       this.checkSchedules(), d.getTime() - this.Date.now());

If I'm writing nonsense, please ignore and excuse. I'm ruther new at js.

Apollon77 commented 1 year ago

Thank you for the check on it ...

In fact I also have the idea that the issue relies there ... but how is interesting ...

I more suspect the following: We know that timers are not 100% accurate in nodejs, but normally the timer are executed a bit "too late" then too early ... was my hope. But in the end it is all precision of the RTC.

I more assume that (why ever) the calculation of the time to wait is "some milliseconds too low" or the timer tiggered " a millisecodn too early" and so the checkschedule is executed a second time "for the same minute" (just "at the veeery end of the second). And the precision is the issue here bacause: the precision depends on your underlying operating system, but is typically between 10 and 50 milliseconds. ;-)

From the screens above we can see that the timer got executed "<60s" later mostly or veeery near after the second :-)

So I suspect that the rtc of the affected systems is "bad" or "unaccurate" and so we have such an issue

My ideas are: 1.) Hacky: just not set 0MS for the target date but 20 ... SO we accept a small unaccuracy but get a better "hope" for coming in the correct next minute 2.) Add a logic to also remember e.g. which "minute" (or "not before timestamp) we expect to end in and check this after the setTimeout and pot. do another setTimeout if too early,so log till we are in the correct minute (or better "no longer in the current one")

Additionally I would move the "now" that is used in checkSchedule out to a parameter so use the same "now" for all checks and not always calculate it new

Apollon77 commented 1 year ago

@ScumbagSteve @lxffm Can anyone of you give the GitHub version a try? I implemented something that should prevent the issue of not that prtcise RTCs

ScumbagSteve commented 1 year ago

@ScumbagSteve @lxffm Can anyone of you give the GitHub version a try? I implemented something that should prevent the issue of not that prtcise RTCs

I created a script with same actions (I can't use the original script right now as the linked device is offline) and wait until I can observe the same wrong behavior again. When I am sure the error happens on the new script as well, I will install git version of JS and check if it's fixed.

Apollon77 commented 1 year ago

thank you ... I hope we do not need to wasit toooo long :-)

ScumbagSteve commented 1 year ago

thank you ... I hope we do not need to wasit toooo long :-)

I was now able to observe the bug on my newly created script and have therefore upgraded to the git version of JS (v6.1.4). If I don't get the error in a whole week, I guess it can be considered fixed. Will send an update, in case I encounter the bug again with latest version.

I have also created a script that triggers on a minute-basis, to accelerate testing. Interestingly on the minute basis the issue didn't happen in 4 days, while on the "daily" trigger it happened after 3 days.

Do you have a logical explanation why this error only seems to happen for daily crons?

Apollon77 commented 1 year ago

Do you have a logical explanation why this error only seems to happen for daily crons?

maybe try a 10 minute or hourly interval because minute is the normal interval the javascript adapter is checking these kind of schedules ... and it is "short timeframes", so "less chance to get "aus dem tritt" maybe?

lxffm commented 1 year ago

of course I can check it too. But can you tell how I can install the mentioned version from GIT?

Apollon77 commented 1 year ago

GitHib Install: Admin - enable expert mode - adapter - click the custom install button (Github icon) - choose "from Github" tab (second one), select Javascript and install

ScumbagSteve commented 1 year ago

I would consider it as fixed. Have not seen another issue so far.

Apollon77 commented 1 year ago

cool, I have soem other stuff todo, so releae need some days ... but cool and thank you for the verification

lxffm commented 1 year ago

I can confirm as well, since about 5 days I don't have any anomalies in the data. Seems like it's gone.