Closed baubie closed 11 months ago
Same here.
As a workaround, I noticed that enabling "Ignore TLS/SSL error for HTTPS websites" does fix this issue and makes the monitors work as before.
Oops, that is strange. I thought I handled it. Need to investigate.
I thought https://github.com/louislam/uptime-kuma/pull/4011 fixed this issue, but clearly not.
Do you guys know how to use SQLite Expert to view data? I want to see what is inside your timeout field.
Same issue here, clean first time install. Screenshot:
I tried reproducing this by creating a new database in branch 1.21.X, adding HTTP monitors with intervals 60 and 360, then upgrading to 1.23.X, but it still works fine as expected.
Maybe you can post the config of the monitor here (masking out anything sensitive), and the node.js version used (or docker).
Same here.
As a workaround, I noticed that enabling "Ignore TLS/SSL error for HTTPS websites" does fix this issue and makes the monitors work as before.
not for me :(
@chakflying noted in #4033 that
This may be a security fix in node.js v17+. Is the server running behind a corporate VPN/firewall?
(Related: https://github.com/nodejs/node/issues/45378, blog post)
Could the people affected please report:
I don't think #4033 is related to this since that has a specific error message.
That's a weird issue! So this only happened after upgrading to 1.23.4? Should I upgrade to 1.23.4, or should I wait?
After downgrading to 1.23.3 the monitors are up and running again.
Is it safe to downgrade while keeping all data? Is it just a matter of pulling 1.23.3 and recreating the container? @JMyrng :)
After downgrading to 1.23.3 the monitors are up and running again.
Is it safe to downgrade while keeping all data? Is it just a matter of pulling 1.23.3 and recreating the container? @JMyrng :)
Yes, I've done exactly that.
Thank you, I'm going to do the same, as I keep receiving Cancelled notifications for no reason and my paused monitors get unpaused magically :/
Monitors unpausing is another issue that should be fixed with #4032.
Again, for those who have experienced this exact issue (HTTP monitor getting "Cancelled" when it's actually up), please post your node.js version and OS (or docker) so we can help troubleshoot.
I saw about #4032, good to know a fix is coming, thank you.
My versions:
Docker 20.10.23 (API: 1.41), running on a Synology NAS
node.js 16.20.2
I'm not sure if I had exactly the same issue regarding the Cancelled message. In my case, I started a new install with 1.23.3, then upgraded to 1.23.4. I then created a new HTTPS monitor to check a website that was currently down, and I kept receiving notifications endlessly until the website was finally up. In the GUI, it was looping between pending (orange) and down (red).
It shows repetitive Down/cancelled events:
Just released a beta version for fixing this issue:
louislam/uptime-kuma:1.23.5-beta.0
Would be appreciated if you guys could help to test it.
You did not use 1.23.5-beta.0
My bad, now I am using it:
That's strange. I just released 1.23.6. Feel free to try it.
If it is still not working, please change the tag to 1.23.3.
Installed 1.23.6 but issue is still there. So reverting to 1.23.3.
Could it be related to IPv6? Pushover fails to send notifications:
Error: Error: Error: connect ETIMEDOUT 2606:4700:10::6814:2bec:443
at Pushover.throwGeneralAxiosError (/app/server/notification-providers/notification-provider.js:38:15)
at Pushover.send (/app/server/notification-providers/pushover.js:41:18)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Monitor.sendNotification (/app/server/model/monitor.js:1433:21)
at async beat (/app/server/model/monitor.js:964:21)
at async safeBeat (/app/server/model/monitor.js:1034:17)
Also there are services that are not affected by this strange "cancelled" bug. Both of them have only A Records.
Hit me up if there is anything else I could help you with!
@JMyrng Thank you so much.
I just created a new image that prints more info and adds +10 seconds to the AbortSignal timeout (#4053):
louislam/uptime-kuma:nightly
Now the log will output something like this:
2023-11-18T23:22:18+08:00 [DEBUG_TIMEOUT] INFO: monitor id: 58
2023-11-18T23:22:18+08:00 [DEBUG_TIMEOUT] INFO: timeout: 16
2023-11-18T23:22:18+08:00 [DEBUG_TIMEOUT] INFO: interval: 20
2023-11-18T23:22:18+08:00 [DEBUG_TIMEOUT] INFO: Using AbortSignal.timeout: 26000
The monitor id could be found in the browser address bar.
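For reference, the relationship between the monitor's timeout and the logged AbortSignal value in that debug build can be sketched as follows. This is a minimal sketch based only on the log lines above; `abortTimeoutMs` is a hypothetical name, not the project's actual code:

```javascript
// Sketch of how the debug build's abort timeout appears to be derived:
// the monitor's configured timeout (in seconds) plus a 10-second buffer,
// converted to milliseconds.
function abortTimeoutMs(timeoutSeconds) {
    const BUFFER_SECONDS = 10; // the "+10 seconds for AbortSignal" from #4053
    return (timeoutSeconds + BUFFER_SECONDS) * 1000;
}

// Matches the log line "Using AbortSignal.timeout: 26000" for timeout: 16.
console.log(abortTimeoutMs(16)); // 26000
```

This also matches the later log in this thread, where timeout 5 produces "Using AbortSignal.timeout: 15000".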
Would be appreciated if you could show the log here for the monitors that are having issues.
2023-11-18T16:34:05+01:00 [DEBUG_TIMEOUT] INFO: monitor id: 11
2023-11-18T16:34:05+01:00 [DEBUG_TIMEOUT] INFO: timeout: 5
2023-11-18T16:34:05+01:00 [DEBUG_TIMEOUT] INFO: interval: 60
2023-11-18T16:34:05+01:00 [DEBUG_TIMEOUT] INFO: Using AbortSignal.timeout: 15000
2023-11-18T16:34:10+01:00 [MONITOR] WARN: Monitor #11 'RAE Tunnel': Pending: timeout of 5000ms exceeded | Max retries: 1 | Retry: 1 | Retry Interval: 20 seconds | Type: json-query
Wait, it seems that it is actually a real timeout that is longer than 5 seconds?
Could you show this graph here?
I'm on mobile right now, but the red/orange is the timespan while on nightly. Green is on old version.
Thanks for your help, which makes me wonder whether we are looking in the wrong direction. It may be a Node.js 18 issue.
I will build two more docker images later.
louislam/uptime-kuma:nightly-node18-no-signal
louislam/uptime-kuma:nightly-node16-with-signal
louislam/uptime-kuma:nightly-node16-no-signal
Looking forward to the results.
This is another layered cake situation, as I managed to reproduce the "Retries getting reset" part of the problem.
- In 1.23.X, we do not have the accurate interval fix (#3072). This means that any blocking timeout would count towards the beat interval.
- When the interval is set to 60 and the timeout is set to 48, part of that 48 seconds gets counted towards the beat interval.
- Eventually the drift grows large enough that checkMonitors would consider the monitor as stuck:
[MONITOR_CHECKER] ERROR: Monitor Interval: 60 Monitor 1 lastStartBeatTime diff: 95
[MONITOR_CHECKER] ERROR: Unexpected error: Monitor 1 is struck for unknown reason
- As a result, retries gets reset, and notifications get sent again.
I think we should either backport #3072, or increase the checkMonitors tolerance to prevent this.
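The stuck check described above can be sketched like this. Note this is an illustration only: `isStuck` is a hypothetical function and `TOLERANCE_SECONDS` an assumed value, not the real checkMonitors implementation:

```javascript
// Sketch of the stuck-monitor heuristic: if the time since the last beat
// started exceeds the interval plus some tolerance, the monitor is
// considered stuck (which is what resets retries and re-sends notifications).
const TOLERANCE_SECONDS = 10; // assumed value, not the real one

function isStuck(intervalSeconds, lastStartBeatTime, now = Date.now()) {
    const diffSeconds = (now - lastStartBeatTime) / 1000;
    return diffSeconds > intervalSeconds + TOLERANCE_SECONDS;
}

// Matches the log above: interval 60, lastStartBeatTime diff 95 -> stuck.
console.log(isStuck(60, Date.now() - 95_000)); // true
```

A blocking 48-second timeout on a 60-second interval pushes the observed diff to ~95 seconds, which is exactly what trips a check of this shape.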
You are right, the pattern is similar to @JMyrng's graph, assuming that he did not restart the monitor manually.
So I just have to restart the monitor? When I updated, I completely removed the old image and pulled the new version; I assumed that was enough. I'm using Portainer though.
No, that comment was a reply to @chakflying's comment.
You could try these images I mentioned in https://github.com/louislam/uptime-kuma/issues/4027#issuecomment-1817576022
I'm starting to think it's my container's issue, because I went into the Portainer container, clicked on Console, and pinged google.com; these were the results:
- Node 18 no Signal Patch: louislam/uptime-kuma:nightly-node18-no-signal
- Node 16 with Signal Patch: louislam/uptime-kuma:nightly-node16-with-signal
- Node 16 no Signal Patch (this should be similar to 1.23.3): louislam/uptime-kuma:nightly-node16-no-signal
Looking forward to the results.
Has node 18 changed anything about the use of IPv6? My IPv4 only monitors are still not affected by this bug. I believe that IPv6 is not set up correctly in the container and the preferred variant has changed. I'll try to disable IPv6 later.
Starting from node v17, the DNS lookup of node would preserve the IPv4/IPv6 priority returned by the resolver (nodejs/node#39987), whereas before it would always prefer IPv4. This means that compared to before, some connections will switch to using IPv6 when available.
How can I set dns.lookup verbatim option to false to test this?
I'll try to disable IPv6 later.
Got it disabled in the docker network; everything is working now with the 'latest' tag. So in my case it was an incorrectly configured IPv6 setup.
@JMyrng Thanks for testing and for finally finding the problem. I will mention this in the changelog, as I am unlikely to downgrade back to Node.js 16.
@chakflying For https://github.com/louislam/uptime-kuma/issues/4027#issuecomment-1817579493, feel free to continue in a new issue.
I have the same issue on 1.23.6: I sometimes get a canceled status after updating from 1.23.3 to 1.23.6 (node version 20.9.0). Also, the certificate expiry for some of my websites is not displayed.
Please verify that this bug has NOT been raised before.
Security Policy
Description
HTTP monitors are all showing "canceled" after the update to 1.23.4, and it must be related to https://github.com/louislam/uptime-kuma/pull/3961#issuecomment-1804149144
When I checked, the timeout for these monitors was set to 2880 seconds (I don't think I set this). I set them down to something more reasonable (a couple of seconds) and all monitors work again.
I didn't see any other posts about this and it may just affect me as I've been running Uptime Kuma for a long time now.
FIX: Update the timeout of your HTTP monitors to a new value.
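To illustrate why 2880 seconds is pathological here, a hedged sketch of the rule of thumb behind the fix: an HTTP timeout should stay well below the check interval. `saneTimeoutSeconds` is a hypothetical helper, and the 0.8 factor is an assumption modelled on newer uptime-kuma defaults, not the project's actual code:

```javascript
// Hypothetical helper: cap an HTTP monitor's timeout below its check
// interval, so a hung request can never outlive the beat that issued it.
// The 0.8 factor is an assumption, not uptime-kuma's real logic.
function saneTimeoutSeconds(intervalSeconds, requestedTimeoutSeconds) {
    const cap = Math.max(1, Math.floor(intervalSeconds * 0.8));
    return Math.min(requestedTimeoutSeconds, cap);
}

console.log(saneTimeoutSeconds(60, 2880)); // 48, well below the interval
```

With a 60-second interval, any timeout near 2880 seconds guarantees the abort fires first, which is exactly the "canceled" status described above.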
Reproduction steps
Have old monitors and upgrade to 1.23.4 (from 1.23.2 in my case).
Expected behavior
Monitors will report being up.
Actual Behavior
Monitors reported down with the status "canceled".
Uptime-Kuma Version
1.23.4
Operating System and Arch
Ubuntu 22.04
Browser
N/A
Docker Version
No response
NodeJS Version
No response
Relevant log output
No response