dgreif / ring

Unofficial packages for Ring Doorbells, Cameras, Alarm System, and Smart Lighting
MIT License

HomeBridge Ring failing multiple times a day #1502

Closed markcarroll closed 1 week ago

markcarroll commented 2 months ago

Is there an existing issue for this?

Describe The Bug

Getting "Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds..." very frequently. It makes HomeBridge unusable (maxes CPU and memory) until I reboot the container. The HomeBridge docker container is running Node 20.

Note that rebooting doesn't actually fix the Ring plugin for long. I can change alarm state etc briefly after a reboot, but a couple of hours later it is broken again. Turning the alarm on or off will work immediately after a reboot, but given a couple of hours, any interaction with Ring will cause the error. It seems to put Homebridge into 9x% CPU and fills up memory so a full container reboot is needed.

To Reproduce

Just leave HB running. Container logs will show a list of errors.

Expected behavior

It works like it used to.

Relevant log output

[10/1/2024, 10:24:04 AM] [homebridge-ring] Reconnecting location socket.io connection
[10/1/2024, 10:24:05 AM] [homebridge-ring] Creating location socket.io connection - Home
[10/1/2024, 10:24:06 AM] [homebridge-ring] Ring connected to socket.io server
[10/1/2024, 10:30:05 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 10:36:47 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:07:45 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:09:01 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:09:22 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:16:22 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:18:30 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:18:37 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:20:35 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:22:55 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 11:31:58 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:32:32 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:36:12 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 11:39:32 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:44:57 AM] [homebridge-ring] Front Door Detected Motion. Loading snapshot before sending event to HomeKit
[10/1/2024, 11:47:35 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:48:45 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 11:50:07 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 11:56:03 AM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:04:28 PM] [homebridge-ring] Garden Detected Motion. Loading snapshot before sending event to HomeKit
[10/1/2024, 12:04:58 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:05:20 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:06:43 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:23:07 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 12:26:08 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:28:12 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 12:29:20 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  fetch failed.  Trying again in 5 seconds...
[10/1/2024, 12:32:38 PM] [homebridge-ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices.  The operation was aborted due to timeout.  Trying again in 5 seconds...
[10/1/2024, 12:34:02 PM] [homebridge-ring] Front Door Detected Motion. Loading snapshot before sending event to HomeKit

Screenshots

No response

Homebridge Ring Config

{
    "refreshToken": "<...snip...>",
    "unbridgeCameras": true,
    "hideDoorbellSwitch": false,
    "hideCameraSirenSwitch": true,
    "hideInHomeDoorbellSwitch": false,
    "nightModeBypassFor": "some",
    "beamDurationSeconds": 60,
    "_bridge": {
        "username": "0E:13:5E:16:DB:00",
        "port": 57749
    },
    "platform": "Ring"
}

Additional context

homebridge-ring.log.txt

OS

Ubuntu Jammy (22.04.4 LTS)

Node.js Version

v20.17.0

NPM Version

?

Homebridge/HOOBs Version

Homebridge v1.8.4 · UI v4.57.1

Homebridge Ring Plugin Version

v13.1

Operating System

Docker

tsightler commented 2 months ago

Hi, thanks for opening an issue and providing logs. The logs clearly show errors with network connectivity. For the most part, having some errors polling the Ring API is normal and not really indicative of a problem per se. Ring API servers are sometimes slow to respond, so requests time out and are simply retried later. For example, I see anywhere from a dozen to 40-50 such messages a day in my own installation. You certainly seem to have a higher than normal failure rate, but these errors would not cause the plugin to stop functioning unless they were failing every single time and, even then, that would not likely impact alarm functions, which use a websocket for communications.

It's important to note that the plugin is not somehow doing something super special here, it is calling the fetch API built in to NodeJS and asking it to retrieve the URL. The fetch API either returns the data, or an error if it can't retrieve the data. If the fetch API fails to retrieve the data, there's nothing for the code to do except report the error and try again.
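A minimal sketch of the "report the error and try again" behavior described above, not the actual ring-client-api source. The names `fetchWithRetry`, `FetchLike`, `RetryOptions`, and the `maxAttempts` limit are assumptions made for illustration; only the log wording and the 5 second delay come from the logs in this thread. The fetch implementation is injectable so the loop can be exercised without network access.

```typescript
// Hypothetical sketch: names and defaults are assumptions, not ring-client-api code.
type MinimalResponse = { ok: boolean; status: number };
type FetchLike = (url: string, init?: { signal?: AbortSignal }) => Promise<MinimalResponse>;

interface RetryOptions {
  timeoutMs: number; // abort a single attempt after this long
  retryDelayMs: number; // wait between attempts (the logs above say 5 seconds)
  maxAttempts: number; // assumed upper bound so the sketch always terminates
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetry(
  url: string,
  options: RetryOptions,
  fetchImpl: FetchLike = fetch, // native fetch, available as a global from Node.js 18
  log: (message: string) => void = console.error,
): Promise<MinimalResponse> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      // AbortSignal.timeout aborts the request once the deadline passes.
      return await fetchImpl(url, { signal: AbortSignal.timeout(options.timeoutMs) });
    } catch (error) {
      lastError = error;
      if (attempt === options.maxAttempts) break;
      const reason = error instanceof Error ? error.message : String(error);
      log(
        `Failed to reach Ring server at ${url}. ${reason}. ` +
          `Trying again in ${options.retryDelayMs / 1000} seconds...`,
      );
      await sleep(options.retryDelayMs);
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

The caller only ever sees a resolved response or the final error, which is why the plugin side has little to act on beyond logging and retrying.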

However, there is another interesting tidbit in the log, specifically this part:

[10/1/2024, 10:24:04 AM] [homebridge-ring] Reconnecting location socket.io connection
[10/1/2024, 10:24:05 AM] [homebridge-ring] Creating location socket.io connection - Home
[10/1/2024, 10:24:06 AM] [homebridge-ring] Ring connected to socket.io server

So this is the websocket connection I mentioned above. All communications with Ring Alarm use this websocket, not the HTTP requests. However, this log entry indicates that the websocket lost connection and had to be re-established. Again, this happening every now and then is no big deal, but if you are seeing this often, it's just another indicator that something is happening at the network layer. Note that it usually takes a loss of connectivity of longer than 60 seconds before the websocket detects it and restarts, so this indicates significant network issues.

In summary, you have the fetch API failing often, which indicates network issues, and the websocket disconnecting, which also indicates network issues. You will need to troubleshoot what network connectivity issue you are having. I would check the system logs on the host running Docker as a place to start, especially if this host is running in a VM on a KVM-based hypervisor (something like Proxmox, etc.), as that recently had a known issue that caused lots of dropped packets and created a lot of similar networking issues for users of ring-mqtt, which uses the same API. Alternatively, you can capture a packet dump with a tool like tcpdump or Wireshark.

tsightler commented 2 months ago

Another suggestion: make sure you have a local, caching DNS server. I've been surprised to find that some users don't configure a local DNS server but just point all their clients to some upstream DNS server. However, ring-client-api creates a LOT of DNS queries, and I've seen cases where this can flood the uplink with too much bursty traffic so that some requests fail. Usually this happens with DSL or some other asymmetric network where the upstream bandwidth is significantly lower than downstream.

markcarroll commented 2 months ago

Thanks for the quick response. I have a local DNS and all good there. I get that there may be occasional fetch misses, but this was not a problem on previous versions of the plug-in (pre 13, although I think 11.x was the best). Secondly the biggest concern I have here is that there appears to be a leak of some kind as the CPU and memory climbs very rapidly to the point the server is unresponsive. Restarting the Ring child bridge fixes it so it is clearly the culprit.

I will look into the network issue, but I do not believe that is the problem here, recovering from it seems to be.

tsightler commented 2 months ago

Previous versions of the plugin didn't use fetch, because it didn't exist in NodeJS until v18, rather there were dependencies on other HTTP request libraries, such as got, of which there have been dozens, if not hundreds of similar issues reported over the years.

However, comments such as "I think 11.x was the best" kind of show that it's not related to this because, guess what, the code for the alarm portion is nearly 100% unchanged in the 13.x version. While we have bumped major versions due to required breaking changes in ring-client-api, on which many other projects depend, the actual changes to anything alarm related have been very, very small. Even for cameras, the only real changes are with the streaming API being used (Ring deprecated the old one) and the switch to the new style FCM API for push notifications (cameras and intercoms only).

In other words, while you perceive that somehow 11.x was "the best", the actual core that does things like request API or maintain the websocket connection is about 99% identical between 11.x and 13.x.

Yes, there was a fundamental change to use native fetch instead of got in v13, and theoretically this might cause some problems. But again, ring-client-api is used across a wide variety of projects with, collectively, tens of thousands of users, and the number of reports of such issues, with the exception of HOOBS because it has an outdated NodeJS which doesn't support fetch, is near zero.

Of course that is not to say there could not be some problem, but the only way to get to said problem is to collect data. If ring-client-api calls the NodeJS fetch API, and fetch reports that it couldn't get the data, what code can be changed to fix that on our side? If the OS reports that the TCP connection for the websocket is closed, what more can we do?

Also, I'm not just a maintainer of this project, I'm also a user, I have the plugin running on my network with zero issues even after weeks. Without being able to reproduce issues, or users willing to dig in and collect detailed data, there's not much we can do about it, I mean, it's even possible that it could be some issue with the updated push-receiver which we had to use in v13 to work with modern push notifications, but that's an upstream project so we'd need to see evidence of an issue.

owellsjr commented 2 months ago

I have the same issue. I had 12.1.0 installed and upgraded to the latest version 13.1.0. I immediately started having these errors. After much troubleshooting I determined that I get these errors on all non-beta versions higher than 12.1.1. So I've downgraded to that version for now until this issue is resolved.

10/3/2024, 6:55:56 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:01 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:06 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:11 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:16 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:21 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...
10/3/2024, 6:56:26 PM Ring Bridge Ring ERROR Failed to reach Ring server at https://oauth.ring.com/oauth/token. fetch is not defined. Trying again in 5 seconds...

tsightler commented 2 months ago

@owellsjr That is not even slightly the same problem; the error message is not the same. Your issue is covered ad nauseam in probably a dozen duplicate issues here and the solution is noted in the release notes. The 13.x version of the plugin requires NodeJS v18 or later (ideally v20) and yours is older than that, so the plugin will not work. You will have to update your NodeJS to a supported version (note that, technically, NodeJS versions <18 are no longer officially supported by Homebridge and have been out of support for some time).
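For anyone unsure whether their runtime meets this requirement, the check can be sketched roughly as follows. The helper names `nodeMajorVersion` and `hasNativeFetch` are hypothetical and not part of the plugin; the sketch relies only on the fact that `fetch` is a default global from Node.js 18 onward.

```typescript
// Hypothetical helpers for illustration; not part of homebridge-ring or ring-client-api.

// Extracts the major version from a string such as "v20.17.0" or "16.20.2".
function nodeMajorVersion(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]);
}

// fetch is available as a global without extra flags starting with Node.js 18.
function hasNativeFetch(version: string): boolean {
  return nodeMajorVersion(version) >= 18;
}
```

Calling `hasNativeFetch(process.version)` inside the environment that runs Homebridge (for example the HOOBS runtime or the Docker container) should return `true` before upgrading to 13.x. A "fetch is not defined" error like the one above is what an older runtime produces.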

In 24 hours I will mark your reply as offtopic to keep from confusing it with the issue here. Thanks.

benjorito commented 1 month ago

I have the same issue after updating the plugin to 13.1.0 and switching to unbridged cameras, so unfortunately I can't say which of these two changes caused the errors.

Would it help if I turn on the debug logging and provide the logs, let's say tomorrow?

[10/5/2024, 5:42:37 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 5:42:52 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 5:43:07 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 5:43:22 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 5:43:28 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:14 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:29 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:33 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:44 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:48 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:53 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:09:59 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:03 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:14 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:18 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:29 AM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:33 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...
[10/5/2024, 6:10:35 AM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/mode/location/87c291a5-2614-4782-6a1b-2bf0e1990ea3. fetch failed. Trying again in 5 seconds...

tsightler commented 1 month ago

@benjorito What actual functional problem do you have? Occasional failures to reach the Ring API are to be expected. As long as they are not continuously reported every minute, there's really nothing unexpected here. Nothing was logged from 5:43AM until 6:09AM, which implies that these are just normal random failures, as there would be plenty of successful requests in the interim.

tsightler commented 1 month ago

@markcarroll I spent a bit of time going through your logs as well as looking at my own setup. I'm pretty convinced that, whatever issue you are having, it is not related to the log messages, I believe the messages in the logs, at least for your case, are red herrings.

First, it's important to note why there are regular calls to https://api.ring.com/clients_api/ring_devices at all. The reason for this is that devices such as cameras, chimes and intercoms are what I call "polled" devices, the only way to get their current state of things like whether the light is on/off, battery status, or siren status, etc, is to call the API.

This API is not really designed for frequent polling. The Ring apps poll it no more often than every 60 seconds and, in most cases, will only poll it on an interactive basis, such as when opening the settings of a camera/chime/intercom or performing a manual refresh of such a page. For use in ring-client-api we need to keep device states updated, so the API is polled every 20 seconds and returns the current state for all camera/chime/intercom devices. Note that this is not used at all for alarm devices/sensors, or for motion/ding notifications; it is only for other states on these devices. Because this API is not intended for such frequent polling, occasional failures are expected, and have no real side effects except that the state is updated later.

I've noticed that some people love to set camera polling to far faster intervals, but this will just lead to more messages as the state is only updated on the backend every 30 seconds or so anyway, it's certainly not immediate.

The point is, these messages are expected to occur sometimes and, unless you are seeing them every 15 seconds (which shows even the retries fail), they are not an issue. Also note that there are many other API calls made as part of ring-client-api, such as when a push notification is received, and those are not failing, based on the logs (if they failed, they would also be logged).
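One way to tell occasional failures from continuous ones is to measure how many polling failures occur back to back in a saved log. The following is a rough sketch under stated assumptions: the helper names are hypothetical, timestamps are assumed to be in the M/D/YYYY, h:mm:ss AM/PM format seen in the logs in this thread, and the 20 second gap threshold is an illustrative choice based on the polling interval mentioned above.

```typescript
// Hypothetical log helpers for illustration; not part of the plugin.

// Matches timestamps such as "[10/1/2024, 10:30:05 AM]" at the start of a log line.
const TIMESTAMP = /^\[(\d{1,2})\/(\d{1,2})\/(\d{4}), (\d{1,2}):(\d{2}):(\d{2}) (AM|PM)\]/;

// Returns milliseconds since the epoch (treating the timestamp as UTC), or null if absent.
function parseTimestamp(line: string): number | null {
  const m = TIMESTAMP.exec(line);
  if (!m) return null;
  const [month, day, year, hour12, minute, second] = m.slice(1, 7).map(Number);
  const hour = (hour12 % 12) + (m[7] === "PM" ? 12 : 0);
  return Date.UTC(year, month - 1, day, hour, minute, second);
}

// Length of the longest run of polling failures where each failure follows
// the previous one within maxGapSeconds.
function longestFailureStreak(
  lines: string[],
  maxGapSeconds = 20, // assumed threshold
  urlFragment = "/clients_api/ring_devices", // the polled endpoint
): number {
  let longest = 0;
  let current = 0;
  let previous: number | null = null;
  for (const line of lines) {
    if (!line.includes("Failed to reach Ring server") || !line.includes(urlFragment)) continue;
    const time = parseTimestamp(line);
    if (time === null) continue;
    const closeToPrevious = previous !== null && (time - previous) / 1000 <= maxGapSeconds;
    current = closeToPrevious ? current + 1 : 1;
    longest = Math.max(longest, current);
    previous = time;
  }
  return longest;
}
```

A streak of 1 or 2 corresponds to the isolated failures described here as normal. A streak that keeps growing over many minutes would suggest that the retries are failing as well.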

You never really stated what issue you are having. You say that arming/disarming works, but then you say "any interaction" will fail and associate that with these failed fetch messages, but these polled APIs are not even used for alarm devices.

So I guess what I'd like to ask is, what actual problem are you having? Can you please provide logs from when you attempt something such as arming/disarming? It's just not possible for me to see how arming/disarming could have anything to do with the API log messages at the moment since alarm devices use the websocket. The logs even show that camera push notifications and snapshot updates are still working. There's really nothing to indicate that the plugin is not working.

I had similar reports with ring-mqtt, where two different users complained of random failures and high CPU or unexpected restarts. The issue turned out to be related to lack of memory (it seems NodeJS 20 uses a bit more RAM than prior versions); simply adding an extra 1GB of RAM to the VM caused the issue to go away. In another case, a user was having random failures communicating with the API, and this turned out to be caused by some Snort rules on his OPNsense firewall.

pcziesch commented 1 month ago

Same here: failed to connect to the server with 13.1.0. Last week it ran without any issues. I don't know what to do.

[10/10/2024, 2:05:15 PM] [Ring] Failed to reach Ring server at https://prd-api-us.prd.rings.solutions/api/v1/clap/tickets?locationID=xxx. fetch failed. Trying again in 5 seconds...
[10/10/2024, 2:05:20 PM] [Ring] Failed to reach Ring server at https://api.ring.com/clients_api/ring_devices. fetch failed. Trying again in 5 seconds...

tsightler commented 1 month ago

@pcziesch Again I will ask, what functional problem are you having? Having some API failure messages is not a problem; they happen sometimes and do not indicate any type of issue unless they are happening every 15 seconds. Are you seeing these every 15 seconds?

pcziesch commented 1 month ago

@tsightler Thanks for the comment. My issue is that none of my Ring devices are seen anymore in the Homebridge plugin, which was never an issue in the past. To be honest, I don't have a clue why; I just update my plugins from time to time but nothing else. The last update was Node from 20.17 to 18. I have been using Homebridge via Docker on a QNAP NAS for some years without any issue until now. I have rebooted my devices (NAS, Homebridge, etc.) several times but the issue still occurs. I have already renewed the login details and link for Ring.

tsightler commented 1 month ago

@pcziesch That would need a new issue with full logs from startup as that is not the same as the issue reported here where the system works for some period of time after restart (maybe only minutes or hours) and then fails.

It sounds like in your case it never works. I'd prefer not to mix such issues here.

pcziesch commented 1 month ago

@tsightler Yes, maybe you're right. I can't get it to work again. So should I post the complete log after startup here?

tsightler commented 1 month ago

@pcziesch I would prefer if you open a new issue and provide full requested details, including full logs during startup. I will mark your posts here as offtopic so as not to confuse them with the original issue.

cknova1979 commented 1 month ago

I'm having exact same problem. Everything works fine in version 12.1.1 but once I upgrade to 13 cameras no longer load, can't change alarm, etc..

tsightler commented 1 month ago

@cknova1979 That does not sound like the same problem as reported here. Again, the issue reported here is that it works but stops working after some minutes/hours. If your setup never works, that is an entirely different problem, most likely already covered in one of the zillions of other threads.

cknova1979 commented 1 month ago

@tsightler Thanks just found it and your posts. Looks like a Hoobs node issue as you stated. If there is a way to pin some of the bigger issues it would probably reduce all the redundancy.

tsightler commented 1 month ago

Can someone who is experiencing this issue please enable debug in the plugin configuration and post logs from that? It should provide a complete stack trace of the failure.

markcarroll commented 4 weeks ago

@tsightler sorry it took so long for me to reply. The issue I am having is that after a few hours my HomeBridge instance cranks up to near 100% CPU and 100% RAM usage. I am relatively sure it is the Ring plugin at this point since resetting the Ring child bridge causes everything to drop back down to normal levels again. During this time when it is maxed out, nothing works.

I only started noticing this around the time of the v12 update although I cannot be sure exactly when.

The effect is that once I restart HomeBridge, or the Docker container (which is often what I have to do since the HomeBridge UI won't respond), everything goes back to normal. I have noticed that it seems to get worse once I interact with the Ring alarm. So, for example, things are working well all day. I turn on my scene at bedtime that switches the alarm to Home mode, which works. However, when I get up in the morning and try to turn off the alarm, it will not respond and I open the HB UI to find CPU and memory have shot up to near max. The same happens if I restart it at night and then use it to successfully turn off the alarm in the morning: it will most likely not be OK by the time I come to use it the next evening.

tsightler commented 3 weeks ago

@markcarroll Thanks for your response. Can you please provide some additional details about your deployment? How is Homebridge deployed? What is the host that it is running on and how many resources does it provide? How many Ring devices/cameras do you have?

The code in question is in use in more than 10,000 installs with only a handful of such reports so I'm trying to understand what might be unique that is triggering the behavior as, overall, there are very few differences between 12 and 13. The major version jump was primarily due to a breaking API change with notifications, not due to major changes in the code.

markcarroll commented 3 weeks ago

It is running on docker on a raspberry Pi. Not much else on there right now as I tried to isolate the instance to debug it a bit. There had also been no changes to the device between it working great and the current situation, so reluctant to change too much there.

I have

LMK what else you need

tsightler commented 3 weeks ago

What model RPi? How much RAM? Which RPi OS version? 32 or 64-bit? Which Docker image are you using?

Your Ring setup is reasonably similar to mine. I have an alarm with slightly fewer sensors, with a Chime and 8 total cameras that are a mix of Floodlight Pro cams, stickup cams, and a wired Doorbell Pro (I have no battery cams).

I'm running a test setup on a RPi3, so only 1GB, and it all seems to work fine for weeks with no issue using latest Raspberry Pi OS 12, 64-bit version, with homebridge standard docker image. This is why it is so difficult to understand what might even be the issue. Without being able to reproduce it, there's very little to go on.

markcarroll commented 3 weeks ago

Model : Raspberry Pi 4 Model B Rev 1.4
RAM   : 4GB
OS    : 64-bit
Image : https://hub.docker.com/r/homebridge/homebridge/

This is why it is so odd that it is eating up all the RAM.

tsightler commented 3 weeks ago

OK, I have that exact model in the closet, so I'll pull it out and set it up. Not sure that it will help, but worth a spin I guess. I'm quite honestly not sure what to even try; there are very few things that it can be, IMO. I wish I could somehow get access to a misbehaving system.

markcarroll commented 3 weeks ago

Appreciate the effort, but before you spend any more time on it, I think this weekend I might tear it down and set up a fresh instance. This build has been running through many updates and plugins, so starting afresh might help clear out any anomalies.

tsightler commented 3 weeks ago

OK. I would be interested in what OS version specifically, and what Docker version. My setup is running latest RPi OS version based on Debian 12, and latest Docker community edition (not stock from repo).

Other interesting questions:

Are you using wifi or ethernet connection? Static or dynamic IP? What is your local DNS? Any kind of DNS proxy, like PiHole? What firewall, anything with app level filtering?

There has been at least one case where a firewall was somehow triggering on the requests, so they would work for a bit and then be blocked. That user had to add an exception to the intrusion prevention rules on their firewall.

tsightler commented 1 week ago

For anyone experiencing this issue, please test 13.2.1-beta.0 and report. I'm going to close this issue for now, but you can follow https://github.com/dgreif/ring/issues/1506 as I believe the root cause is roughly the same.