Open agittins opened 4 days ago
It would also be great if the diagnostics could include an n-most-recent list of each of these sensors too.
Either by storing them into something in Bermuda or by having the diagnostics do a query against recorder for the data.
The goal here is that someone can work out their own issues between the exposed entities and the docs, or they can share screenshots, or they can upload a diagnostics.
At some point I'd like to write a bot/workflow for issues that will parse out useful information when a diags is included in the ticket, which should make triage quicker. (Raised #365 )
What do you think about a binary sensor to go along with this data? Either one to encompass all of it or one for each entity.
Basically healthy or unhealthy.
That way the user can immediately see - Oh something is wrong with this proxy, let me dive in and figure out what.
Maybe also a FAQ for each entity on troubleshooting steps, e.g. if Proxy Avg Update Interval > 3 then try:
Yeah, great idea! I was thinking about using the "Repairs" feature, which allows using URIs to provide solutions. I don't know how annoying that might get for this sort of thing, but we could have a Button entity to "Check for Repairs" so that it only creates Repairs when the user asks for help, perhaps prompted by the Health indicator going off.
But yes a simple binary that makes an easy automation target might be a great idea.
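As a rough sketch of the healthy/unhealthy idea, the check could be a plain comparison of each reading against a limit. Everything here is an assumption for illustration: the function names, the metric key and the limit of 3 (borrowed from the "> 3" example above, unit unspecified) are hypothetical, and this is not Bermuda's or Home Assistant's API.

```python
def unhealthy_metrics(readings: dict[str, float], limits: dict[str, float]) -> list[str]:
    """Return the names of readings that exceed their (hypothetical) limit."""
    return [
        name
        for name, value in readings.items()
        if name in limits and value > limits[name]
    ]


def is_healthy(readings: dict[str, float], limits: dict[str, float]) -> bool:
    """One overall flag: healthy only if no reading exceeds its limit."""
    return not unhealthy_metrics(readings, limits)


# Hypothetical limit taken from the "> 3" example in the thread.
LIMITS = {"proxy_avg_update_interval": 3.0}

slow_proxy = {"proxy_avg_update_interval": 4.2}
good_proxy = {"proxy_avg_update_interval": 1.0}
```

The list from `unhealthy_metrics` would suit the one-per-entity variant (and could point at the matching FAQ entry), while `is_healthy` gives the single flag that is easy to target from an automation.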
From message from Lash-L — Today at 2:13 PM
I'm dead keen on this, but haven't worked out how/what exactly yet, so let's try now.... DRAFT FOR DISCUSSION
These would all probably be on a 1 minute or longer update cycle.
[ ] Proxy Avg Update Interval (or peak avg update interval?)
[ ] Proxy Reporting Stats
stale_updates count, to instead feed a hist_interval_updates[fresh, stale] list. We increment fresh or stale on each update cycle, depending on whether the proxy has given us any new data. If the update is fresh and the stale count is not zero, we first insert a new tuple in the list. Then our list contains pairs of contiguous fresh/stale update counts.

I think we could do something very similar to the proxy stats for devices. It's a little trickier in that "outages" are legitimate for devices, because sometimes they leave home, while proxies aren't expected to. But by trimming the lists to keep sum(fresh)+sum(stale) below a certain time limit, we get a good "recent stats as of now" measurement, and HA's history of that entity shows its variation over time - so you'd see your phone performing well, but then doing "poorly" for a few hours because you were out at work, etc.

So for devices, we check if any proxy has a fresh update for us and update fresh/stale accordingly.
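To make the fresh/stale bookkeeping concrete, here is a minimal sketch of how such a list might behave. The class name, `max_cycles`, and the use of mutable two-item lists rather than tuples are assumptions for illustration only. The window is counted in update cycles, which only maps to a time limit if the cycle length is fixed.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateHistory:
    """Run-length history of [fresh, stale] update counts (hypothetical helper)."""

    max_cycles: int = 600  # assumed window size, in update cycles
    # Newest run is last; each run is [fresh_count, stale_count].
    hist_interval_updates: list[list[int]] = field(default_factory=lambda: [[0, 0]])

    def record(self, fresh: bool) -> None:
        """Record one update cycle as fresh (new data arrived) or stale."""
        current = self.hist_interval_updates[-1]
        if fresh:
            if current[1] != 0:
                # A fresh update after a stale stretch starts a new pair.
                current = [0, 0]
                self.hist_interval_updates.append(current)
            current[0] += 1
        else:
            current[1] += 1
        self._trim()

    def _trim(self) -> None:
        """Drop the oldest counts so sum(fresh) + sum(stale) <= max_cycles."""
        runs = self.hist_interval_updates
        excess = sum(f + s for f, s in runs) - self.max_cycles
        while excess > 0:
            oldest = runs[0]
            # Within a run the fresh cycles came first, so trim them first.
            for i in (0, 1):
                cut = min(excess, oldest[i])
                oldest[i] -= cut
                excess -= cut
            if oldest == [0, 0] and len(runs) > 1:
                runs.pop(0)

    def fresh_ratio(self) -> float:
        """Fraction of retained cycles that were fresh (0.0 if empty)."""
        runs = self.hist_interval_updates
        total = sum(f + s for f, s in runs)
        return sum(f for f, _ in runs) / total if total else 0.0


history = UpdateHistory(max_cycles=5)
for was_fresh in [True, True, False, True, False, False]:
    history.record(was_fresh)
```

For a device rather than a proxy, `record(True)` would be called whenever any proxy has a fresh update for it. A value such as `fresh_ratio()` is one possible figure to expose as the entity state, so that HA's history shows how it varies over time.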
Something to keep in mind is that proxy entries in the devices{} dict will also be metadevices in future.